blog coding out loud https://ferjm.github.io/ Thu, 18 Nov 2021 12:04:18 +0000 Jekyll v3.9.0 opentok-rs: easy WebRTC with Rust <p><a href="https://www.vonage.com/communications-apis/video/">OpenTok</a> is <a href="https://www.vonage.com">Vonage</a>’s (formerly <a href="https://en.wikipedia.org/wiki/TokBox">TokBox’s</a>) <a href="https://en.wikipedia.org/wiki/Platform_as_a_service">PaaS</a> (Platform as a Service) that enables developers to easily build custom video experiences within any mobile, web, or desktop application, on top of a <a href="https://en.wikipedia.org/wiki/WebRTC">WebRTC</a> stack.</p> <p>One of the customer projects that I am working on at <a href="https://igalia.com">Igalia</a> requires publishing streams to, and subscribing to streams from, OpenTok sessions. The main application of this project needs to run on a Linux box, and Vonage already provides a nice <a href="https://tokbox.com/developer/sdks/linux/">OpenTok C++ SDK for Linux</a>. However, the entire application for this customer project is written in Rust, so together with my colleague <a href="https://base-art.net/">Philippe Normand</a>, we decided to write Rust bindings for the OpenTok C++ SDK.</p> <p><a href="https://github.com/ferjm/opentok-rs">opentok-rs</a> contains the result of this work. There you can find the <a href="https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html#calling-rust-functions-from-other-languages">FFI</a> bindings, mostly generated with <a href="https://rust-lang.github.io/rust-bindgen/">bindgen</a>, and the safe wrapper API.</p> <p>We recently published a first version on <a href="https://crates.io/crates/opentok">crates.io</a>.</p> <p>There is not much documentation yet, apart from the <a href="https://doc.rust-lang.org/rustdoc/what-is-rustdoc.html">rustdoc</a> published <a href="https://ferjm.github.io/opentok-rs/opentok/">here</a>, which is mostly a copy &amp; paste of the C++ documentation. 
But there are a few <a href="https://github.com/ferjm/opentok-rs/tree/main/examples/src/bin">examples</a> that demonstrate how easily and quickly you can write your own custom video experiences.</p> <h1 id="basic-video-chat-application">Basic video chat application</h1> <p>With opentok-rs you can write a very basic video chat application like <a href="https://github.com/ferjm/opentok-rs/blob/09ac4d8f38dcb5aa443308a2f9e82444530e745e/examples/src/bin/basic_video_chat.rs">this one</a> in only a few dozen lines of code.</p> <div style="text-align:center;"> <iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/i48iw0GYgcA" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe> </div> <p>If you are not familiar with the basic concepts of OpenTok, I recommend reading the <a href="https://tokbox.com/developer/guides/basics/">official documentation</a> at Vonage’s developer site.</p> <p>In a nutshell, all OpenTok activity occurs within a session, which is somewhat like a “room” where clients interact with one another in real-time. Each participant in a session can publish streams to the session or subscribe to other participants’ streams.</p> <p>To connect to an OpenTok session you need its identifier and a token. For testing purposes, you can obtain a session ID and a token from the project page in your <a href="https://tokbox.com/developer/">Vonage Video API</a> account. 
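</p> <p>For quick local experiments, a common pattern is to read these three values from the environment. This is just a sketch: the <code class="language-plaintext highlighter-rouge">Credentials</code> struct and the <code class="language-plaintext highlighter-rouge">OPENTOK_*</code> variable names are assumptions for illustration, not part of the opentok crate.</p>

```rust
use std::env;

// Hypothetical helper: the three values needed to join a session.
struct Credentials {
    api_key: String,
    session_id: String,
    token: String,
}

impl Credentials {
    // Builds credentials from any lookup function, so the same code
    // works against real environment variables or a test fixture.
    fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Option<Credentials> {
        Some(Credentials {
            api_key: lookup("OPENTOK_API_KEY")?,
            session_id: lookup("OPENTOK_SESSION_ID")?,
            token: lookup("OPENTOK_TOKEN")?,
        })
    }

    // Reads the assumed OPENTOK_* environment variables.
    fn from_env() -> Option<Credentials> {
        Self::from_lookup(|key| env::var(key).ok())
    }
}
```

<p>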
However, in a production application, you will need to dynamically obtain the session ID and token from a web service that uses one of the <a href="https://tokbox.com/developer/sdks/server/">Vonage Video API server SDKs</a>.</p> <p>For a basic chat application you need to create a <a href="https://ferjm.github.io/opentok-rs/opentok/publisher/struct.Publisher.html">Publisher</a> instance, to publish your video stream, and a <a href="https://ferjm.github.io/opentok-rs/opentok/subscriber/struct.Subscriber.html">Subscriber</a> instance, likely in a different thread, to subscribe to the rest of the streams in the session. Each entity may connect to the session separately.</p> <h3 id="publisher">Publisher</h3> <p>The OpenTok SDK is heavily based on callbacks. Starting with the session, you need to provide a <a href="https://ferjm.github.io/opentok-rs/opentok/session/struct.SessionCallbacks.html">SessionCallbacks</a> instance to the <a href="https://ferjm.github.io/opentok-rs/opentok/session/struct.Session.html">Session</a> constructor. For the sake of simplicity, we only care about the <code class="language-plaintext highlighter-rouge">on_connected</code> and <code class="language-plaintext highlighter-rouge">on_error</code> callbacks in this case.</p> <p>You also need to provide the session credentials. 
This is the Vonage API key, the session ID and its token.</p> <div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>let session_callbacks = SessionCallbacks::builder()
    .on_connected(move |session| {
        // At this point, we can start publishing
        session.publish(&amp;*publisher.lock().unwrap())
    })
    .on_error(|_, error, _| {
        eprintln!("on_error {:?}", error);
    })
    .build();
let session = Session::new(
    &amp;credentials.api_key,
    &amp;credentials.session_id,
    session_callbacks,
)?;
session.connect(&amp;credentials.token)?;
</code></pre></div></div> <p>The Publisher constructor gets a <a href="https://ferjm.github.io/opentok-rs/opentok/publisher/struct.PublisherCallbacks.html">PublisherCallbacks</a> instance and optionally a <a href="https://ferjm.github.io/opentok-rs/opentok/video_capturer/struct.VideoCapturer.html">VideoCapturer</a> instance. If you do not provide a custom video capturer, the default one, capturing audio and video from your local mic and webcam, will be used.</p> <div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>let publisher_callbacks = PublisherCallbacks::builder()
    .on_stream_created(move |_, stream| {
        println!("Publishing stream with ID {}", stream.id());
    })
    .on_error(|_, error, _| {
        eprintln!("on_error {:?}", error);
    })
    .build();
let publisher = Arc::new(Mutex::new(Publisher::new(
    "publisher", /* Publisher name */
    None,        /* Use WebRTC's video capturer */
    publisher_callbacks,
)));
</code></pre></div></div> <p>The <a href="https://github.com/ferjm/opentok-rs/blob/09ac4d8f38dcb5aa443308a2f9e82444530e745e/utils/src/publisher.rs#L122">basic video chat example</a> demonstrates how to add a custom video capturer. <a href="https://github.com/ferjm/opentok-rs/blob/main/utils/src/capturer.rs#L23">In this case</a>, it uses a <a href="https://gstreamer.freedesktop.org/">GStreamer</a> <a href="https://gstreamer.freedesktop.org/documentation/videotestsrc/index.html?gi-language=c">videotestsrc</a> element to produce test video data. You can use whatever mechanism you prefer to produce video, though.</p> <h3 id="subscriber">Subscriber</h3> <p>The subscriber part is somewhat similar. It needs to connect to the session, providing the credentials and the session callbacks. In this case, the callback that we care about the most is the <code class="language-plaintext highlighter-rouge">on_stream_received</code> callback. 
Within this callback, you can set the stream on your Subscriber instance and instruct the session to use it.</p> <div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>let session_callbacks = SessionCallbacks::builder()
    .on_stream_received(move |session, stream| {
        if subscriber.set_stream(stream).is_ok() {
            if let Err(e) = session.subscribe(&amp;subscriber) {
                eprintln!("Could not subscribe to session {:?}", e);
            }
        }
    })
    .on_error(|_, error, _| {
        eprintln!("on_error {:?}", error);
    })
    .build();
</code></pre></div></div> <p>The Subscriber gets the <a href="https://ferjm.github.io/opentok-rs/opentok/video_frame/struct.VideoFrame.html">video frames</a> through repeated calls to the <code class="language-plaintext highlighter-rouge">on_render_frame</code> callback.</p> <div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>let subscriber_callbacks = SubscriberCallbacks::builder()
    .on_render_frame(move |_, frame| {
        let width = frame.get_width().unwrap() as u32;
        let height = frame.get_height().unwrap() as u32;
        let get_plane_size = |format, width: u32, height: u32| match format {
            FramePlane::Y =&gt; width * height,
            FramePlane::U | FramePlane::V =&gt; {
                let pw = (width + 1) &gt;&gt; 1;
                let ph = (height + 1) &gt;&gt; 1;
                pw * ph
            }
            _ =&gt; unimplemented!(),
        };
        let offset = [
            0,
            get_plane_size(FramePlane::Y, width, height) as usize,
            get_plane_size(FramePlane::Y, width, height) as usize
                + get_plane_size(FramePlane::U, width, height) as usize,
        ];
        let stride = [
            frame.get_plane_stride(FramePlane::Y).unwrap(),
            frame.get_plane_stride(FramePlane::U).unwrap(),
            frame.get_plane_stride(FramePlane::V).unwrap(),
        ];
        renderer_
            .lock()
            .unwrap()
            .as_ref()
            .unwrap()
            .push_video_buffer(
                frame.get_buffer().unwrap(),
                frame.get_format().unwrap(),
                width,
                height,
                &amp;offset,
                &amp;stride,
            );
    })
    .on_error(|_, error, _| {
        eprintln!("on_error {:?}", error);
    })
    .build();
</code></pre></div></div> <p>The snippet above uses a <a href="https://github.com/ferjm/opentok-rs/blob/main/utils/src/renderer.rs">video renderer</a> based on the GStreamer <a href="https://gstreamer.freedesktop.org/documentation/autodetect/autovideosink.html?gi-language=c#autovideosink-page">autovideosink</a> element. But just like with the custom video capturer, you can use whatever you like to render your video frames.</p> <h3 id="audio">Audio</h3> <p>The OpenTok SDK handles audio and video in different ways. While video streams are independently tied to each publisher and each subscriber in a session, audio is tied to a global <a href="https://ferjm.github.io/opentok-rs/opentok/audio_device/struct.AudioDevice.html">audio device</a> that is shared by all publishers and subscribers.</p> <p>This design imposes two hard limitations:</p> <ul> <li> <p>There is no way to obtain the independent audio stream from each participant. 
OpenTok provides a single audio stream which is a mix of every participant’s audio, so there is no way to do things like speech-to-text, moderation, or any kind of audio processing per participant, unless you create a somewhat <a href="https://github.com/opentok/opentok-linux-sdk-samples/issues/25#issuecomment-916155032">complex workaround</a> where you run each audio subscriber in its own dedicated process.</p> </li> <li> <p>It is not possible to run two instances of the OpenTok SDK in the same process. A second instance of the OpenTok SDK overwrites the audio callbacks set by the previous instance.</p> </li> </ul> <p>Vonage claims to be working on improving this design.</p> <h1 id="there-is-more">There is more</h1> <p>Everything in opentok-rs is meant to run in client applications, but as mentioned before, Vonage also provides <a href="https://tokbox.com/developer/sdks/server/">server side OpenTok SDKs</a>.</p> <p><a href="https://github.com/ferjm/opentok-server-rs">opentok-server-rs</a> wraps a minimal subset of the OpenTok REST API. 
It lets developers securely create sessions and generate tokens for their OpenTok applications.</p> <p>I started it only to be able to write automated tests for opentok-rs, so the functionality is limited and will hopefully be extended soon.</p> <h1 id="acknowledgements">Acknowledgements</h1> <ul> <li>I would like to thank <a href="https://www.televic-conference.com/en">Televic Conference</a> for sponsoring this work.</li> <li>Huge thanks to <a href="https://github.com/joliveraortega">José Antonio Olivera</a> from Vonage for his continuous guidance and support while writing the bindings.</li> </ul> Wed, 17 Nov 2021 00:00:00 +0000 https://ferjm.github.io/rust/opentok/webrtc/2021/11/17/opentok-rs.html rust opentok webrtc gst-dots: live view of GStreamer pipelines <p>These days I spend a lot of time dealing with large dynamic <a href="https://gstreamer.freedesktop.org/">GStreamer</a> pipelines. More often than not, I find myself stuck on problems that take some careful analysis of the endless stream of debug logs that GStreamer produces. In these situations, taking a look at what the pipelines of the application look like really helps me with the debugging process. To get this information, <a href="https://gstreamer.freedesktop.org/documentation/tutorials/basic/debugging-tools.html?gi-language=c#getting-pipeline-graphs">GStreamer has the capability of outputting graph files</a> that describe the topology of your pipelines. The information that you get is really well presented, but the process of getting it can be a bit cumbersome when you have to do it over and over. The output files are <code class="language-plaintext highlighter-rouge">.dot</code> files that require programs like <a href="http://www.graphviz.org/">GraphViz</a> to get a displayable version of the graph. Many GStreamer developers end up writing scripts or creating their own tools to ease this process. 
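</p> <p>That manual loop looks roughly like this (a sketch; the dump directory is arbitrary, and the <code class="language-plaintext highlighter-rouge">gst-launch-1.0</code> line is only one example of an application that dumps graphs on pipeline state changes):</p>

```shell
# Ask GStreamer applications launched from this shell to dump .dot
# pipeline graphs into this directory.
export GST_DEBUG_DUMP_DOT_DIR=/tmp/gst-dots
mkdir -p "$GST_DEBUG_DUMP_DOT_DIR"

# Any GStreamer application now writes one .dot file per pipeline
# state change, e.g.:
#   gst-launch-1.0 videotestsrc num-buffers=30 ! autovideosink

# Convert every dumped graph into a viewable SVG with GraphViz.
for f in "$GST_DEBUG_DUMP_DOT_DIR"/*.dot; do
    [ -e "$f" ] || continue  # skip when nothing was dumped yet
    dot -Tsvg "$f" -o "${f%.dot}.svg"
done
```

<p>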
My version of this kind of tool is <a href="https://github.com/ferjm/gst-dots">gst-dots</a>, an extremely simple Node.js server that watches for GStreamer <code class="language-plaintext highlighter-rouge">.dot</code> files in the path defined by the <code class="language-plaintext highlighter-rouge">GST_DEBUG_DUMP_DOT_DIR</code> environment variable, converts them into SVG images, and displays them in a browser with live reload.</p> <p>This is how it looks in action.</p> <div style="text-align:center;"> <video controls="" src="/content/videos/2021/11/gst_dots.mp4"></video> </div> Thu, 11 Nov 2021 00:00:00 +0000 https://ferjm.github.io/gstreamer/2021/11/11/gst-dots.html gstreamer 2021 WebKit Contributors Meeting talk - WPE Android <p>A couple of weeks ago I attended my first <a href="https://webkit.org/meeting">WebKit Contributors Meeting</a> and I presented this talk about WPE WebKit for Android.</p> <div style="text-align:center;"> <iframe width="560" height="315" src="https://www.youtube.com/embed/h5V-ZLE97-I" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe> </div> Fri, 15 Oct 2021 00:00:00 +0000 https://ferjm.github.io/wpe/webkit/android/talk/2021/10/15/wpe-webkit-android-talk.html wpe webkit android talk WPE WebKit for Android <p>WPE WebKit is the official WebKit port for embedded and low-consumption computer devices. 
It has been designed from the ground up with performance, small footprint, accelerated content rendering, and simplicity of deployment in mind.</p> <p>It brings the excellence of the WebKit engine to countless platforms and target devices, serving as a base for systems and environments that primarily or completely rely on web platform technologies to build their interfaces.</p> <p>WPE WebKit’s architecture allows for inclusion in a variety of use cases and applications. It can be custom embedded into an existing application, or it can run as a standalone web runtime under a variety of presentation systems, from platform-specific display managers to existing window management protocols like Wayland or X11.</p> <p>Today, we (<a href="https://igalia.com">Igalia</a>) are happy to announce initial support of WPE for Android.</p> <p>This effort was initiated back in 2017 by my colleague <a href="https://www.igalia.com/igalian/zdobersek">Žan Doberšek</a>, who fully implemented a <a href="https://github.com/Igalia/WPEBackend-android">WPE backend</a> for Android along with the required pieces to get rendering and basic input working. The work was paused for quite some time until the beginning of this year, when I joined Igalia and took over his work. Since then, I have been heads down working on it, trying to make it more usable thanks to <a href="https://github.com/Igalia/cerbero/tree/wpe-android">Cerbero</a> and a <a href="https://developer.android.com/reference/android/webkit/WebView">WebView</a>-based Java API.</p> <h1 id="how-it-looks">How it looks</h1> <p>A picture is worth a thousand words. 
This is how it currently looks running on an Android phone:</p> <div style="text-align:center;"> <video controls="" src="/content/videos/2021/05/wpeandroid_may.mp4"></video> </div> <p>As you can see, we have enough basic functionality to implement a simple multi-tab web browser with progress reporting, navigation controls, and IME support.</p> <p>Support is not limited to mobile devices though. Thanks to the wide range of architectures and devices that support Android, we can now run WPE WebKit on an even wider set of devices. Like a pair of XR glasses. This is a video of a port of <a href="https://mixedreality.mozilla.org/firefox-reality/">Firefox Reality</a> using <a href="https://github.com/Igalia/wpe-android#wpeview-api">WPEView</a> instead of <a href="https://mozilla.github.io/geckoview/">GeckoView</a>:</p> <div style="text-align:center;"> <video controls="" src="/content/videos/2021/05/wpeandroid_fxr.mp4"></video> </div> <h2 id="building-blocks">Building blocks</h2> <h1 id="cerbero-build-system">Cerbero build system</h1> <p>WPE WebKit has a very long list of dependencies. Cross compiling all these dependencies manually can be quite cumbersome, so in order to ease the development process I focused my first weeks of work on setting up a more usable build system. We decided to use <a href="https://github.com/Igalia/cerbero/tree/wpe-android">Cerbero</a>, GStreamer’s cross compilation system, which already had <em>recipes</em> - this is how Cerbero names its build scripts - for many of the required dependencies. I wrote all the missing Cerbero recipes and integrated Cerbero into WPE Android’s build system, to the point that building everything requires a single <code class="language-plaintext highlighter-rouge">python3 scripts/bootstrap.py --build</code> command.</p> <p>For now the only supported architecture is arm64. 
There are plans to support other architectures soon.</p> <h1 id="wpeview-api">WPEView API</h1> <p>WPEView wraps the WPE WebKit browser engine in a reusable Android API. WPEView serves a similar purpose to Android’s built-in WebView and tries to mimic its API, aiming to be an easy-to-use drop-in replacement with extended functionality.</p> <p>Setting up WPEView in your Android application is fairly simple.</p> <p>First, add the WPEView widget to your <a href="https://developer.android.com/training/basics/firstapp/building-ui">Activity layout</a>:</p> <div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;com.wpe.wpeview.WPEView
    android:id="@+id/wpe_view"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context=".MainActivity"/&gt;
</code></pre></div></div> <p>And next, wire it up in your Activity implementation to start using the API, for example, to load a URL:</p> <div class="language-kotlin highlighter-rouge"><div class="highlight"><pre class="highlight"><code>override fun onCreate(savedInstanceState: Bundle?) {
    super.onCreate(savedInstanceState)
    setContentView(R.layout.activity_main)
    val browser = findViewById&lt;WPEView&gt;(R.id.wpe_view)
    browser?.loadUrl(INITIAL_URL)
}
</code></pre></div></div> <p>To get a better sense of how to use WPEView, check the code of the <em>MiniBrowser</em> demo in the <a href="https://github.com/Igalia/wpe-android/tree/main/examples/minibrowser">examples</a> folder.</p> <h1 id="process-model">Process model</h1> <p>In order to safeguard the rest of the system and to allow the application to remain responsive even if the user loads a web page that infinite loops or otherwise hangs, the modern incarnation of WebKit uses a multi-process architecture. Each web page is loaded in its own WebProcess. Multiple WebProcesses can share a browsing session, which lives in a shared NetworkProcess. In addition to handling all network accesses, this process is also responsible for managing the disk cache and Web APIs that allow websites to store structured data, such as the Web Storage and IndexedDB APIs.</p> <p>Given that Android forbids the fork syscall on non-rooted devices, we cannot directly spawn child processes. Instead, we use Android Services to host the logic of WebKit’s auxiliary processes. The life cycle of all WebKit’s auxiliary processes is managed by WebKit itself. The Android layer only proxies requests to spawn and terminate these processes/services.</p> <p>In addition to the multi-process architecture, modern WebKit versions introduce the PSON model (Process Swap On Navigation), which aims to improve security by creating an independent WebProcess for each security origin. 
PSON is currently disabled for WPE Android, although partial support is already in place.</p> <h1 id="browser-and-pages">Browser and Pages</h1> <p>The central piece of WPE Android is the top-level <a href="https://github.com/Igalia/wpe-android/blob/main/wpe/src/main/java/com/wpe/wpe/Browser.java">Browser</a> singleton object. It is roughly the equivalent of WebKit’s UIProcess. Among other duties, it:</p> <ul> <li>Manages the creation and destruction of <code class="language-plaintext highlighter-rouge">Page</code> instances.</li> <li>Funnels <code class="language-plaintext highlighter-rouge">WPEView</code> API calls to the appropriate <code class="language-plaintext highlighter-rouge">Page</code> instance.</li> <li>Manages the Android Services equivalent to WebKit’s auxiliary processes (Web and Network processes).</li> <li>Hosts the UIProcess thread where the <a href="https://wpewebkit.org/reference/wpewebkit/2.23.90/WebKitWebContext.html">WebKitWebContext</a> instance lives and where the main loop is run.</li> </ul> <p>A <a href="https://github.com/Igalia/wpe-android/blob/main/wpe/src/main/java/com/wpe/wpe/Page.java">Page</a> roughly corresponds to a tab in a regular browser UI. There is a 1:1 relationship between WPEView and Page. Each Page instance has its own associated <a href="https://github.com/Igalia/wpe-android/blob/main/wpe/src/main/java/com/wpe/wpe/gfx/View.java">gfx.View</a> and <a href="https://wpewebkit.org/reference/wpewebkit/2.23.90/WebKitWebView.html">WebKitWebView</a> instances.</p> <h1 id="wpe-backend">WPE Backend</h1> <p>The common interface between WPEWebKit and its rendering backends is provided by <a href="https://github.com/WebPlatformForEmbedded/libwpe">libwpe</a>.
<a href="https://github.com/Igalia/WPEBackend-android">WPEBackend-android</a> is our Android-oriented implementation of the libwpe API, bridging the gap between the WebKit architecture and the internal composition structure on one side and the Android system on the other.</p> <h1 id="gfxview">gfx.View</h1> <p><a href="https://github.com/Igalia/wpe-android/blob/main/wpe/src/main/java/com/wpe/wpe/gfx/View.java">gfx.View</a> is an extension of <a href="https://developer.android.com/reference/android/opengl/GLSurfaceView?hl=en">android.opengl.GLSurfaceView</a> living in the UI Process. It manages the life cycle of a <a href="https://developer.android.com/reference/android/graphics/SurfaceTexture">Surface Texture</a>, which acts as a buffer consumer and is handed off to the Web Process through Android’s IPC mechanisms, where the actual rendering happens.</p> <p>It is also in charge of relaying input events to the internal WebKit input methods.</p> <p>This part is currently being significantly changed by Žan to use <a href="https://developer.android.com/ndk/reference/group/a-hardware-buffer">Native Hardware Buffers</a>.</p> <h2 id="future-work">Future work</h2> <p>There are still plenty of things to do, and we have a growing <a href="https://github.com/Igalia/wpe-android/issues">list of issues</a> in the main repository. The next steps will be towards extending support to other architectures (so far, only arm64 is supported). Multimedia support is also on the list of immediate plans, along with the big rendering engine refactor that Žan is working on.</p> <h2 id="try-it-yourself">Try it yourself</h2> <p>If you want to try the current prototype, you can follow the instructions in the <a href="https://github.com/Igalia/wpe-android/blob/main/README.md#setting-up-your-environment">README</a> of the main repo.</p> <p>We welcome contributions of all kinds.
Give it a try and <a href="https://github.com/Igalia/wpe-android/issues/new">file issues</a> as you encounter them. And if you feel encouraged enough, send us patches!</p> <h2 id="acknowledgements">Acknowledgements</h2> <ul> <li>I would like to thank <a href="https://igalia.com">Igalia</a> for giving me the time and space to work on this project.</li> <li>Huge thanks to <a href="https://www.igalia.com/igalian/zdobersek">Žan Doberšek</a> for his amazing work and continuous guidance.</li> <li>Kudos to <a href="https://www.igalia.com/igalian/pnormand">Philippe Normand</a> and <a href="https://www.igalia.com/igalian/tsaunier">Thibault Saunier</a> for their recommendations and support around Cerbero.</li> <li>Many thanks to <a href="https://www.igalia.com/igalian/ifernandez">Imanol Fernández</a> for his contributions so far and for the VR demo.</li> </ul> Mon, 10 May 2021 00:00:00 +0000 https://ferjm.github.io/wpe/webkit/android/2021/05/10/wpe-webkit-android.html https://ferjm.github.io/wpe/webkit/android/2021/05/10/wpe-webkit-android.html wpe webkit android Servo Media Mid-Year review <p>We recently closed the first half of 2019 and, with that, it is time to look back and do a quick summary of what the media team has achieved during this six-month period.</p> <p>Looking at some stats, we merged 87 pull requests, opened 56 issues, closed 42 issues, and welcomed 13 amazing new contributors to the media stack.</p> <h2 id="av-playback">A/V playback</h2> <p>These are some of the selected A/V-playback-related H1 accomplishments:</p> <h4 id="media-cache-and-improved-seeking">Media cache and improved seeking</h4> <p>We significantly <a href="https://github.com/servo/servo/pull/22692">improved</a> the seeking experience of audio and video files by implementing preloading and buffering support and a media cache.</p> <div style="text-align:center;"> <iframe title="vimeo-player" src="https://player.vimeo.com/video/311414154" width="640" height="360" frameborder="0"
allowfullscreen=""></iframe> </div> <h4 id="basic-media-controls">Basic media controls</h4> <p>After a few months of work, we got <a href="https://github.com/servo/servo/pull/22743">partial support for the Shadow DOM API</a>, which gave us the opportunity to implement our first basic set of <a href="https://github.com/servo/servo/pull/23208">media controls</a>.</p> <div style="text-align:center;"> <img src="https://s3.amazonaws.com/media-p.slid.es/uploads/105177/images/6275339/Jun-19-2019_17-11-57.gif" alt="media controls" width="640" /> </div> <p>The UI is not perfect, among other reasons because we still have no way to render a progress or volume bar properly, as that depends on the <code class="language-plaintext highlighter-rouge">&lt;input type="range"&gt;</code> layout, which so far is rendered as a simple text box instead of the usual slider with a thumb.</p> <h4 id="gstreamer-backend-for-magicleap">GStreamer backend for MagicLeap</h4> <p>Another great achievement by <a href="https://github.com/xclaesse">Xavier Claessens</a> from <a href="https://www.collabora.com/">Collabora</a> has been the GStreamer backend for <a href="https://www.magicleap.com/">Magic Leap</a>.
The work is not completely done yet, but as you can see in the animation below, he already managed to paint a full-screen video on the Magic Leap device.</p> <div style="text-align:center;"> <img src="https://s3.amazonaws.com/media-p.slid.es/uploads/105177/images/6274304/Jun-19-2019_13-12-31.gif" alt="magic leap video" width="640" /> </div> <h4 id="hardware-accelerated-decoding">Hardware accelerated decoding</h4> <p>One of the most wanted features, which we have been working on for almost a year and which has recently landed, is hardware-accelerated decoding.</p> <p>Thanks to the excellent and constant work from the <a href="https://www.igalia.com/">Igalian</a> <a href="https://github.com/ceyusa">Víctor Jáquez</a>, Servo recently gained <a href="https://github.com/servo/servo/pull/23483">support for hardware-accelerated media playback</a>, which means lower CPU usage, better battery life and better thermal behaviour, among other goodies.</p> <p>We only have support on Linux and Android (EGL and Wayland) so far. Support for other platforms is on the roadmap.</p> <div style="text-align:center;"> <video src="https://s3.amazonaws.com/media-p.slid.es/videos/105177/rzteE40V/hwacceleration.mp4" width="640" controls=""></video> </div> <p>The numbers we are getting are already pretty nice.
You might not be able to see it clearly in the video, but the renderer CPU time for the non-hardware-accelerated playback is ~8ms, compared to the ~1ms of CPU time that we get with the accelerated version.</p> <h4 id="improved-web-compatibility-of-our-media-elements-implementation">Improved web compatibility of our media elements implementation</h4> <p>We also got a bunch of other smaller features that significantly improved the web compatibility of our media elements.</p> <ul> <li><a href="https://github.com/ferjm">ferjm</a> <a href="https://github.com/servo/servo/pull/22399">added</a> support for the HTMLMediaElement <code class="language-plaintext highlighter-rouge">poster</code> frame attribute.</li> <li><a href="https://github.com/swarnimarun">swarnimarun</a> <a href="https://github.com/servo/servo/pull/23236">implemented</a> support for the HTMLMediaElement <code class="language-plaintext highlighter-rouge">loop</code> attribute.</li> <li><a href="https://github.com/jackxbritton">jackxbritton</a> <a href="https://blog.servo.org/2019/07/09/media-update-h1-2019/">implemented</a> the HTMLMediaElement <code class="language-plaintext highlighter-rouge">crossorigin</code> attribute logic.</li> <li>Servo got the ability to <a href="https://github.com/servo/servo/pull/22347">mute and unmute</a> as well as to control the <a href="https://github.com/servo/servo/pull/22324">volume</a> of audio and video playback thanks to <a href="https://github.com/stevesweetney">stevesweetney</a> and <a href="https://github.com/lucasfantacuci">lucasfantacuci</a>.</li> <li><a href="https://github.com/sreeise">sreeise</a> <a href="https://github.com/servo/servo/pull/22622">implemented</a> the AudioTrack, VideoTrack, AudioTrackList and VideoTrackList interfaces.</li> <li><a href="https://github.com/georgeroman">georgeroman</a> <a href="https://github.com/servo/servo/pull/22449">coded</a> the required changes to allow changing the playback rate of audio and video files.</li> <li><a
href="https://github.com/georgeroman">georgeroman</a>, again, <a href="https://github.com/servo/media/pull/232">implemented</a> support for the HTMLMediaElement <code class="language-plaintext highlighter-rouge">canPlayType</code> function.</li> <li><a href="https://github.com/dlrobertson">dlrobertson</a> paved the way for timed text track support by implementing the basics of the <a href="https://github.com/servo/servo/pull/22392">TextTrack API</a> and the <a href="https://github.com/servo/servo/pull/22563">HTMLTrackElement interface</a>.</li> </ul> <h1 id="webaudio">WebAudio</h1> <p>We also got a few additions in WebAudio land.</p> <ul> <li><a href="https://github.com/PurpleHairEngineer">PurpleHairEngineer</a> <a href="https://github.com/servo/media/pull/243">implemented</a> the StereoPannerNode backend.</li> <li><a href="https://github.com/collares">collares</a> <a href="https://github.com/servo/servo/pull/22648">implemented</a> the DOM side of the ChannelSplitterNode.</li> <li><a href="https://github.com/Akhilesh1996">Akhilesh1996</a> <a href="https://github.com/servo/servo/pull/23259">implemented</a> the AudioParam setValueCurveAtTime function.</li> <li><a href="https://github.com/snarasi6">snarasi6</a> <a href="https://github.com/servo/servo/pull/23279">implemented</a> the deprecated setPosition and setOrientation AudioListener methods.</li> </ul> <h1 id="webrtc">WebRTC</h1> <p>Thanks to <a href="https://github.com/jdm">jdm</a>’s and <a href="https://github.com/Manishearth">Manishearth</a>’s work, Servo now has the foundations of a <a href="https://github.com/servo/servo/pull/23377">WebRTC implementation</a> and is able to perform 2-way calls with audio and video playback coming from the <a href="https://github.com/servo/servo/pull/22780">getUserMedia API</a>.</p> <div style="text-align:center;"> <iframe src="https://player.vimeo.com/video/328247783" width="640" height="392" frameborder="0" allow="autoplay; fullscreen" allowfullscreen=""></iframe>
</div> <h2 id="next-steps">Next steps</h2> <p><em>That’s <strong>not</strong> all folks!</em> We have exciting plans for the second half of 2019.</p> <h1 id="av-playback-1">A/V playback</h1> <p>In A/V playback land, we want to:</p> <ul> <li>Focus on adding hardware-accelerated playback on Windows and macOS.</li> <li>Add support for fullscreen playback.</li> <li>Add support for 360 video.</li> <li>Improve the existing media controls by, for instance, implementing a nicer layout for the <code class="language-plaintext highlighter-rouge">&lt;input type="range"&gt;</code> element, with a proper slider and a thumb, so we can have progress and volume bars.</li> </ul> <h1 id="webaudio-1">WebAudio</h1> <p>For WebAudio there are plans to make some architectural improvements related to the timeline and the graph traversals.</p> <p>We would also love to work on the MediaElementAudioSourceNode implementation.</p> <h1 id="webrtc-1">WebRTC</h1> <p>For WebRTC, data channels are on the roadmap for the second half.</p> <p>We currently support the playback of a single stream of audio and video simultaneously, so allowing the playback of multiple simultaneous streams of each type is also something that we would like to get during the following months.</p> <h1 id="others">Others</h1> <p>There were also plans to implement support for a global mute feature, and I am happy to say that <a href="https://github.com/khodzha">khodzha</a> already <a href="https://github.com/servo/media/pull/271">got this done</a> right at the start of the second half.</p> <p>Finally, we have been trying to get YouTube to work on Servo, but it turned out to be a difficult task because of non-media-related issues (i.e.
layout or web compatibility issues), so we decided to adjust the goal and focus on embedded YouTube support instead.</p> <p><small>Originally published at <a href="https://blog.servo.org/">https://blog.servo.org/</a></small></p> Tue, 09 Jul 2019 00:00:00 +0000 https://ferjm.github.io/media/2019/07/09/media-update-h1-2019.html https://ferjm.github.io/media/2019/07/09/media-update-h1-2019.html media TIDx 2018 talk - Rust 101 <p>In February 2018, I gave an introductory talk about Rust at the TIDx conference.</p> <div style="text-align:center;"> <iframe width="560" height="315" src="https://www.youtube.com/embed/eLYfMDApTVA" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe> </div> Wed, 28 Feb 2018 00:00:00 +0000 https://ferjm.github.io/rust/talk/2018/02/28/tidx-talk-rust-101.html https://ferjm.github.io/rust/talk/2018/02/28/tidx-talk-rust-101.html rust talk Project Link Networking <p>For the last few months, I have been involved in <a href="https://wiki.mozilla.org/Project_Link">Project Link</a>, one of the new research projects from Mozilla’s <a href="https://wiki.mozilla.org/Connected_Devices">Connected Devices</a> group, which aims to create a personal user agent for the smart home.</p> <p>We have recently completed our <a href="https://wiki.mozilla.org/Project_Link#Phase_1">first milestone</a>, where we managed to prototype a device that is able to communicate with a small set of other devices through wireless communication protocols like <a href="https://en.wikipedia.org/wiki/ZigBee">Zigbee</a> and <a href="https://en.wikipedia.org/wiki/Z-Wave">Z-Wave</a>, and that exposes an <a href="https://wiki.mozilla.org/Connected_Devices/Projects/Project_Link/Taxonomy#Current_REST_API">HTTP API</a> for clients to get moderated access to these devices through the Link hub.
As of today, we are able to set up a Link device in a network where other devices, like a set of smart light bulbs, a smart door lock and a motion sensor, are connected, and we are able to create rules, from inside and outside of that network, to do things like turning off the lights, locking the door and sending a notification when the motion sensor detects that the user leaves her home.</p> <p>Making Link communicate with the different devices through Zigbee or Z-Wave was certainly not an easy task, and it required a lot of effort from many members of the team. But it was something that we somehow knew we could do. In the end, these are known protocols, and even if we had to write a lot of code from scratch because of the choice of technology (<a href="https://www.rust-lang.org/">Rust</a>), there are already a lot of products in the market based on these technologies and a few examples of code that we could take as a starting point for our work.</p> <p>To me, one of the most interesting challenges that we had to face during this initial stage of the project has been how to discover and securely connect to Link (a.k.a. <em>the box</em>) from the client side while keeping a decent UX.</p> <p>As Mozillians, <a href="https://www.youtube.com/watch?v=Aw4mTrFW9sU">we believe in the power of the web</a>, so one of our self-imposed initial requirements for this project was that we wanted our <a href="https://github.com/fxbox/app">client demo application</a> to be written entirely with web technologies. We wanted to make this client potentially able to run on any platform with a modern web browser.
And there were also other requirements:</p> <ul> <li>This client had to be able to access Link locally, from the same network the Link device was running on, but also remotely, from outside of that network.</li> <li>The connection between Link and the client had to be securely encrypted in both cases (local and remote access).</li> <li>And both things needed to happen seamlessly and transparently for the user.</li> </ul> <p><a href="http://michielbdejong.com/">Michiel B. de Jong</a> did excellent research on the discovery and secure connection area and proposed a few <a href="https://github.com/fxbox/RFC/issues/3">different solutions</a> to these problems, which included different combinations of cloud services, <a href="https://en.wikipedia.org/wiki/QR_code">QR codes</a>, <a href="https://letsencrypt.org/">Let’s Encrypt</a>, <a href="https://en.wikipedia.org/wiki/Multicast_DNS">mDNS</a> and other technologies and protocols.</p> <p>While we have not ruled out implementing other of these proposals in the next phases of the project, for the initial prototype we ended up choosing a solution that most of the team considered to have a good balance between security, privacy and a user-friendly experience, and that could work cross-platform and cross-browser, taking advantage of the full power of the web.</p> <p><strong>Discovering the box</strong></p> <p>For the discovery part, we implemented the same mechanism that Philips uses to discover their <a href="http://www.developers.meethue.com/documentation/getting-started">Hue Lights Bridge</a>. They call this <em>nUPNP</em> (network UPnP), and it is pretty simple. It requires Link to periodically register itself with a server in the cloud that has a known URL for the client. The data that is stored for this registration is a match between Link’s public and local IP addresses.
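Conceptually, the registration server just keeps, per public IP, the list of boxes that last registered from that IP. A rough sketch of that idea, in hypothetical TypeScript (the `RegistrationStore` class and its method names are illustrative, not the actual fxbox registration server code):

```typescript
// Hypothetical sketch of an nUPNP-style registration store.
// A box periodically registers itself; clients are later matched to
// boxes by the public IP their requests come from.

interface Registration {
  publicIp: string;    // public IP the registration request came from
  localOrigin: string; // the box's address inside its local network
  timestamp: number;   // when the box last registered
}

class RegistrationStore {
  private byPublicIp = new Map<string, Registration[]>();

  // Called periodically by each box; replaces any previous entry for
  // the same local origin behind the same public IP.
  register(reg: Registration): void {
    const existing = this.byPublicIp.get(reg.publicIp) ?? [];
    const others = existing.filter((r) => r.localOrigin !== reg.localOrigin);
    this.byPublicIp.set(reg.publicIp, [...others, reg]);
  }

  // The ping lookup: a client calling from `clientPublicIp` gets back
  // the boxes that registered from the same public IP, i.e. the boxes
  // that live in the same local network as the client.
  ping(clientPublicIp: string): Registration[] {
    return this.byPublicIp.get(clientPublicIp) ?? [];
  }
}
```

The real server also associates a client fingerprint and a tunnel origin with each box, as the sample <em>ping</em> response shown later in the post illustrates.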
To get the local address, the client just needs to make an HTTP GET request to the registration server’s <em>ping</em> endpoint, which should return a JSON object containing this information. This request has to be made from the same network Link is connected to.</p> <p><strong>Securely connecting to the box</strong></p> <p>Unfortunately, we cannot securely connect to local IP addresses through HTTPS. At least not with a proper UX that would not require a terrified user to accept warnings about <a href="https://support.cdn.mozilla.net/media/uploads/gallery/images/2011-10-19-09-09-25-5809bb.jpg">insecure connections</a>, and even in that case (with a self-signed certificate), it would be quite a poor security solution. We needed host names and a trusted <a href="https://en.wikipedia.org/wiki/Certificate_authority">CA</a> for this. And this is where Let’s Encrypt and <a href="https://blog.filippo.io/how-plex-is-doing-https-for-all-its-users/">Plex’s solution</a> enter the game.</p> <p>We heard about this company called Plex that has a very similar use case to ours and offers secure TLS connections to all their users. They have media servers that users can self-host on their machines and access securely from other devices. You can read about the details of Plex’s implementation in this <a href="https://blog.filippo.io/how-plex-is-doing-https-for-all-its-users/">blog post</a> and see how it slightly differs from ours.</p> <p><strong>Remotely accessing the box (a.k.a. tunneling)</strong></p> <p>To provide remote access to Link for those users who choose to have this kind of feature, we initially tried to use <a href="https://ngrok.com/">ngrok</a>, but we found out that they do not support <a href="https://es.wikipedia.org/wiki/Server_Name_Indication">SNI</a> in their open source version.
So we ended up moving to <a href="https://pagekite.net/">PageKite</a>, which offers the same core functionality but also provides SNI support.</p> <p><strong>Putting it all together</strong></p> <p>With all the above, we ended up implementing the following bootstrap process for Link:</p> <ol> <li>Link exposes HTTP and WebSockets services.</li> <li>The first thing Link does is generate a self-signed certificate that becomes its identifier.</li> <li>It connects to an <a href="https://github.com/fxbox/dns-server">API</a> on <code class="language-plaintext highlighter-rouge">knilxof.org</code> (our dev server) to create its public DNS zone under <code class="language-plaintext highlighter-rouge">&lt;fingerprint&gt;.knilxof.org</code>, using its self-signed certificate as a client certificate. The API server checks the fingerprint from the DNS zone edit request against the fingerprint of the client certificate presented.</li> <li>Now that the box has a public DNS zone it can control, it can get a Let’s Encrypt certificate, using the <a href="https://letsencrypt.github.io/acme-spec/#rfc.section.7.4">DNS-01 challenge</a>.</li> <li>Link sets its main DNS A record to its current <strong>local</strong> IP address, which it obtained via DHCP earlier. It will update this A record whenever its local IP address changes.</li> <li>It also sets two or more mirror A records to its current local IP address. The idea here is that only one of the records will be cached by caching DNS servers, so switching to the other one at the right time will avoid downtime due to DNS propagation delays.
This is currently not implemented.</li> <li>If Link is setup to allow remote access, it starts up a PageKite client, which connects to a PageKite frontend, and adds the IP address of the public interface to the PageKite frontend into its DNS zone.</li> <li>With the local, mirrors and tunneled URLs, Link sends a registration request to the nUPNP like <a href="https://github.com/fxbox/registration_server">registration server</a>.</li> </ol> <p>After the above process is completed, when the user browses to our <a href="https://github.com/fxbox/app">client demo application</a>, the app makes a cross-origin request to the registration server <em>ping</em> endpoint to obtain the URLs the app can use to securely connect to Link.</p> <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>GET /ping HTTP/1.1 HTTP/1.1 200 OK Access-Control-Allow-Origin: <span class="k">*</span> Access-Control-Allow-Headers: accept, authorization, content-type Content-Type: application/json<span class="p">;</span> <span class="nv">charset</span><span class="o">=</span>utf-8 Access-Control-Allow-Methods: GET, POST, PUT Content-Length: 312 Date: Fri, 22 Apr 2016 14:39:44 GMT <span class="o">[</span> <span class="o">{</span> <span class="s2">"public_ip"</span>:<span class="s2">"88.xxx.xxx.xxx"</span>, <span class="s2">"client"</span>:<span class="s2">"80a3c3ff0ffc7da455214fe7daaed9216bc4a5a6"</span>, <span class="s2">"message"</span>: <span class="o">{</span> <span class="s2">"local_origin"</span>: <span class="s2">"https://local.80a3c3ff0ffc7da455214fe7daaed9216bc4a5a6.box.knilxof.org:3000"</span>, <span class="s2">"tunnel_origin"</span>:<span class="s2">"https://remote.80a3c3ff0ffc7da455214fe7daaed9216bc4a5a6.box.knilxof.org"</span> <span class="o">}</span>, <span class="s2">"timestamp"</span>:1461335726 <span class="o">}</span> <span class="o">]</span> </code></pre></div></div> <p>The connection to the box is completely seamless for the user as she is 
never asked to enter a URL or to add any security exception in her browser.</p> <p><img src="/content/images/2016/04/Screen-Shot-2016-04-22-at-5-09-24-PM.png" alt="" /></p> <p><strong>Credits</strong></p> <p>Most of the design and implementation work was done by <a href="http://michielbdejong.com/">Michiel B. de Jong</a> and <a href="https://twitter.com/samuelgiles_">Sam Giles</a>.</p> Fri, 22 Apr 2016 00:00:00 +0000 https://ferjm.github.io/mozilla/iot/networking/2016/04/22/project-link-networking.html https://ferjm.github.io/mozilla/iot/networking/2016/04/22/project-link-networking.html mozilla iot networking Improving the Firefox OS Contacts application start-up time <p>One of the biggest challenges that we have in Firefox OS is <a href="https://developer.mozilla.org/en-US/Apps/Build/Performance/Performance_fundamentals">performance</a>. We have been fighting it since day one, and by applying some <a href="https://developer.mozilla.org/en-US/Apps/Build/Performance/Optimizing_startup_performance">different techniques</a> we managed to get to a point where we have some very decent <a href="https://datazilla.mozilla.org/b2g">application start-up time numbers</a>.</p> <p>The last application to get a considerable performance boost has been the Contacts application.</p> <p>During the last few weeks, the Contacts team <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1112551">has been working</a> on a <a href="https://github.com/mozilla-b2g/gaia/commit/f1d0684817e5802961c02a04dcf667cfaf09d6ee">patch</a> that finally landed on master yesterday.
The result is an improvement of around 720 milliseconds of <a href="https://developer.mozilla.org/en-US/Apps/Build/Performance/Firefox_OS_app_responsiveness_guidelines#Stages">perceived start-up time</a>, which means that we saved almost 50% of the previous start-up time.</p> <p><a href="https://datazilla.mozilla.org/b2g/?branch=master&amp;device=flame-319MB&amp;range=7&amp;test=startup_%3E_moz-app-visually-complete&amp;app_list=communications/contacts&amp;app=communications/contacts&amp;gaia_rev=9645d45d5777880e&amp;gecko_rev=f6259882882b&amp;plot=median">Datazilla</a> already shows the change.</p> <p><img src="/content/images/2015/03/contactsperfimprovement-1.png" alt="Datazilla changes" /></p> <p><a href="https://github.com/stasm/test-perf-summary">Comparing</a> the results of running the <a href="https://developer.mozilla.org/en-US/Firefox_OS/Platform/Automated_testing/Gaia_performance_tests">Gaia performance tests</a> with a heavy workload before and after the patch we get the following numbers:</p> <table> <thead> <tr> <th style="text-align: center">communications/contacts (means in ms)</th> <th>Base</th> <th>Patch</th> <th>Δ</th> </tr> </thead> <tbody> <tr> <td style="text-align: center">moz-chrome-dom-loaded</td> <td>1147</td> <td>585</td> <td>-562</td> </tr> <tr> <td style="text-align: center">moz-chrome-interactive</td> <td>1267</td> <td>1393</td> <td>126</td> </tr> <tr> <td style="text-align: center">moz-app-visually-complete</td> <td>1601</td> <td>874</td> <td>-727</td> </tr> <tr> <td style="text-align: center">moz-content-interactive</td> <td>2131</td> <td>1393</td> <td>-738</td> </tr> <tr> <td style="text-align: center">moz-app-loaded</td> <td>10942</td> <td>10409</td> <td>-533</td> </tr> </tbody> </table> <p>As you can see we are sending the <code class="language-plaintext highlighter-rouge">moz-app-visually-complete</code> event ~727 milliseconds earlier than before. 
This is the event that we use to indicate that the application appears visually ready for user interaction, and the one that we really want to send as soon as possible. We also get similar improvements for the <code class="language-plaintext highlighter-rouge">moz-chrome-dom-loaded</code>, <code class="language-plaintext highlighter-rouge">moz-content-interactive</code> and <code class="language-plaintext highlighter-rouge">moz-app-loaded</code> events. You can also notice that we had to make some trade-offs and we lost some ground with the <code class="language-plaintext highlighter-rouge">moz-chrome-interactive</code> event. If you look closely, you will see that chrome and content are marked as interactive at the same time. This was not happening before. We were not able to interact with the application content until almost one second after being able to interact with the application chrome. Now we have everything ready at once, ~700 milliseconds earlier, and given that the most important part of the Contacts application is the contacts data itself, we consider that the small loss in the <code class="language-plaintext highlighter-rouge">moz-chrome-interactive</code> event is worth it. You can check the <a href="https://developer.mozilla.org/en-US/Apps/Build/Performance/Firefox_OS_app_responsiveness_guidelines#Stages">MDN responsiveness guidelines</a> page for more details about these events.</p> <p>I recorded a quick video comparing the previous situation (left) with the current one (right). (Apologies for the low quality of the recording.)</p> <iframe title="vimeo-player" src="https://player.vimeo.com/video/121901924" width="710" height="360" frameborder="0" allowfullscreen=""></iframe> <h2 id="how-did-we-get-there">How did we get there</h2> <p>The target was to have some usable UI ready as soon as possible before the browser painted anything on the screen.
For us, this usable UI is the application chrome with the <code class="language-plaintext highlighter-rouge">Add contact</code> and <code class="language-plaintext highlighter-rouge">Settings</code> options and the first chunk of contacts, including favorite and <a href="http://en.wikipedia.org/wiki/In_case_of_emergency">ICE</a> contacts.</p> <p>So far, we were not doing badly at showing the application chrome, but we were taking extra time to load the first group of visible contacts. To show this first content, we needed to make a request to the <a href="https://developer.mozilla.org/en-US/docs/Web/API/Navigator/mozContacts">MozContacts API</a> to obtain the list of stored contacts and start appending one new node to the DOM for each contact retrieved. The thing is that the result of this request rarely changed from one execution to the next. So why not cache it?</p> <p>We followed the same approach that the Email team already applied in the <a href="https://groups.google.com/forum/#!topic/mozilla.dev.gaia/v_jVuwOJMKI">Email application</a> for caching the email list. We used <a href="https://developer.mozilla.org/en-US/docs/Web/API/Window/localStorage">localStorage</a> to save the result of getting the contacts list from the MozContacts API and rendering the first chunk of contacts. To avoid having to do object serialization and parsing before and after accessing the localStorage item, we initially tried storing the whole <a href="https://developer.mozilla.org/en-US/docs/Web/API/Element/outerHTML">outerHTML</a> string of the <a href="https://github.com/mozilla-b2g/gaia/blob/master/apps/communications/contacts/index.html#L218">contacts groups container</a> holding the first chunk of contacts and applying it via <a href="https://developer.mozilla.org/en-US/docs/Web/API/Element/innerHTML">innerHTML</a>, but that did not give good enough performance and it made the logic for managing the contacts cache harder.
Also, in the end we figured out that we needed to store other information, like the language direction or the cache date, along with the HTML to decide whether the cache was valid or not, so object serialization and parsing was required in any case. Instead of that, we ended up storing an object with this information to ease the cache eviction decision and enough information to rebuild the DOM containing the first chunk of contacts. We applied this data to the DOM via <a href="https://developer.mozilla.org/en-US/docs/Web/API/DocumentFragment">documentFragment</a>. You can check out the code for <a href="https://github.com/mozilla-b2g/gaia/blob/master/apps/communications/contacts/js/views/list.js#L2274">building</a> and <a href="https://github.com/mozilla-b2g/gaia/blob/master/apps/communications/contacts/js/bootstrap.js#L173">applying</a> the cache.</p> <p>The trickiest part of maintaining this cache is the eviction policy. We need to evict and rebuild the cache every time a contact is changed (added, removed or edited), and because this can happen from inside and from outside of the Contacts app (even when the app is closed), we need to be especially careful and verify the cache after applying it to the DOM without affecting the performance or causing visual reflows. You can follow <a href="https://github.com/mozilla-b2g/gaia/blob/master/apps/communications/contacts/js/views/list.js#L2375">this code</a> to see how we managed to do that. Other scenarios where we need to evict the cache are language direction changes, <a href="http://en.wikipedia.org/wiki/In_case_of_emergency">ICE</a> contacts changes, favorite contacts modifications and when the user changes the way the contacts are displayed (by first or last name).</p> <p>Apart from building the cache mechanism, we also changed the application bootstrap process so that we only load the minimum set of scripts required to get the cached information from localStorage and to apply it in the DOM.
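The cached object approach described above can be sketched roughly as follows. This is a hypothetical TypeScript simplification: the function names, the entry fields and the `KVStore` abstraction (standing in for `window.localStorage`) are illustrative, not the actual Gaia code.

```typescript
// Sketch of a localStorage-backed render cache with eviction on
// language direction change. Illustrative names, not the Gaia code.

interface CacheEntry {
  createdAt: number;         // when the cache was built
  languageDirection: string; // "ltr" | "rtl" at build time
  firstChunk: string[];      // pre-rendered markup for the first contacts
}

// Minimal storage interface so the sketch runs outside a browser;
// in the real app this role is played by window.localStorage.
interface KVStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const CACHE_KEY = "contacts-cache";

function buildCache(store: KVStore, firstChunk: string[], dir: string): void {
  const entry: CacheEntry = {
    createdAt: Date.now(),
    languageDirection: dir,
    firstChunk,
  };
  // One serialized object: a single read/parse at startup.
  store.setItem(CACHE_KEY, JSON.stringify(entry));
}

// Apply the cache optimistically at startup. Returns the pre-rendered
// chunk, or null (after evicting) when the entry is no longer valid.
function applyCache(store: KVStore, currentDir: string): string[] | null {
  const raw = store.getItem(CACHE_KEY);
  if (raw === null) return null;
  const entry: CacheEntry = JSON.parse(raw);
  if (entry.languageDirection !== currentDir) {
    store.removeItem(CACHE_KEY); // evict: stale layout direction
    return null;
  }
  return entry.firstChunk;
}

// In-memory KVStore so the sketch is self-contained.
function memoryStore(): KVStore {
  const m = new Map<string, string>();
  return {
    getItem: (k) => (m.has(k) ? m.get(k)! : null),
    setItem: (k, v) => { m.set(k, v); },
    removeItem: (k) => { m.delete(k); },
  };
}
```

A minimal bootstrap path would call something like `applyCache` before loading the rest of the scripts, and the full app would later verify the applied markup against the real MozContacts data, evicting and rebuilding on any mismatch.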
Once this process is completed, we load the rest of the application JavaScript required to continue the boot process. You can see this logic in the new <a href="https://github.com/mozilla-b2g/gaia/blob/master/apps/communications/contacts/js/bootstrap.js#L410">bootstrap</a> script.</p> <h2 id="next-steps">Next steps</h2> <p>We want to keep improving the performance of the Contacts application. The next thing we want to target is improving the loading of the contacts thumbnails. In fact, <a href="https://twitter.com/mepartoconmigo">Francisco Jordano</a> has already started working on <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=1089538">it</a> and there are some visible improvements already.</p> <iframe width="710" height="381" src="https://www.youtube-nocookie.com/embed/lOx-Ym2qUlM" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe> <p>We also want to experiment with caching most of, or even the whole, contacts list in different chunks, to let the user use the alpha scrolling and to get a fully loaded application even sooner.</p> <p>Finally, the Gaia team is starting to play around with <a href="http://www.html5rocks.com/en/tutorials/service-worker/introduction/">Service Workers</a> and with the idea of using this new feature to cache already rendered views in a similar way to what we did for Contacts. 
I cannot wait to see more progress in this area :)</p> <h2 id="credits">Credits</h2> <p><a href="https://twitter.com/mepartoconmigo">Francisco Jordano</a>, <a href="https://github.com/JohanLorenzo">Johan Lorenzo</a>, <a href="http://sergimansilla.com/">Sergi Mansilla</a>.</p> Wed, 11 Mar 2015 00:00:00 +0000 https://ferjm.github.io/firefoxos/performance/2015/03/11/improving-fxos-contacts-application-start-up-time.html https://ferjm.github.io/firefoxos/performance/2015/03/11/improving-fxos-contacts-application-start-up-time.html firefoxos performance Behind the scenes of the new Web Payments API from Mozilla <p>When we started working on <a href="http://blog.digital.telefonica.com/?press-release=telefonica-outlines-launch-plans-for-firefox-os">Firefox OS</a>, we realized that one of the biggest challenges would be enabling web application developers to securely monetise their content, not only for Firefox OS but for the Open Web in general.</p> <p>We were looking for the same seamless experience that developers find in existing mobile app stores but we wanted to avoid tying them to any store or proprietary solution, while also allowing them to use the same payment mechanisms in desktop and mobile. We also had the challenge of easing the user’s payment process by adding carrier billing capabilities, along with other payment methods like credit cards, to this solution.</p> <p>The Mozilla Marketplace team already had an experimental feature based on <a href="https://developers.google.com/commerce/wallet/digital/docs/index">google.payments.inapp.buy</a> to allow developers to add the capability for in-app payments to their apps. However, this solution was tied to the Firefox Marketplace and, as with Google’s solution, it involved the injection of a JS shim in the application code. 
So even though we liked this approach, we needed to modify it to fulfill our self-imposed requirements.</p> <p>With <a href="http://andreasgal.com/">Andreas Gal’s</a> help, we started writing the first draft of <a href="https://docs.google.com/document/d/1NLKbHVPQXa9uvDBC3cfgOD7sIrtIxi0qDoXMQrxcCsI/edit">navigator.mozPay()</a> with the intention of proposing the first steps for an API that allows Open Web Apps to initiate payment requests from the user for digital goods with multiple payment providers and carrier billing options.</p> <p>Once we had an agreement from both the Telefónica and Mozilla teams, we started implementing it for Firefox OS with valuable support from <a href="https://github.com/fabricedesre">Fabrice Desré</a> and <a href="https://github.com/kumar303">Kumar McMillan</a>. Along with this work, the <a href="http://bluevia.com/">BlueVia</a> and Firefox Marketplace teams, one after the other, also started working on first implementations of the WebPaymentProvider <a href="https://wiki.mozilla.org/WebAPI/WebPaymentProvider">spec</a>, and we started working with <a href="http://bango.com/">Bango</a> to enable them as a payment partner.</p> <p><strong>How it works</strong></p> <p>The navigator.mozPay API allows the developer to create payment requests for different payment providers to charge a user for the purchase of a digital good. To create each payment request, the developer needs to generate a <a href="http://openid.net/specs/draft-jones-json-web-token-07.html">JSON Web Token (JWT)</a> for each payment provider, signed with the Application Secret given by the corresponding provider. 
This token contains the details of the payment request, including the Application Key, which uniquely identifies the developer and the product being sold.</p> <div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w"> </span><span class="nl">"iss"</span><span class="p">:</span><span class="w"> </span><span class="err">APPLICATION_KEY</span><span class="p">,</span><span class="w"> </span><span class="nl">"aud"</span><span class="p">:</span><span class="w"> </span><span class="s2">"marketplace.firefox.com"</span><span class="p">,</span><span class="w"> </span><span class="err">...</span><span class="w"> </span><span class="nl">"request"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Magical Unicorn"</span><span class="p">,</span><span class="w"> </span><span class="nl">"pricePoint"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="nl">"postbackURL"</span><span class="p">:</span><span class="w"> </span><span class="s2">"https://yourapp.com/postback"</span><span class="p">,</span><span class="w"> </span><span class="nl">"chargebackURL"</span><span class="p">:</span><span class="w"> </span><span class="s2">"https://yourapp.com/chargeback"</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="p">}</span><span class="w"> </span></code></pre></div></div> <p>Applications using navigator.mozPay asynchronously receive responses about the completion of payment requests through JavaScript callbacks and through POST notifications made by the payment provider to the URLs specified by the developer within the JWT request as postbackURL (for payments) and chargebackURL (for refunds) parameters. 
The application must only rely on the server-side notification to determine the result of a purchase.</p> <div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="kd">const</span> <span class="nx">request</span> <span class="o">=</span> <span class="nb">navigator</span><span class="p">.</span><span class="nx">mozPay</span><span class="p">([</span><span class="nx">signedJWT1</span><span class="p">,</span> <span class="nx">signedJWTn</span><span class="p">]);</span> <span class="nx">request</span><span class="p">.</span><span class="nx">onsuccess</span> <span class="o">=</span> <span class="kd">function</span><span class="p">(</span><span class="nx">evt</span><span class="p">)</span> <span class="p">{</span> <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="s2">`Payment flow successfully completed </span><span class="p">${</span><span class="nx">evt</span><span class="p">.</span><span class="nx">target</span><span class="p">.</span><span class="nx">result</span><span class="p">}</span><span class="s2">`</span><span class="p">);</span> <span class="c1">// The payment buy flow completed without errors.</span> <span class="c1">// This does NOT mean the payment was successful.</span> <span class="nx">waitForServerPostback</span><span class="p">();</span> <span class="p">};</span> <span class="nx">request</span><span class="p">.</span><span class="nx">onerror</span> <span class="o">=</span> <span class="kd">function</span><span class="p">(</span><span class="nx">evt</span><span class="p">)</span> <span class="p">{</span> <span class="nx">console</span><span class="p">.</span><span class="nx">error</span><span class="p">(</span><span class="s2">`navigator.mozPay() error: </span><span class="p">${</span><span class="nx">evt</span><span class="p">.</span><span class="nx">target</span><span class="p">.</span><span class="nx">errorMsg</span><span class="p">.</span><span class="nx">name</span><span class="p">}</span><span class="s2">`</span><span class="p">);</span> <span class="p">};</span> </code></pre></div></div> <p>For more in-depth documentation, read the <a href="https://wiki.mozilla.org/WebAPI/WebPayment">navigator.mozPay()</a> spec and the Firefox Marketplace <a href="https://developer.mozilla.org/en-US/docs/Apps/Publishing/In-app_payments">guide</a> to in-app payments.</p> <p>The Firefox Marketplace itself uses navigator.mozPay() to request payments for application purchases, so it is a good proof of concept of the WebPayments API in use.</p> <p><strong>What next?</strong></p> <p>As Mozilla wrote on their <a href="https://hacks.mozilla.org/2013/04/introducing-navigator-mozpay-for-web-payments/">blog</a> last week, navigator.mozPay() is an experimental API and just a first step towards an Open Web Standard for payments. We are already working on some improvements, such as removing the <a href="https://groups.google.com/forum/?fromgroups=#!topic/mozilla.dev.webapps/0vUFHASyWB4">server pre-requisite</a> entirely and providing a better user experience for the <a href="https://groups.google.com/forum/?fromgroups=#!topic/mozilla.dev.b2g/4-FVBgM577I">payment flow</a> on the payment provider’s side.</p> <p>The plan is to keep working closely with Mozilla and others through the W3C to make a flexible API for payments part of the Open Web Standards.</p> <p>After the launch of the first Firefox OS devices, we will be helping Mozilla add support for navigator.mozPay() to Firefox desktop and Firefox for Android.</p> <p><small>Originally published at <a href="http://en.blogthinkbig.com/2013/04/09/mozilla-web-payments-api/">BlogThinkBig</a></small></p> Tue, 09 Apr 2013 00:00:00 +0000 https://ferjm.github.io/web/payments/api/firefoxos/telefonica/mozilla/2013/04/09/behind-scenes-web-payments-api.html https://ferjm.github.io/web/payments/api/firefoxos/telefonica/mozilla/2013/04/09/behind-scenes-web-payments-api.html web payments api firefoxos telefonica mozilla