JGriffin's Blog

Is this thing on?

Engineering Productivity Update, November 5, 2015

It’s the first week of November, and because of the December all-hands and the end-of-year holidays, this essentially means the quarter is half over. You can see what the team is up to and how we’re tracking against our deliverables with this spreadsheet.

Highlights

hg.mozilla.org: gps did some interesting work investigating ways to increase cloning performance on Windows; it turns out that closing files that have been appended to is a very expensive operation there. He also helped roll out bundle-related cloning improvements in Mercurial 3.6.

Community: jmaher has posted details about our newest Quarter of Contribution. One of our former Outreachy interns, adusca, has blogged about what she gets out of contributing to open source software.

MozReview and Autoland: dminor blogged about the most recent MozReview work week in Toronto. Meanwhile, mcote is busy trying to design a more intuitive way to deal with parent-child review requests. And glob, who is jumping in to help out with MozReview, has created a high-level diagram sketching out MozReview’s primary components and dependencies.

Autoland has been enabled for the version-control-tools repo and is being dogfooded by the team. We hope to have it turned on for landings to mozilla-inbound within a couple of weeks.

Treeherder: the team is in London this week working on the automatic starring project. They should be rolling out an experimental UI soon for feedback from sheriffs and others. armenzg has fixed several issues with automatic backfilling so it should be more useful.

Perfherder: wlach has blogged about recent improvements to Perfherder, including the ability to track the size of the Firefox installer.

Developer Workflows: gbrown has enabled |mach run| to work with Android.

TaskCluster Support: the mochitest-gl job on linux64-debug is now running in TaskCluster side-by-side with buildbot. Work is ongoing to green up other suites in TaskCluster. A few other problems (like failure to upload structured logs) need to be fixed before we can turn off the corresponding buildbot jobs and make the TaskCluster jobs “official”.

e10s Support: we are planning to turn on e10s tests on Windows 7 as they are greened up; the first job which will be added is the e10s version of mochitest-gl, and the next is likely mochitest-devtools-chrome. To help mitigate capacity impacts, we’ve turned off Windows XP tests by default on try in order to allow us to move some machines from the Windows XP pool to the Windows 7 pool, and some machines have already been moved from the Linux 64 pool (which only runs Talos and Android x86 tests) to the Windows 7 pool. Combined with some changes recently made by Releng, Windows wait times are currently not problematic.

WebDriver: ato, jgraham and dburns recently went to Japan to attend W3C TPAC to discuss the WebDriver specification. They will be extending the charter of the working group to get it through to CR. This will mean certain parts of the specification need to be finished as soon as possible to start getting feedback.

The Details

hg.mozilla.org

  • Better error messages during SSH failures (bug 1217964)
  • Make pushlog compatible with Mercurial 3.6 (bug 1217569)
  • Support Mercurial 3.6 clone bundles feature on hg.mozilla.org (bug 1216216)
    • Functionality from bundleclone extension that Mozilla wrote and deployed is now a feature in Mercurial itself!
  • Advertise clone bundles feature to 3.6+ clients that don’t have it enabled (bug 1217155)
  • Update the bundleclone extension to seamlessly integrate with now built-in clone bundles feature

MozReview/Autoland

  • We’ve enabled Autoland to “Inbound” for the version-control-tools repository and are dogfooding it while working on UI and workflow improvements.
  • Following up on some discussion around “squashed diffs”, an explanatory note has been added to the parent (squashed) review requests, which serves to distinguish them from, and to promote, review requests for individual commits.
  • “Complete Diff” has been renamed to the more accurate “Squashed Diff”. The “Review Summary” link has been removed, but you can still get to the squashed-diff reviews via the squashed diff itself—but note that we’ll likely be removing support for squashed-diff reviews in order to promote the practices of splitting up large commits into smaller, standalone ones and reviewing each individually.
  • A patch to track the files review status is now under review; it should land in the next few days.

Mobile Automation

  • [gbrown] ‘mach run’ now supports Firefox for Android
  • [bc] Helping out with Autophone Talos, mozdevice adb*.py maintenance

Firefox and Media Automation

  • [maja_zf] Marionette test runner is now a little more flexible and extensible: I’ve added some features needed by Firefox UI and Update tests that are useful to all desktop tests. (Bug 1212608)

ActiveData

  • [ekyle] Buildbot JSON logs are imported, along with all text logs they point to: we should now have a complete picture of the time spent on all steps by all machines on all pools.   Still verifying the data though.

bugzilla.mozilla.org

  • (bug 1213757) delegate password and 2fa resets to servicedesk
  • (bug 1218457) Allow localconfig to override (force) certain data/params values (needed for AWS)
  • (bug 1219750) Allow Apache2::SizeLimit to be configured via params
  • (bug 1177911) Determine and implement better password requirements for BMO
  • (bug 1196743) – Fix information disclosure vulnerability that allows attacker to obtain victim’s GitHub OAuth return code

Perfherder/Performance Testing

TaskCluster Support

  • [ahal] mochitest-webgl running on linux64 debug on all trunk branches

General Automation

WebDriver

  • We (ato, AutomatedTester, jgraham) went to Japan to W3C TPAC to discuss the WebDriver specification. We will be extending the charter of the working group to get it through to CR. This will mean certain parts of the specification need to be finished as soon as possible to start getting feedback.

 

Engineering Productivity Update, Oct 21, 2015

It’s Q4, and at Mozilla that means it’s planning season. There’s a lot of work happening to define a Vision, Strategy and Roadmap for all of the projects that Engineering Productivity is working on; I’ll share progress on that over the next couple of updates.

Highlights

Build System: Work is starting on a comprehensive revamp of the build system, which should make it modern, fast, and flexible. A few bits of this are underway (like migration of remaining Makefiles to moz.build); more substantial progress is being planned for Q1 and the rest of 2016.

Bugzilla: Duo 2FA support is coming soon! The necessary Bugzilla changes have landed; we’re just waiting for some licensing details to be sorted out.

Treeherder: Improvements have been made to the way that sheriffs can backfill jobs in order to bisect a regression. Meanwhile, lots of work continues on backend and frontend support for automatic starring.
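
To make the backfilling idea concrete, here is a minimal sketch of how one might pick which pushes to backfill when bisecting a regression. It is not Treeherder's implementation; the function name `pushes_to_backfill` and the "pass"/"fail"/None encoding are assumptions made up for this illustration.

```python
def pushes_to_backfill(results):
    """Return indexes of untested pushes between the last pass and first fail.

    results is ordered oldest to newest; each entry is "pass", "fail",
    or None when the job was coalesced and never ran on that push.
    """
    first_fail = next((i for i, r in enumerate(results) if r == "fail"), None)
    if first_fail is None:
        # Nothing failed, so there is no regression range to narrow down.
        return []
    last_pass = max((i for i in range(first_fail) if results[i] == "pass"),
                    default=-1)
    # Only pushes without a result inside the range need a backfilled job.
    return [i for i in range(last_pass + 1, first_fail) if results[i] is None]


print(pushes_to_backfill(["pass", None, None, "fail", None]))
```

Running the job on the returned pushes narrows the regression to a single push, which is the point of backfilling.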

Perfherder and Performance Testing: Some optimizations were made to Perfherder which have made it more performant – no one wants a slow performance monitoring dashboard! jmaher and bc are getting close to being able to run Talos on real devices via Autophone; some experimental runs are already showing up on Treeherder.

MozReview and Autoland: It’s no longer necessary to have an LDAP account in order to push commits to MozReview; all that’s needed is a Bugzilla account. This opens the door to contributors using the system. Testing of Autoland is underway on MozReview’s dev instance – expect it to be available in production soon.

TaskCluster Migration: OSX cross-compiled builds are now running in TaskCluster and appearing in Treeherder as Tier-2 jobs, for debug and static checking. The TC static checking build will likely become the official build soon (and the buildbot build retired); the debug build won’t become official until work is done to enable existing test jobs to consume the TC build.

Work is progressing on enabling TaskCluster test jobs for linux64-debug; our goal is to have these all running side-by-side with the buildbot jobs this quarter, so we can compare failure rates before turning off the corresponding buildbot jobs in Q1. Moving these jobs to TaskCluster enables us to chunk them to a much greater degree, which will offer some additional flexibility in automation and improve end-to-end times for these tests significantly.

Mobile Automation: All Android test suites that show in Treeherder can now be run easily using mach.

Dev Workflow: It’s now easier to create new web-platform-tests, thanks to a new |mach web-platform-tests-create| command.

e10s Support: web-platform-tests are now running in e10s mode on linux and OSX platforms. We want to turn these and other tests in e10s mode on for Windows, but have hardware capacity problems. Discussions are underway on how to resolve this in the short-term; longer-term plans include an increase in hardware capacity.

Test Harnesses: run-by-dir is now applied to all mochitest jobs on desktop. This improves test isolation and paves the way for chunking changes which we will use to improve end-to-end times and make bisection turnaround faster. Structured logging has been rolled out to Android reftests; Firefox OS reftests still to come.
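
As a rough illustration of the run-by-dir idea, the sketch below groups test paths by directory, so that each directory could be run in its own fresh browser session. The function name `group_by_dir` and the sample paths are made up for this example; the real mochitest harness logic is more involved.

```python
from collections import OrderedDict
import posixpath


def group_by_dir(test_paths):
    """Group test paths by their directory, preserving first-seen order."""
    groups = OrderedDict()
    for path in test_paths:
        groups.setdefault(posixpath.dirname(path), []).append(path)
    return groups


tests = [
    "dom/media/test/test_a.html",
    "dom/base/test/test_b.html",
    "dom/media/test/test_c.html",
]
for directory, paths in group_by_dir(tests).items():
    # In a run-by-dir setup, each directory would get its own browser session.
    print(directory, len(paths))
```

Because state cannot leak from one directory's tests into another's, directories can be moved between chunks without changing test outcomes.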

ActiveData: Work is in progress to build out a model of our test jobs running in CI, so that we can identify pieces of job setup and teardown which are too slow and targets of possible optimization, and so that we can begin to predict the effects of changes to jobs and hardware capacities.

hg.mozilla.org: Mercurial 3.6 will have built-in support for seeding clones from pre-generated bundle files, and will have improved performance for cloning, especially on Windows.

Marionette and WebDriver: Message sequencing is being added to Marionette; this will help prevent synchronization issues where the client mixes up responses. Client-side work is being done in both Python and node.js. ato wrote an article making a case against visibility checks in WebDriver.
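
To show why message sequencing helps, here is a toy client that tags every command with an ID and matches responses by that ID. This is only a sketch: the dictionary message layout, the `SequencedClient` class and the command names are assumptions for illustration, not Marionette's actual wire format or API.

```python
import itertools


class SequencedClient:
    """Toy client that tags each command with an ID and matches responses by ID."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}

    def send(self, name, params=None):
        """Build a command message and remember it until its response arrives."""
        msg_id = next(self._ids)
        self._pending[msg_id] = name
        return {"id": msg_id, "name": name, "params": params or {}}

    def receive(self, response):
        """Pair a response with the command that caused it, or fail loudly."""
        msg_id = response["id"]
        if msg_id not in self._pending:
            raise ValueError("unexpected response id: %r" % msg_id)
        name = self._pending.pop(msg_id)
        return name, response.get("result")


client = SequencedClient()
first = client.send("getTitle")   # hypothetical command name
second = client.send("getUrl")    # hypothetical command name
# Responses arrive out of order but are still matched to the right command.
print(client.receive({"id": second["id"], "result": "https://example.org/"}))
print(client.receive({"id": first["id"], "result": "Example"}))
```

Without the ID, the client could only assume that responses arrive in the order commands were sent, which is exactly the mix-up sequencing is meant to prevent.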

Details

bugzilla.mozilla.org

  • bug 1199089 – support for Duo 2FA has landed.  It isn’t available just yet as we’re waiting on the licensing situation to be sorted
  • lots of tweaks to the experimental UI

Treeherder

Perfherder/Performance Testing

MozReview/Autoland

TaskCluster Support

  • Landed all patches for cross-mac builds, running fine on inbound/central!
  • [ahal] Got some linux64 tests running (various flavours of mochitest, reftest and xpcshell), though not yet green.

Mobile Automation

  • [gbrown] mach reftest|crashtest|jstestbrowser now supports Firefox for Android (all Android test suites run on treeherder can now be run from mach)

Dev Workflow

  • [jgraham] Added a |mach web-platform-tests-create| target to help with the workflow of creating new web-platform-tests.

Firefox and Media Automation

  • Netflix bandwidth limiting tests blocked because of a problem on Netflix side.
  • Web platform media-source directory no longer being run on our Jenkins since all platforms of web platform tests now run as part of release.
  • We’ve established a roadmap that coordinates moving ui-tests and media-tests in-tree, updating the Marionette test runner and moving media jobs into mozmill-ci

General Automation

  • [jgraham] web-platform-tests-e10s now running across all trees on Mac/Linux (Windows has capacity problems)
  • SETA updated to support new android debug tests
  • run-by-dir is enabled for all desktop mochitests.
  • [ahal] reftest structured logging working on desktop and android (b2g still left to do)

ActiveData

hg.mozilla.org

  • Mercurial 3.6 will have built-in support for seeding clones from pre-generated, externally-hosted bundle files (i.e. the bundleclone extension)
  • Mercurial 3.6 features significant performance improvements for cloning, especially on Windows.

WebDriver

Marionette

Engineering Productivity Update, Oct 1, 2015

We’ve said good-bye to Q3, and are moving on to Q4. Planning for Q4 goals and deliverables is well underway; I’ll post a link to the final versions next update.

Last week, a group of 8-10 people from Engineering Productivity gathered in Toronto to discuss approaches to several aspects of developer workflow. You can look at the notes we took; next up is articulating a formal Vision and Roadmap for 2016, which incorporates both this work and other planning which is ongoing separately for things like MozReview and Treeherder.

Highlights

Bugzilla: Support for 2FA has been enhanced.

Treeherder:

  • The automatic starring backend, along with related database changes, is now in production. In Q4 we’ll be developing a simple UI for this, and by the end of quarter, automatic starring for at least simple failures should be a reality.
  • Treeherder will soon stop posting bug comments for each intermittent failure. Instead OrangeFactor will post periodic summaries on bugs – see: https://groups.google.com/d/msg/mozilla.dev.tree-management/az643p0u4hs/3el7fqIDBwAJ
  • Job Ingestion via Pulse Exchanges is in the final review stages.  This will allow projects like TaskCluster to send JSON Schema-validated job data to Treeherder via a Pulse Exchange, rather than our APIs.  It also gives developers and testers the ability to ingest production jobs from TaskCluster to their local machine.  Blog post: https://cheshirecam.wordpress.com/2015/09/30/treeherder-loading-data-from-pulse/
  • :Goma’s line highlighting and linking in the log viewer are now live. See this blog post for details.
  • Jonathan French, our awesome contractor and contributor, has landed onscreen shortcuts; see this blog post. Jonathan will be moving on to other things soon, and we’ll sorely miss him!

Perfherder and Performance Automation:

  • Work is underway to prototype a UI in Perfherder which can be used for performance sheriffing sans Alert Manager or Graphserver; follow bug 1201154 for more details. Separately, work has been started to allow other performance harnesses (besides Talos) to submit data to Perfherder; see bug 1175295.
  • Talos on linux32 has been turned off; the machines that had been used for this are being repurposed as Windows 7 and Windows 8 test workers, in order to reduce overall wait times on those platforms.
  • The dromaeo DOM Talos test has been enabled on linux64.

MozReview and Autoland: mcote posted a blog post detailing some of the rough edges in MozReview, and explaining how the team intends to tackle these. dminor blogged about the state of autoland; in short, we’re getting close to rolling out an initial implementation which will work similarly to the current “checkin-needed” mechanism, except, of course, it will be entirely automated. May you never have to worry about closed trees again!

Mobile Automation: gbrown made some additional improvements to mach commands on Android; bc has been busy with a lot of Autophone fixes and enhancements.

Firefox Automation: maja_zf has enabled MSE playback tests on trunk, running per-commit. They will go live at the next buildbot reconfig.

Developer Workflow: numerous enhancements have been made to |mach try|; see list below in the Details section.  run-by-dir has been applied to mochitest-plain on most platforms, and to mochitest-chrome-opt, by kaustabh93, one of the team’s contributors. This reduces test bleedthrough, a source of intermittent failures, and improves our ability to change job chunking without breaking tests.

Build System: gps has improved test package generation, which results in significantly faster builds – a savings of about 5 minutes per build on OSX and Windows in automation; about 90s on linux.

TaskCluster Migration: linux64 debug builds are now running, so ahal is unblocked on getting linux64 debug tests running in TaskCluster.  armenzg has landed mozharness code to support running buildbot jobs via TaskCluster scheduling, via buildbot bridge.

The Details

bugzilla.mozilla.org

Treeherder

Perfherder/Performance Testing

TaskCluster Support

Mobile Automation

  • mach cppunittest now supports Firefox for Android
  • mach test commands now download host utilities for Firefox for Android
  • [bc] Autophone
  • Bug 1202826 – Autophone – 2015-09-09 deployment
  • Bug 1202833 – Autophone – CHARGING state should not prevent Autophone shutdown/restart
  • Bug 1201061 – Autophone – deploy robocop_adobe_flash.html
  • Bug 1196115 – Intermittent Crash Autophone S1S2Test beginning 2015-08-18
  • Bug 1207836 – Autophone – 2015-09-23 deployment
  • Bug 1205864 –  Autophone – phonetest.py:Logcat collects duplicate messages
  • Bug 1206954 – Autophone – better handle failures to submit results to PhoneDash
  • Bug 1209796 – Autophone – next deployment (In progress)
  • Bug 1205836 – Autophone – investigate orange for remote nytimes s1s2
  • Bug 1208782 – Autophone – do not attempt to get response json during Treeherder submission error if response is None
  • Bug 1209647 – Autophone – eliminate startup check for network connectivity
  • Bug 1209651 – Autophone – do not allow logcat device error to prevent setup_job initialization
  • Bug 1209653 – Autophone – after clearing logcat, specifying -b main can hang
  • Bug 1209675 – Autophone – Logcat should use PhoneTest loggerdeco
  • Bug 1209691 – Autophone – handle incorrect logcat dates emitted by devices.
  • jmaher/wlach working to get Autophone Talos reporting results to PerfHerder

Firefox and Media Automation

  • [maja_zf] MSE Video Playback buildbot jobs will be deployed to run per-commit on mozilla-inbound any day now…

General Automation

  • [ahal] started work on reftest using structured logging
  • [ahal] consolidate mochitest + xpcshell’s StructuredLog.jsm
  • [jgraham] Landed new |mach try| implementation that passes test paths rather than manifest paths; this adds support for web-platform-tests in |mach try|
  • [jgraham] Added support for saving and reusing try strings in |mach try|
  • [jgraham] Added Talos support to |mach try|
  • [jgraham] reftest and xpcshell test harnesses now take paths to multiple test locations on the command line and expose more functionality through mach
  • [jmaher] Kaustabh93 has runbydir live for mochitest-plain osx debug, and mochitest-chrome opt;  All that is left is mochitest-chrome debug and linux64 ASAN e10s.
  • [ato] Support for running Marionette tests using `mach try` in review

ActiveData

WebDriver (highlights)

  • [ato] Defined remote end steps for Element Clear command
  • [ato] Element location strategies have been outlined
  • [ato] Added steps to Base64 encode screen capture results
  • [ato] Because implementors have relied on prose from outdated sections, warnings were added to those sections which have yet to be redefined
  • + a ton of various fixes and rewording

Marionette

  • [ato] findChildElement and findChildElements commands removed

bughunter

  • [bc] Have been keeping the system running, helping triage bugs
  • [tomcat] Has been filing bugs, sent a September status report to internal set of people.

bugherder

  • bugs 924405/1199788 – Bugherder now uses Bugzilla’s native REST API and can use bugzilla api keys for authentication even when 2FA is enabled.

Firefox build system

  • [gps] Test packaging is now drastically faster in automation. 50% reduction across all platforms. This is a 5+ minute decrease on OS X build jobs!

Engineering Productivity Update, Sept 10, 2015

Highlights

Bugzilla: The BMO team has been busy implementing security enhancements, and as a result, BMO now supports two-factor authentication.  Setting this up is easy through BMO’s Preferences page.

Treeherder: The size of the Treeherder database dropped from ~700G to around ~80G thanks to a bunch of improvements in the way we store data.  Jonathan French is working on improvements to the Sheriff Panel.  And Treeherder is now ingesting data that will be used to support Automatic Starring, a feature we expect to be live in Q4.

Perfherder and Performance: Will Lachance has published a roadmap for Perfherder, and has landed some changes that should improve Perfherder’s performance.  Talos tests on OSX 10.10 have been hidden in Perfherder because the numbers are very noisy; the reason for this is not currently known.  Meanwhile, Talos has finally landed in mozilla-central, which should make it easier to iterate on.  Thanks to our contributor Julien Pagès for making this happen!  Joel Maher has posted a Talos update on dev.platform with many more details.

MozReview and Autoland: The web UI now uses BMO API keys; this should make logins smoother and eliminate random logouts.  Several UI improvements have been made; see full list in the “Details” section below.

Mobile Automation: Geoff Brown has landed the much-requested |mach android-emulator| command, which makes it much easier to run tests locally with an Android emulator.  Meanwhile, we’re getting closer to moving the last Talos Android tests (currently running on panda boards) to Autophone.

Developer Workflow: Our summer intern, Vaibhav Agrawal, landed support for an initial version of |mach find-test-chunk|, which can tell you which chunk a test gets run in.  This initial version supports desktop mochitest only.  Vaibhav gave an intern presentation this week, “Increasing Productivity with Automation”.  Check it out!

General Automation: James Graham has enabled web-platform-tests-e10s on try, but they’re hidden pending investigation of tests which are problematic with e10s enabled.  Joel Maher and Kim Moir in Release Engineering have tweaked our SETA coalescing, so that lower prioritized jobs are run at least every 7th push, or every hour; further increasing the coalescing window will wait until we have better automatic backfilling in place.  Meanwhile, the number of chunks of mochitest-browser-chrome has been increased from 3 to 7, with mochitest-devtools-chrome soon to follow.  This will make backfilling faster, as well as improving turnaround times on our infrastructure.
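
The coalescing rule above (run lower-priority jobs at least every 7th push, or every hour) can be sketched as a simple predicate. The function name and parameters are assumptions made up for this illustration; this is not SETA's or buildbot's actual scheduling code.

```python
def should_run_low_priority(pushes_since_last_run, seconds_since_last_run,
                            max_pushes=7, max_seconds=3600):
    """Return True when a coalesced low-priority job is due to run."""
    return (pushes_since_last_run >= max_pushes
            or seconds_since_last_run >= max_seconds)


print(should_run_low_priority(3, 600))    # neither limit reached
print(should_run_low_priority(7, 600))    # seventh push since the last run
print(should_run_low_priority(2, 3700))   # more than an hour has passed
```

Raising `max_pushes` widens the coalescing window, which saves machine time but leaves more untested pushes to backfill when something breaks.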

hg.mozilla.org: The bzexport and bzpost extensions have been updated to support BMO API keys.

WebDriver and Marionette: Several changes were made to the WebDriver specification, including new chapters on screen capture and user prompts and modal handling.

Bughunter: Our platform coverage now includes opt and debug builds of linux32, linux64, opt-asan-linux64, OSX 10.6, 10.8, 10.9, and windows7 32- and 64-bit.

The Details

bugzilla.mozilla.org
Treeherder
  • A number of optimizations have reduced Treeherder’s db size from ~700GB to ~80GB!
  • For those with Sheriff access, an improved layout for the Sheriff panel will appear soon on production https://tojonmz.wordpress.com/2015/09/04/layout-improvements-to-the-sheriff-panel/ with similar work planned soon for the Filter panel
  • Among other fixes and improvements, Logviewer UI more gracefully handles additional incomplete log states: unknown log steps (1193222) and expired jobs (1193222)
  • A new Help menu has been added with useful links for all users (1199078)
  • We are now storing the data required for the autostarring project. That means storing every single crash/test failure/log error line from the structured log (1182464).
Perfherder/Performance Testing
MozReview/Autoland
  • Autoland VCS interaction performance improvements
  • MozReview web UI now using BMO API keys
  • Login is smoother and faster, and random logouts should be a thing of the past
  • MozReview Mercurial client extension now requires BMO API keys
  • No more defining password and/or cookie in plaintext config files
  • “Ship It” is now determined by reviewer status so random people can’t grant Ship It
  • Messaging during pushing is more clear about what is happening
  • “commitid” hidden metadata is now preserved across `hg rebase`
  • Review buttons and text links in web interface are now consolidated to reduce confusion
  • Empty Try syntax is now rejected properly
  • Try trigger button has been moved to an “Automation” drop-down
  • pull and import commands are now displayed
  • Bugzilla’s commit list is now properly ordered
TaskCluster Support
  • armenzg – Work is underway to support running Buildbot test jobs through TaskCluster
  • ted – successful Mac cross build with Breakpad last week, landing patches and fixing some fallout from Linux TC build switch to CentOS 6
Mobile Automation
Dev Workflow
  • vaibhav1994 – A basic version of find-test-chunk has landed. This will help in determining on which chunk a particular test is present in production. It works for mochitest for desktop platforms, see various options with ‘./mach find-test-chunk’
  • vaibhav1994 – --rebuild-talos option now present in trigger-bot to trigger only talos jobs a certain number of times on try.
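
To illustrate the question find-test-chunk answers, here is a minimal sketch that splits an ordered test list into near-equal contiguous chunks and reports which chunk holds a given test. The even-split strategy and both function names are assumptions for this example; the chunking used in production may weigh directories or runtimes instead.

```python
def split_into_chunks(tests, total_chunks):
    """Split tests into total_chunks contiguous, near-equal slices."""
    base, extra = divmod(len(tests), total_chunks)
    chunks, start = [], 0
    for i in range(total_chunks):
        # The first `extra` chunks each take one additional test.
        size = base + (1 if i < extra else 0)
        chunks.append(tests[start:start + size])
        start += size
    return chunks


def find_test_chunk(test, tests, total_chunks):
    """Return the 1-based chunk number containing test, or None if absent."""
    for number, chunk in enumerate(split_into_chunks(tests, total_chunks), 1):
        if test in chunk:
            return number
    return None


tests = ["test_%02d.html" % n for n in range(10)]
print(find_test_chunk("test_07.html", tests, 3))
```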
Firefox and Media Automation
  • sydpolk – Network bandwidth limiting tests have been written; working to deploy them to Jenkins.
  • sydpolk – Streamlined Jenkins project generation based on Jenkins python API (found out about this at the Jenkins conference last week)
  • sydpolk – Migration of hardware out of MTV2 QA lab won’t happen this quarter because Network Ops people are shutting down the Phoenix data center.
  • maja_zf – mozharness script for firefox-media-tests has been refactored into scripts for running the tests in buildbot and our Jenkins instance
General Automation
  • chmanchester – psutil 3.1.1 is now installed on all test slaves as a part of running desktop unit tests. This will help our test harnesses manage subprocesses of the browser, and particularly kill them to get stacks after a hang.
  • armenzg – Firefox UI tests can now be called through a python wrapper instead of only through a python binary. This is very important since it was causing Windows UAC prompts on Release Engineering’s Windows test machines. The tests now execute well on all test platforms.
  • jgraham – web-platform-tests-e10s now running on try, but still hidden pending some investigation of tests that are only unstable with e10s active
  • SETA work is ongoing to support new platforms, tests, and jobs.  
ActiveData
  • [ekyle] Queries into nested documents pass tests, but do not scale on the large cluster; startup time is unacceptable.  Moving work to separate thread for quick startup, with the hope a complicated query will not arrive until the metadata is finished collecting
  • [ekyle] Added auto-restart on ETL machines that simply stop working (using CloudWatch); probably caused by unexpected data, which must be looked into later.
  • [ekyle] SpotManager config change for r3.* instances
hg.mozilla.org
  • Add times for replication events on push
  • Reformat pushlog messages on push to be less eye bleedy
  • bzexport and bzpost extensions now support Bugzilla API Keys
WebDriver
  • [ato] Specified element interaction commands
  • [ato] New chapter on user prompts and modal handling
  • [ato] New chapter on screen capture
  • [ato] Cookies retrieval bug fixes
  • [ato] Review of normative dependencies
  • [ato] Security chapter review
Marionette
  • Wires 0.4 has been released.
  • [ato] Deployed Marionette protocol changes, bringing Marionette closer to the WebDriver standard
  • [ato] Assorted content space commands converted to use new dispatching technique
  • [jgraham] Updated wires for protocol changes
bughunter
Now running opt, debug tinderbox builds for Linux 32 bit, 64 bit; OSX 10.6, 10.8, 10.9; Windows 7 32 bit, 64 bit; opt asan tinderbox builds for Linux 64 bit.
  • bug 1180749 Sisyphus – Django 1.8.2 support
  • bug 1185497 Sisyphus – Bughunter – use ASAN builds for Linux 64 bit workers
  • bug 1192646 Sisyphus – Bughunter – use crashloader.py to upload urls to be resubmitted
  • bug 1194290 Sisyphus – Bughunter – install gtk3 on rhel6 vms

Engineering Productivity Update, August 26, 2015

It’s PTO season and many people have taken a few days or a week off.  While they’re away, the team continues making progress on a variety of fronts.  Planning also continues for GoFaster and addon-signing, which will both likely be significant projects for the team in Q4.

Highlights

Treeherder: camd rolled out a change which collapses chunked jobs on Treeherder, reducing visual noise.  In the future, we plan on significantly increasing the number of chunks of many jobs in order to reduce runtimes, so this change makes that work more practical.  See camd’s blog post.  emorley has landed a change which allows TaskCluster job errors that occur outside of mozharness to be properly handled by Treeherder.

Automatic Starring: jgraham has developed a basic backend which supports recognizing simple intermittent failures, and is working on integrating that into Treeherder; mdoglio is landing some related database changes. ekyle has received sheriff training from RyanVM, and plans to use this to help improve the automated failure recognition algorithm.

Perfherder and Performance Testing: Datazilla has finally been decommissioned (R.I.P.), in favor of our newest performance analysis tool, Perfherder.  A lot of Talos documentation updates have been made at https://wiki.mozilla.org/Buildbot/Talos, including details about how we perform calculations on data produced by Talos.  wlach performed a useful post-mortem of Eideticker, with several takeaways which should be applicable to many other projects.

MozReview and Autoland: There’s a MozReview meetup underway, so expect some cool updates next time!

TaskCluster Support: ted has made a successful cross-compiled OSX build using TaskCluster!  Take it for a spin.  More work is needed before we can move OSX builds from the mac mini builders to the cloud.

Mobile Automation: gbrown continues to make improvements on the new |mach emulator| command which makes running Android tests locally on emulator very simple.

General Automation: run-by-dir is live on opt mochitest-plain; debug and ASAN coming soon.  This reduces test “bleed-through” and makes it easier to change chunking.  adusca, our Outreachy intern, is working to integrate the try extender into Treeherder.  And ahal has merged the mozharness “in-tree” configs with the regular mozharness config files, now that mozharness lives in the tree.

Firefox Automation: YouTube ad detection has been improved for firefox-media-tests by maja, which fixes the source of the top intermittent failure in this suite.

Bughunter: bc has got asan-opt builds running in production, and is working on gtk3 support.

hg.mozilla.org: gps has enabled syntax highlighting in hgweb, and has added a new JSON API as well.  See gps’ blog post.

The Details

bugzilla.mozilla.org
Treeherder
Perfherder/Performance Testing
  • talos cleanup and preparation to move in-tree
  • perfherder database cleanup in progress for simpler and more optimized queries. This is mainly preparatory work for making perfherder capable of managing/starring performance alerts, but as a bonus perfherder compare view should load virtually instantly once this is finished. 
  • most talos wiki docs are updated: https://wiki.mozilla.org/Buildbot/Talos
TaskCluster Support
Mobile Automation
  •  [gbrown] Working on “mach emulator” support: wip can download and run 2.3, 4.3, or x86 emulator images. Integrating with other mach commands like “install” and “mochitest”.
  •  [gbrown] Updated mochitest manifests to run most dom/media mochitests on Android 4.3 (under review, bug 1189784)
Firefox and Media Automation
  • [maja_zf] Improved ad detection on YouTube for firefox-media-tests, which fixes our top intermittent failure for long-running playback tests.
General Automation
  •  run-by-dir is live for mochitest-plain (opt only); debug is coming soon, followed by ASAN.
  • Mozilla CI tools is moving from using BuildAPI as the scheduling entry point to using TaskCluster’s scheduling. This work will allow us to schedule a graph of buildbot jobs and their dependencies in one shot. https://bugzil.la/1194264
  • adusca is integrating into treeherder the ability to extend the jobs run for any push. This is based on the http://try-extender.herokuapp.com prototype. Follow along in https://bugzil.la/1194830
  • Git was deployed to the test machines. This is necessary to make the Firefox UI update tests work on them.
  • [ahal] merge mozharness in-tree configs with the main mozharness configs
ActiveData
  • Bug fixes to the ETL – fix bad lookups on hg repo, mostly l10n builds 
  • More error reporting on ETL – Structured logging has changed a little; handle the new variations, be more graceful when it comes to unknowns, and complain when there is non-conformance.
  • Some work on adding hg `repo` table – acts as a cache for ETL, but can be used to calculate ‘per push’ statistics on OrangeFactor data.
  • Added Talos to the `perf` table – used the old Datazilla ETL code to fill the ES cluster.  This may speed up extracting the replicates, for exploring the behaviour of a test.
  • Enable deep queries – Effectively performing SQL join on Elasticsearch – first attempt did too much refactoring.  Second attempt is simpler, but still slogging through all the resulting test breakage
hg.mozilla.org
WebDriver
  • Updated 
Marionette
  • [ahal] helped review and finish contributor patch for switching marionette_client from optparse to argparse
  • Corrected UUID used for session ID and element IDs
  • Updated dispatching of various marionette calls in Gecko
bughunter
  • [bc] Have asan-opt builds running in production. Finalizing patch. Still need to build gtk3 for rhel6 32bit in order to stop using custom builds and support opt in addition to debug.
charts.mozilla.org
  • Updated the hierarchical burndowns to EPM’s important metabugs that track features 
  • More config changes

Engineering Productivity Update, August 13, 2015

From Automation and Tools to Engineering Productivity

“Automation and Tools” has been our name for a long time, but it is a catch-all name which can mean anything, everything, or nothing, depending on the context. Furthermore, it’s often unclear to others which “Automation” we should own or help with.

For these reasons, we are adopting the name “Engineering Productivity”. This name embodies the diverse range of work we do, reinforces our mission (https://wiki.mozilla.org/Auto-tools#Our_Mission), promotes immediate recognition of the value we provide to the organization, and encourages a re-commitment to the reason this team was originally created—to help developers move faster and be more effective through automation.

The “A-Team” nickname will very much still live on, even though our official name no longer begins with an “A”; the “get it done” spirit associated with that nickname remains a core part of our identity and culture, so you’ll still find us in #ateam, brainstorming and implementing ways to make the lives of Mozilla’s developers better.

Highlights

Treeherder: Most of the backend work to support automatic starring of intermittent failures has been done. On the front end, several features were added to make it easier for sheriffs and others to retrigger jobs to assist with bisection: the ability to fill in all missing jobs for a particular push, the ability to trigger Talos jobs N times, the ability to backfill all the coalesced jobs of a specific type, and the ability to retrigger all pinned jobs. These changes should make bug hunting much easier.  Several improvements were made to the Logviewer as well, which should increase its usefulness.

Perfherder and performance testing: Lots of Perfherder improvements have landed in the last couple of weeks. See details at wlach’s blog post.  Meanwhile, lots of Talos cleanup is underway in preparation for moving it into the tree.

MozReview: Some upcoming auth changes are explained in mcote’s blog post.

Mobile automation: gbrown has converted a set of robocop tests to the newly enabled mochitest-chrome on Android. This is a much more efficient harness and converting just 20 tests has resulted in a reduction of 30 minutes of machine time per push.

Developer workflow: chmanchester is working on building annotations into moz.build files that will automatically select or prioritize tests based on files changed in a commit. See his blog post for more details. Meanwhile, armenzg and adusca have implemented an initial version of a Try Extender app, which allows people to add more jobs on an existing try push. Additional improvements for this are planned.
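
As a rough illustration of selecting tests from the files changed in a commit, here is a minimal sketch. It does not use real moz.build syntax or chmanchester's implementation: the `ANNOTATIONS` table, the suite names, and `select_suites` are hypothetical stand-ins for whatever the annotations end up looking like.

```python
# Hypothetical annotations: source directory prefix -> suites worth running.
ANNOTATIONS = {
    "dom/media/": {"mochitest-media"},
    "testing/marionette/": {"marionette"},
    "layout/": {"reftest", "crashtest"},
}


def select_suites(changed_files, annotations=ANNOTATIONS):
    """Return the sorted list of suites whose prefix matches any changed file."""
    selected = set()
    for path in changed_files:
        for prefix, suites in annotations.items():
            if path.startswith(prefix):
                selected |= suites
    return sorted(selected)


if __name__ == "__main__":
    # File names are made up for the example.
    print(select_suites(["layout/base/Foo.cpp", "dom/media/Bar.cpp"]))
```

A file that matches no prefix selects nothing in this sketch; a real system would need a fallback, such as running a default set of suites.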

Firefox automation: whimboo has written a Q2 Firefox Automation Report detailing recent work on Firefox Update and UI tests. Maja has improved the integration of Firefox media tests with Treeherder so that they now officially support all the Tier 2 job requirements.

WebDriver and Marionette: WebDriver is now officially a living standard. Congratulations to David Burns, Andreas Tolfsen, and James Graham who have contributed to this standard. dburns has created some documentation which describes which WebDriver endpoints are implemented in Marionette.

Version control: The ability to read and extract metadata from moz.build files has been added to hg.mozilla.org. This opens the door to cool future features, like the ability to automatically file bugs in the proper component and automatically select appropriate reviewers when pushing to MozReview. gps has also blogged about some operational changes to hg.mozilla.org which enable easier end-to-end testing of new features, among other things.

The Details

bugzilla.mozilla.org
Treeherder/Automatic Starring
  • almost finished the required changes to the backend (both db schema and data ingestion)
Treeherder/Front End
  • Several retrigger features were added to Treeherder to make merging and bisections easier:  auto fill all missing/coalesced jobs in a push; trigger all Talos jobs N times; backfill a specific job by triggering it on all skipped commits between this commit and the commit that previously ran the job, retrigger all pinned jobs in treeherder.  This should improve bug hunting for sheriffs and developers alike.
  • [jfrench] Logviewer ‘action buttons’ are now centralized in a Treeherder style navbar https://bugzil.la/1183872
  • [jfrench] Logviewer skipped steps are now recognized as non-failures and presented as blue info steps https://bugzil.la/1192195, https://bugzil.la/1192198
  • [jfrench] Middle-mouse-clicking on a job in treeherder now launches the Logviewer https://bugzil.la/1077338
  • [vaibhav] Added the ability to retrigger all pinned jobs (bug https://bugzil.la/1121998)
  • Camd’s job chunking management will likely land next week https://bugzil.la/1163064
Perfherder/Performance Testing
  • [wlach] / [jmaher] Lots of perfherder updates, details here: http://wrla.ch/blog/2015/08/more-perfherder-updates/ Highlights below
  • [wlach] The compare pushes view in Perfherder has been improved to highlight the most important information.
  • [wlach] If your try push contains Talos jobs, you’ll get a url for the Perfherder comparison view when pushing (https://bugzil.la/1185676).
  • [jmaher/wlach] Talos generates suite and test level metrics and perfherder now ingests those data points. This fixes results from internal benchmarks which do their own summarization to report proper numbers.
  • [jmaher/parkouss] Big talos updates (thanks to :parkouss), major refactoring, cleanup, and preparation to move talos in tree.
MozReview/Autoland
Mobile Automation
  •  [gbrown] Demonstrated that some all-javascript robocop tests can run more efficiently as mochitest-chrome; about 20 such tests converted to mochitest-chrome, saving about 30 minutes per push.
  •  [gbrown] Working on “mach emulator” support: wip can download and run 2.3, 4.3, or x86 emulator images. Sorting out cache management and cross-platform issues.
  •  [jmaher/bc] landed code for tp4m/tsvgx on autophone; we are getting closer to running these tests on autophone.
Dev Workflow
  • [ahal] Created patch to clobber compiled python files in srcdir
  • [ahal] More progress on mach/mozlog patch (https://bugzil.la/1027665)
  • [chmanchester] Fix to allow ‘mach try’ to work without test arguments (bug 1192484)
Media Automation
  • [maja_zf] firefox-media-tests ‘log steps’ and ‘failure summaries’ are now compatible with Treeherder’s log viewer, making them much easier to browse. This means the jobs now satisfy all Tier-2 Treeherder requirements.
  • [sydpolk] Refactoring of tests after fixing stall detection is complete. I can now take my network bandwidth prototype and merge it in.
Firefox Automation
General Automation
  • Finished adapting mozregression (https://bugzil.la/1132151) and mozdownload (https://bugzil.la/1136822) to S3.
  • (Henrik) Isn’t archive.mozilla.org only a temporary solution, before we move to TC?
  • (armenzg) I believe so but for the time being I believe we’re out of the woods
  • The manifestparser dependency was removed from mozprofile (bug 1189858)
  • [ahal] Fix for https://bugzil.la/1185761
  • [sydpolk] Platform Jenkins migration to the SCL data center has not yet begun in earnest due to PTO. Hope to start making that transition this week.
  • [chmanchester] work in progress to build annotations in to moz.build files to automatically select or prioritize tests based on what changed in a commit. Strawman implementation posted in https://bugzil.la/1184405 , blog post about this work at http://chmanchester.github.io/blog/2015/08/06/defining-semi-automatic-test-prioritization/
  • [adusca/armenzg] Try Extender (http://try-extender.herokuapp.com ) is open for business; however, a new plan will soon be released to make a better version that integrates well with Treeherder and solves some technical difficulties we’re facing
  • [armenzg] Code has landed on mozci to allow re-triggering tasks on TaskCluster. This allows re-triggering TaskCluster tasks on the try server when they fail.
  • [armenzg] Work to move Firefox UI tests to the test machines instead of build machines is solving some of the crash issues we were facing
ActiveData
  • [ahal] re-implemented test-informant to use ActiveData: http://people.mozilla.org/~ahalberstadt/test-informant/
  • [ekyle] Work on stability: Monitoring added to the rest of the ActiveData machines.  
  • [ekyle] Problem: ES was not balancing the import workload on the cluster, probably because ES assumes symmetric nodes, and we do not have that. The architecture was changed to prefer a better distribution of work (and query load). There now appear to be fewer OutOfMemoryExceptions, despite test-informant’s queries.
  • [ekyle] More Problems: Two servers in the ActiveData complex failed. The first was the ActiveData web server, which became unresponsive, even to SSH; the machine was terminated. The second was the ‘master’ node of the ES cluster. This resulted in total data loss, but it was expected to happen eventually given the cheap configuration we have. A contingency was in place: the master was rebooted, the configuration was verified, and the data was re-indexed from S3. More nodes would help with this, but given the rarity of the event, the contingency plan in place, and the low number of users, it is not yet worth paying for.
WebDriver (highlights)
  • [ato] WebDriver is now officially a living standard (https://sny.no/2015/08/living)
  • [ato] Rewrote chapter on extending the WebDriver protocol with vendor-specific commands
  • [ato] Defined the Get Element CSS Value command in specification
  • [ato] Get Element Attribute no longer conflates DOM attributes and properties; introduces new command Get Element Property
  • [ato] Several significant infrastructural issues with the specification were fixed
Marionette
hg.mozilla.org
charts.mozilla.org
  • Project managers for FxOS have a renewed interest in the project tracking, and overall status dashboards.   Talk only, no coding yet.

A-Team Update, July 29, 2015

Highlights

Treeherder: We’ve added to mozlog the ability to create error summaries which will be used as the basis for automatic starring.  The Treeherder team is working on implementing database changes which will make it easier to add support for that.  On the front end, there’s now a “What’s Deployed” link in the footer of the help page, to make it easier to see what commits have been applied to staging and production.  Job details are now shown in the Logviewer, and a mockup has been created of additional Logviewer enhancements; see bug 1183872.

MozReview and Autoland: Work continues to allow autoland to work on inbound; MozReview has been changed to carry forward r+ on revised commits.

Bugzilla: The ability to search attachments by content has been turned off; BMO documentation has been started at https://bmo.readthedocs.org.

Perfherder/Performance Testing: We’re working towards landing Talos in-tree.  A new Talos test measuring tab-switching performance has been created (TPS, or Talos Page Switch); e10s Talos has been enabled on all platforms for PGO builds on mozilla-central.  Some usability improvements have been made to Perfherder – https://treeherder.mozilla.org/perf.html#/graphs.

TaskCluster: Successful OSX cross-compilation has been achieved; working on the ability to trigger these on Try and sorting out details related to packaging and symbols.  Work on porting Linux tests to TaskCluster is blocked due to problems with the builds.

Marionette: The Marionette-WebDriver proxy now works on Windows.  Documentation on using this has been added at https://developer.mozilla.org/en-US/docs/Mozilla/QA/Marionette/WebDriver.

Developer Workflow: A kill_and_get_minidump method has been added to mozcrash, which allows us to get stack traces out of Windows mochitests in more situations, particularly plugin hangs.  Linux xpcshell debug tests have been split into two chunks in buildbot in order to reduce E2E times, and chunks of mochitest-browser-chrome and mochitest-devtools-chrome have been re-normalized by runtime across all platforms.  Now that mozharness lives in the tree, we’re planning on removing the “in-tree configs”, and consolidating them with the previously out-of-tree mozharness configs (bug 1181261).
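
The update does not say how the chunks were re-normalized, so the following is only one plausible approach, sketched under that assumption: a greedy assignment that places the longest-running item into whichever chunk currently has the smallest total runtime. The function name and the sample runtimes are hypothetical.

```python
import heapq


def chunk_by_runtime(runtimes, total_chunks):
    """Greedily assign each test to the currently lightest chunk.

    `runtimes` maps a test (or directory) name to its runtime in seconds.
    """
    # Heap of (total runtime so far, chunk index); lightest chunk pops first.
    heap = [(0.0, index) for index in range(total_chunks)]
    heapq.heapify(heap)
    chunks = [[] for _ in range(total_chunks)]
    # Longest items first; ties broken by name so the result is deterministic.
    ordered = sorted(runtimes.items(), key=lambda item: (-item[1], item[0]))
    for name, seconds in ordered:
        total, index = heapq.heappop(heap)
        chunks[index].append(name)
        heapq.heappush(heap, (total + seconds, index))
    return chunks


if __name__ == "__main__":
    # Runtimes are made up for the example.
    print(chunk_by_runtime({"a": 50, "b": 30, "c": 20, "d": 10}, 2))
```

With the sample data the two chunks end up with 60 and 50 seconds of work, rather than the 80 and 30 that an alphabetical split would give.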

Tools: We’re testing an auto-backfill tool which will automatically retrigger coalesced jobs in Treeherder that precede a failing job.  The goal is to reduce the turnaround time required for this currently manual process, which should in turn reduce tree closure times related to test failures.
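
The core of backfilling can be sketched in a few lines. This is an illustration of the idea only, not the tool's code: the data model (one result per push, with None standing for a coalesced job) and the function name are assumptions.

```python
def pushes_to_backfill(results, failing_index):
    """Return indices of coalesced pushes immediately preceding a failing job.

    `results` lists one job's result per push, oldest first; None means the
    job was coalesced (skipped) on that push.
    """
    to_trigger = []
    index = failing_index - 1
    # Walk back until we reach a push where the job actually ran.
    while index >= 0 and results[index] is None:
        to_trigger.append(index)
        index -= 1
    return list(reversed(to_trigger))


if __name__ == "__main__":
    history = ["success", None, None, "failure"]
    print(pushes_to_backfill(history, 3))
```

Triggering the job on exactly those pushes narrows the regression range to a single push without anyone having to retrigger by hand.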

The Details

bugzilla.mozilla.org

Treeherder/Automatic Starring

  • We’re generating error summaries now that will serve as the basis for automatic starring work.

Treeherder/Front End

  • New “What’s Deployed” feature in Help footer to view stage/prod deployment status
  • Logviewer now contains the full ‘Job Info’ aka. tinderbox printlines (bug 1092209)
  • Created a mock of logviewer UI changes (bug 1183872)

Perfherder/Performance Testing

  • Working towards moving Talos code in-tree (bug 787200)
  • New Talos test TPS (Talos Page Switch) (bug 1166132)
  • Fixed a few data ingestion/duplication cases.
  • Adjusting calculation of suite summaries to match graph server, not finished yet (tracking: bug 1184968)
  • e10s Talos is enabled on all platforms, but only runs on mozilla-central for PGO builds; broken tests and big regressions are tracked in bug 1144120
  • perfherder is easier to use, with some polish on test selection and the compare view, and most importantly we have found a few odd bugs that have caused duplicate data to show up; check it out: https://treeherder.mozilla.org/perf.html#/graphs
  • Starting the work of moving Android Talos to Autophone (bug 1170685)

MozReview/Autoland

  • bug 1184079 – Fix for autopublishing when authenticating to MozReview via BMO cookies
  • bug 1178025 – Commits table looks nicer
  • bug 1175166 – r+ is now carried forward on commits from level 3 authors

TaskCluster Support

Mobile Automation

  • Continued work on porting android talos tests to autophone; remaining work is to figure out posting results and to ensure it runs regularly and reliably.
  • Support for the Android stock browser and Dolphin has been added to mozbench (bug 1103134)

Dev Workflow

  • Created patch that replaces mach’s logger with mozlog. Still several rough edges and perf issues to iron out

Media Automation

  • The new MSE rewrite is now enabled by default on Nightly and we’re replacing a few tests in response: bug 1186943 – detection of video stalls has to respond to new internal strings from the new MSE implementation by :jya.
  • firefox-media-tests mozharness log is now parsed into steps for Treeherder’s Log Viewer
  • Fixed a problem with automation scripts for WebRTC tests for Windows 64.

General Automation

  • Moved mozlog.structured to top-level mozlog, and released mozlog 3.0
  • Added a kill_and_get_minidump method to mozcrash (bug 890026). As a result we’re getting minidumps out of Windows mochitests under more circumstances (in particular, plugin hangs in certain intermittently failing tests).
  • The MozillaPulse consumer now supports listening to multiple exchanges simultaneously (bug 1180897).
  • Bug 1186420 – Autophone – update requirements and deploy thclient 1.6
  • Bughunter moved to SCL3 without interruption
  • Bug 1185498 – Sisyphus – Bughunter – consume urls directly from Socorro
  • linux debug xpcshell was split into two chunks to reduce E2E times (bug 1185499)
  • runtimes for mochitest-browser-chrome and mochitest-devtools have been renormalized across all platforms
  • Allow Firefox UI tests to determine where to get Firefox crash symbols for releases and improve reproducibility
  • Testing auto-backfill in production (bug 1180732)
  • Now that mozharness lives in the tree, we’re going to remove the “in-tree configs”, which will consolidate mozharness options and make maintenance simpler (bug 1181261)

ActiveData

  • ActiveData requires monitoring on all nodes before it can be left alone for more than a day without it failing:
    • Made a fork of Supervisor to run simple Cron jobs – the biggest task was finding and installing (and compiling!) the C libraries used
    • Added Supervisor to spot instances to monitor ES; not just the process, but query response time.  Also monitoring the indexing jobs.
  • Replicated OrangeFactor to ActiveData so that a master’s student (and the public) can query it or extract it.

Marionette

  • Landed Proxy support via capabilities
  • Updating cookie support to return httpOnly flag
  • Added a –version arg to Marionette (bug 1183157)
  • Landing support for W3C Compatible Drivers in Selenium Tree and released 2.46.1 so users can use it.
  • Wrote a small guide to use it https://developer.mozilla.org/en-US/docs/Mozilla/QA/Marionette/WebDriver
  • Marionette<->WebDriver Proxy now works on Windows, Linux and OSX as of 0.3.0

Automation and Tools Team Update, July 16 2015

The Automation and Tools Team (the A-Team, for short) is a large team that oversees a diverse set of services, tools and test harnesses used by nearly everyone at Mozilla.  We’re borrowing a page from Release Engineering and publishing a series of updates to inform people about what we’re up to, in the hopes of fostering better visibility and inter-team coordination.

Highlights

Treeherder and Automatic Starring: Our focus for Treeherder in Q3 will be improving the signal-to-noise ratio for dealing with intermittent oranges. An overall design has been agreed to for the “automatic starring” project, and work has begun; final rollout is likely in Q4. This quarter, we’ll also stop spamming Bugzilla with comments for each intermittent, but we will put in place an alternate notification system for people who rely on Bugzilla orange comments to determine when an intermittent needs attention. We’ve also agreed on a redesign for the Logviewer that should result in a more useful and intuitive interface.
MozReview and Autoland:  MozReview now offers to publish review requests when you push, so it isn’t necessary to visit MozReview’s UI. Work has started on adding support for autoland-to-inbound, which will allow developers to push changes to inbound directly from MozReview… no more battling tree closures!
Performance: Work continues on Perfherder’s “comparison mode”, a view that compares Talos performance data between two revisions. See wlach’s blog post for more details.
 
TaskCluster Support: We’re helping Release Engineering migrate from Buildbot to TaskCluster; this quarter we’re standing up Linux tests in TaskCluster and getting OS X cross-compilation to work so that we can move those builds to the cloud.
BMO now has tests running in continuous integration using TaskCluster and reporting to Treeherder.
Mobile Automation: mochitest-chrome for Android is now live! Work is also underway to enable debug reftests on Android emulators, and significant reliability improvements have been landed in Autophone.
Desktop Automation: Work is in progress to get Thread Sanitizer (TSan) builds running on try and to split gTest into its own test chunk. We’re also working towards applying –run-by-dir to mochitest-plain, in order to improve isolation and enable smarter chunking in CI.
Developer Workflow: We’re adding test-selection flexibility to the reftest harness as a prelude to making ‘mach try’ work with more test types.

The Details

Bugzilla.mozilla.org
Treeherder/Automatic Starring
  • Work has started on backend work needed to support automatic starring, including db simplification, and db unification (so each tree doesn’t have its own database).  Bug 1179263 tracks this work.  As a side effect of this work, Treeherder code should become less complex and easier to maintain.
  • Work has started on identifying what needs to happen in order to turn off Bugzilla comments for intermittents, and to create an alternative notification mechanism instead.  Bug 1179310 tracks this work.
Treeherder/Front End
  • New shortcuts for Logviewer, Delete Classification plus improved classification save
  • Design work is in progress for collapsing chunks in Treeherder in order to reduce visual noise in bug 1163064
Perfherder/Performance Testing
  • Evaluating alerts generated from PerfHerder
  • Improvements to compare chooser and viewer inside of PerfHerder
  • Work towards building a new tab switching test (bug 1166132)
MozReview/Autoland
  • Automatic publishing of reviews upon pushing
  • Known bug: people using cookie auth may experience bug 1181886
  • Better error message when MozReview’s Bugzilla session has expired (bug 1178811)
  • Pruned user database to improve user searching (bug 1171274)
  • Work is progressing on autoland-to-inbound (bug 1128039)
TaskCluster Support
  • Ability to schedule Linux64 tests on try (tests not running yet due to a couple blockers) – bug 1171033
  • Working on OSX cross-compilation, which will allow us to move OSX builds to the cloud; this will make OSX builds much faster in CI.
Mobile Automation
  • Autophone detects USB lock-ups and gracefully restarts. This is a huge improvement in system reliability.
  • Continued work on getting Android Talos tests ported to Autophone (bug 1170685)
  • Updated manifests and mozharness configs for mochitest-chrome (bug 1026290)
  • Determined total-chunks requirements for Android 4.3 Debug reftests (bug 1140471)
  • Re-wrote robocop harness to significantly improve run-time efficiency (bug 1179981)
Dev Workflow
  • Helped RelEng resolve some problems that were preventing them from landing mozharness in the tree.  This opens the door to a lot of future dev workflow improvements, including better unification of the ways we run automated tests in continuous integration and locally.  We’ve wanted this for years and it’s great to see it finally happen.
  • Did some work on top of jgraham’s patch to make mach use mozlog structured logging
Platform QA
  • We had to respond to the breakup of .tests.zip into several files to keep our Jenkins instance running.
  • Getting firefox-media-tests to satisfy Tier-2 Treeherder visibility requirements involves changing how Treeherder accommodates non-buildbot jobs (e.g bug 1182299)
General Automation
  • Working on running multiple tests/manifests through reftests harness as a prelude for supporting |mach try| for more test types.
  • Created patch to move mozlog.structured to the top level package (and what was previously there to mozlog.unstructured)
  • Figured out the series of steps needed to produce a usable Thread Sanitizer enabled linux build on our infra
  • Separating out gTest into a separate job in CI – bug 1179955.
ActiveData
  • More memory optimizations (motivation: releng query for Chris Atlee:  query slow tests)
    • run staging environment as stability test for production
    • change etl procedure so pushing changes to prod are easier (moving toward standard procedure)
  • import treeherder data markup to active data (motivation: characterizing test failures)
    • ateam query: summary of test failures, stars and resolutions (bug 1161268, bug 1172048)
    • subtests are too large for download of more than one day – working on code to only pull what’s required

 

Mozilla A-Team: B2G Test Automation Update

This post describes the status of the various pieces of B2G test automation.

Jenkins Continuous Integration

We use a Jenkins instance to run continuous integration tests for B2G, using B2G emulators.  Unfortunately, this has been unable to run any tests for several weeks due to incompatibilities between the emulator and the headless Amazon AWS linux VM’s we have been running the CI on, which have arisen due to the work on hardware acceleration in B2G.  Michael Wu has identified a new VM configuration which does work (Ubuntu 12.04 + Xorg + xorg-video-dummy), and I’m busy switching our CI over to new VM’s of this configuration.  The WebAPI tests are already running again, and the rest will be soon.

As soon as tests are rolling again normally, those of us most closely involved in B2G test automation (myself, Malini Das, Andrew Halberstadt, and Geo Mealer) will institute some informal sheriffing on Autolog (a TBPL look-alike) to help keep track of test failures.  If you’d like to help with this effort, let me know.

Automation Stability

Our B2G test automation has gone down for weeks at a time on several occasions over the past few months.   Typically this has one of two causes:

  1. Changes to B2G which break the emulator.  These are identified fairly quickly, but can take a week or longer to resolve, as they require engineering resources that are busy with other things.  Now that B2G has reached “feature complete” stage, it may be that such breaking changes will be less frequent.  Usually, this kind of breakage prevents the emulator from launching successfully, rather than resulting in a build error.  To help identify these more quickly, I will write a simple “launch the emulator” test which gets performed after every build; if this test fails, it will automatically message the B2G mailing list.
  2. Changes to non-Marionette code in mozilla-central which break Marionette.  Typically these changes have occurred in the remote debugger, but we’ve also seen them with JS and browser code.  To address this, we’re working on getting Marionette unit tests in TBPL using desktop Firefox:  bug 770769.  Once these are live, changes which break Marionette will get caught by try or mozilla-inbound and won’t be allowed to propagate to mozilla-central where they end up breaking B2G CI.
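
A “launch the emulator” check along the lines described in item 1 could look something like the sketch below. It is a rough proxy under stated assumptions: it only checks that the launched process is still alive after a settling delay, the function name and the delay are hypothetical, and the step that messages the mailing list is left out.

```python
import subprocess
import sys
import time


def launches_successfully(command, settle_seconds=30.0):
    """Start `command` and report whether it is still running after a delay."""
    process = subprocess.Popen(command)
    try:
        time.sleep(settle_seconds)
        # poll() returns None while the process is still alive.
        return process.poll() is None
    finally:
        if process.poll() is None:
            process.terminate()
        process.wait()


if __name__ == "__main__":
    # Stand-in command: a Python process that sleeps, in place of the emulator.
    stand_in = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(launches_successfully(stand_in, settle_seconds=1.0))
```

In CI the command would be the emulator launch script instead of the stand-in, and a False result would trigger the notification. A process that stays alive is not proof of a working emulator, so a real check would probably also try to connect to it.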

Test Harness Status

WebAPI:  running again, 2 intermittent oranges: bug 760199 and bug 779217.

Mochitest:  will be running soon.  We’re currently only running the subset of tests that used to be run by Fennec.  We know we want to run all of them, but running all of them results in so many timeouts that the harness aborts.  We’ll need to spend some time triaging these.  We also know we want to change the way we run mochitests so that we can run them out-of-process: bug 777871.

XPCShell tests:  running locally with good results, thanks to Mihnea Balaur, an A-Team intern.  We will add them to the CI after mochitests.

Reftests:  Andrew Halberstadt has these running locally and is working to understand test failures (bug 773842).  He will get them running on a daily basis on a linux desktop with an Nvidia GPU, reporting to the same Autolog instance used by our Jenkins CI.  If we need more frequent coverage and running them on the Amazon VM’s would provide useful data, we can do that.  The reftest runner also needs to be modified so that it runs tests out-of-process: bug 778072.

Eideticker:  Malini Das is working to adapt William Lachance’s Eideticker harness to B2G.  This will be used to generate frame-rate data for inter- and intra-app transitions.  The testing will be performed on panda boards.  See bug 769167.

Other performance tests:  There are no plans at this time to port talos to B2G.  Malini has written a simple performance test using Marionette, which tracks the amount of time needed to launch each of the Gaia apps on an emulator.  This has suffered from the same emulator problems described above, and needs to be moved to a new VM.  This test currently reports to a new graphserver system called Datazilla, which isn’t in production yet.  Once it goes live, we’ll be able to analyze the data and see whether the current test provides useful data, and what other tests would be useful to write.

Gaia integration tests:  James Lal has recently added these.  I’ll hook these up to CI soon.

Panda Boards

The emulator is not an ideal test platform for several reasons, most notably poor performance and the fact that it doesn’t provide the real hardware environment that we care about.  But actual phones are often not good automation targets either; they tend to suffer from problems relating to networking, power consumption, and rebooting that make them a nightmare to deal with in large-scale automation.  Because of this, we’re going to target panda boards for test automation on real hardware.  This is the same platform that will be used for Fennec automation, so we can leverage a lot of that team’s work.

There are several things needed for this to happen; see bug 777530.  First, we need to get B2G panda builds in production using buildbot; we need to figure out how to flash B2G on pandas remotely; we need to adapt all the testrunners to work with the panda boards; and we need to write mozharness scripts for B2G unit tests, to allow them to work in rel-eng’s infrastructure.

For reftests, we also need to figure out “the resolution problem”:  the fact that we can’t set the pandas to a resolution that would allow the reftest window to be exactly 800×1000, which is the resolution that test authors assume when writing reftests.  Running reftests at other resolutions is possible, but we don’t know how many false passes we might be seeing, and analyzing the tests to try and determine this is laborious.

There are a lot of dependencies here, so I don’t have a very good ETA.  But when this work is done, we will transition all of testing to pandas on rel-eng infrastructure, except for the WebAPI tests which have been written specifically for the emulator.  This means the tests will show up on TBPL; they’ll be available on try; they will benefit from formal sheriffing. The emulator WebAPI tests will eventually be transitioned to rel-eng as well, if/when rel-eng starts making emulator builds.

Writing WebAPI tests for B2G using Marionette

At Mozilla, we have many different testing frameworks, each of which fills a different niche (although there is definitely some degree of overlap among them). For testing WebAPIs in B2G, some of these existing frameworks can be utilized, depending on the API. For example, mozSettings and mozContacts can be tested using mochitests, since there isn’t much, if anything, that’s device-specific to them. (We’re not currently running mochitests on B2G devices, but will be soon.)

But many other WebAPIs can’t be tested with any of our standard frameworks. Tests for them need to interact with hardware in interesting ways, and most of our frameworks are designed to operate entirely within a gecko context, so they have no ability to access hardware directly.

Malini Das and I have been working on a new framework called Marionette which can help. Marionette is a remote test driver, so it can remotely execute test steps within a gecko process while retaining the ability to interact with the outside world, including devices running B2G. When this is combined with the B2G emulator’s ability to query and set hardware state, we have a solution for testing a number of WebAPIs that would be difficult or impossible to test otherwise.

To illustrate how this works, I’m going to walk through the entire process of writing WebAPI tests for mozBattery and mozTelephony, to be run on B2G emulators. We already have such tests running in continuous integration, reporting to autolog. If developers add new Marionette WebAPI tests, they will be run and reported there as well. Eventually, they will likely be migrated over to TBPL.

Building the emulator

These tests will be run on the emulator, so you’ll have to build the B2G Ice Cream Sandwich emulator first, if you don’t have one already.  You’ll need to do this on Linux, preferably Ubuntu.  Make sure to install the build prerequisites before you begin, if you haven’t built B2G before.

git clone https://github.com/andreasgal/B2G
cd B2G
make sync             # get a cup of coffee, this takes quite a while
make config-qemu-ics  # get another cup of coffee
make gonk             # get another drink, but I think you've had enough coffee by now
make

You should now have an emulator, which you can launch using:

./emu-ics.sh

After you’ve verified the emulator is working, close it again.

Running a Marionette sanity test

Now we’ll run a single Marionette test to verify that everything is working as expected.  First, ensure that you have Python 2.7 on your system.  Then, install some prerequisites:

pip install manifestdestiny   # or: easy_install manifestdestiny
pip install mozhttpd          # or: easy_install mozhttpd
pip install mozprocess        # or: easy_install mozprocess

Now, from the directory where you cloned the B2G repo:

cd gecko/testing/marionette/client/marionette
python runtests.py --emulator --homedir /path/to/B2G/repo \
  tests/unit/test_simpletest_sanity.py

If everything has gone well, you should see something like the following:

TEST-START test_simpletest_sanity.py
test_is (test_simpletest_sanity.SimpletestSanityTest) ... ok
test_isnot (test_simpletest_sanity.SimpletestSanityTest) ... ok
test_ok (test_simpletest_sanity.SimpletestSanityTest) ... ok

----------------------------------------------------------------------
Ran 3 tests in 2.952s

OK

SUMMARY
-------
passed: 3
failed: 0
todo: 0

Writing a battery test

The B2G emulator allows you to arbitrarily set the battery level and charging state, by telnetting into the emulator’s console port and issuing certain commands.  Marionette has an EmulatorBattery class which abstracts these operations, and allows you to interact with the emulator’s battery using a very simple API.

A simple example is given in the EmulatorBattery documentation on MDN.  Save this example to a file named test_battery_example.py, and run this command:

python runtests.py --emulator --homedir /path/to/B2G/repo /path/to/test_battery_example.py

Marionette should launch an emulator and run the test; when it’s done you should see:

TEST-START test_battery_example.py
test_level (test_battery_example.TestBatteryLevel) ... ok

----------------------------------------------------------------------
Ran 1 test in 0.391s

OK

SUMMARY
-------
passed: 1
failed: 0
todo: 0

How it works

This test, like all Marionette Python tests, is written using Python’s unittest framework, which provides the assert methods used in the test.  Other methods used by the test are provided by the Marionette and EmulatorBattery classes.

When the test executes this line:

self.marionette.emulator.battery.level = 0.25

the EmulatorBattery class telnets into the emulator and sets the battery’s level.  We then read the level back (which invokes another telnet command) to verify that the emulator’s battery state was updated as expected.  And finally, we execute a snippet of JavaScript inside gecko:

moz_level = self.marionette.execute_script("return navigator.mozBattery.level;")

and verify that it returns the same battery level as the emulator is reporting directly.
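The set-then-read-back pattern described above can be sketched without an emulator. This is only an illustration, not Marionette's actual implementation: FakeEmulatorConsole and BatterySketch are hypothetical names, and the "power capacity" and "power display" commands are assumed to behave like the Android emulator console's power commands.

```python
import unittest


class FakeEmulatorConsole(object):
    """Hypothetical in-memory stand-in for the emulator's console port."""

    def __init__(self):
        self.capacity = 50  # battery percentage, 0-100

    def run(self, command):
        # Assumed command names, modeled on the Android emulator console.
        parts = command.split()
        if parts[:2] == ["power", "capacity"] and len(parts) == 3:
            self.capacity = int(parts[2])
            return ["OK"]
        if parts == ["power", "display"]:
            return ["capacity: %d" % self.capacity, "OK"]
        raise ValueError("unsupported command: %r" % command)


class BatterySketch(object):
    """Hypothetical wrapper exposing the battery level as a 0.0-1.0 property."""

    def __init__(self, console):
        self.console = console

    @property
    def level(self):
        # Reading the property issues a console command and parses the reply.
        for line in self.console.run("power display"):
            if line.startswith("capacity:"):
                return int(line.split(":")[1]) / 100.0
        raise RuntimeError("no capacity line in console output")

    @level.setter
    def level(self, value):
        # Assigning to the property issues a console command as a side effect.
        self.console.run("power capacity %d" % round(value * 100))


class TestBatterySketch(unittest.TestCase):
    def test_level(self):
        battery = BatterySketch(FakeEmulatorConsole())
        battery.level = 0.25                    # set via the console
        self.assertEqual(battery.level, 0.25)   # read back via the console
```

In a real Marionette test, the final step would compare this value against what navigator.mozBattery.level reports inside gecko; the sketch stops at the console round trip because it has no gecko process to query.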

More tests with hardware interaction

In addition to battery interaction, the B2G emulator allows you to query and set the state of other properties normally set by hardware, like GPS location, network status, and various sensors.  Tests for all these could be written in a similar way.  It probably makes sense to make classes for these similar to EmulatorBattery which abstract the details of getting and setting the state of the underlying hardware.  I would encourage WebAPI developers to add as many WebAPI tests as possible; if you would like us to add convenience classes, please ping us on IRC (jgriffin and mdas, on #ateam or #b2g) or file a bug under Testing:Marionette.

Multi-emulator tests

There are some WebAPIs which cannot be completely tested using a single device or emulator, like telephony and SMS.  Marionette can help with these too, as Marionette can be used to manipulate two emulator instances which are capable of communicating with each other.

In any tests run with the --emulator switch, Marionette launches an emulator before running the tests, and this emulator is associated with an instance of the Marionette class available to the test as self.marionette. Tests can invoke a second emulator instance using self.get_new_emulator(), and these emulator instances can call and text each other using their port numbers as their phone numbers.
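The port-number-as-phone-number convention can be modeled with a toy example. Everything here is hypothetical: FakeEmulator is not Marionette's Emulator class, the shared dictionary merely stands in for whatever lets the emulators reach each other, and the port values are illustrative.

```python
class FakeEmulator(object):
    """Hypothetical stand-in for an emulator instance."""

    def __init__(self, port, network):
        self.port = port        # doubles as this emulator's phone number
        self.incoming = []      # caller numbers seen by this emulator
        self._network = network
        network[port] = self    # register so other emulators can reach us

    def dial(self, number):
        # Route the call to whichever emulator is registered under that number.
        receiver = self._network[number]
        receiver.incoming.append(self.port)


network = {}
sender = FakeEmulator(5554, network)    # plays the role of self.marionette's emulator
receiver = FakeEmulator(5556, network)  # plays the role of self.get_new_emulator()

# The sender dials the receiver's port; the receiver sees the sender's port
# as the caller's number.
sender.dial(receiver.port)
```

A real test would do the dialing through mozTelephony inside gecko on one emulator and check the incoming call's number on the other; the toy model only shows which number ends up where.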

To illustrate how this works, Malini has written an example test in which one emulator is used to dial another, and the caller’s number is verified on the receiver. See this example at https://developer.mozilla.org/en/Marionette/Marionette_Python_Tests/Emulator_Integrated_Tests#Manage_Multiple_Emulators.

If you save this example to test_dial_example.py and run the command:

python runtests.py --emulator --homedir /path/to/B2G/repo /path/to/test_dial_example.py

you should see Marionette launch one emulator, and then after it starts execution of the test, you should see a second emulator instance launch. After the test is done, you should see a successful report, similar to the one shown for the battery test.

We currently have a few tests for mozTelephony, but many more could be added, and new tests should be added for SMS/MMS as well.

Adding new tests to the B2G continuous integration

When new tests are ready to be added to the CI, they should be checked into gecko under their dom component, e.g., dom/telephony/test/marionette. They should be added to the manifest.ini file in the same directory, and then for new manifest.ini files, the path to the .ini file should be added to the master manifest at http://mxr.mozilla.org/mozilla-central/source/testing/marionette/client/marionette/tests/unit-tests.ini. After this is done, the tests should be picked up by the B2G CI once the gecko fork of B2G is updated, and they will be reported along with the other tests to autolog.
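To make the two-level manifest layout concrete, here is a small sketch that reads section headers from a per-directory manifest and a master manifest. The test file names are hypothetical, and the "[include:...]" syntax is an assumption about the manifest format; check the real unit-tests.ini before copying it.

```python
import re

# Hypothetical per-directory manifest, e.g. dom/telephony/test/marionette/manifest.ini
COMPONENT_MANIFEST = """\
[test_dial_example.py]
[test_incoming_example.py]
"""

# Hypothetical excerpt of the master manifest; the include syntax is assumed.
MASTER_MANIFEST = """\
[test_simpletest_sanity.py]
[include:../../../../../dom/telephony/test/marionette/manifest.ini]
"""

SECTION = re.compile(r"^\[(.+)\]\s*$")


def read_sections(text):
    """Return the section names found in a manifest, in file order."""
    names = []
    for line in text.splitlines():
        match = SECTION.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


def split_master(text):
    """Separate directly listed tests from included manifests."""
    tests, includes = [], []
    for name in read_sections(text):
        if name.startswith("include:"):
            includes.append(name[len("include:"):])
        else:
            tests.append(name)
    return tests, includes
```

The point of the layout is that a component's tests live next to its code, and the master manifest needs only one new line per component rather than one per test.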

Caveats, provisos, and miscellanea

B2G builds go to sleep after 60 seconds of inactivity.  In the emulator, this “sleep” will completely lock up Marionette if it occurs while a test is running.  This is very inconvenient while testing.  See bug 739476. Until some better mechanism of handling this is available, I usually edit gecko/b2g/apps/b2g.js to increase the value of the power.screen.timeout pref before building, to prevent the emulator from going to sleep.

The current test failures in autolog are being tracked as bug 751403 and bug 751406.

Network access in the emulator currently doesn’t seem to work (see https://github.com/andreasgal/B2G/issues/287).  This prevents some parts of Gaia from working correctly but doesn’t interfere with the above style of WebAPI tests, none of which rely on Gaia or network access.

Building the emulator is very time-consuming, mostly due to the time required to sync all the various repos needed by B2G.  We hope to be able to post emulator builds for download soon, after a few details are worked out.

More reading

What is Marionette

Marionette Python tests

Marionette Emulator tests

the Marionette class

the Emulator class

Please contribute tests

There are many WebAPIs which are less tested than they could be.  Please help us expand test coverage by contributing tests in areas similar to those described above.  If you need help, contact :jgriffin or :mdas on IRC, or file a bug under Testing:Marionette.