
Important Learnings from the Applied Machine Learning Days 2018

I attended the second Applied Machine Learning Days at EPFL in Lausanne this year. The event is organized by Marcel Salathe and his team at the Digital Epidemiology lab, where they apply machine learning to uncover the dynamics of health and disease in human populations.

Just like last year’s event, the conference balanced sessions on the use of ML (Machine Learning) in various organizations, a framework highlight, a spotlight on crowdAI winners, and discussion panels. Unlike last time, however, a number of hands-on workshops also took place over the weekend preceding the two-day conference.

At Ness, we come across many organizations expressing interest in applied machine learning. It was therefore interesting to hear organizations such as Cisco, Google, Swisscom and Bühler Group share their experiences.

Jeremiah Harmsen of Google leads a team advancing the use of ML across Google teams. Its activities include ML assignments, education and contributions to tooling. Google is a special case, in the sense that its products enjoy very large cohorts of users, which often leads to unusual requirements. (Note: non-GAFA (Google, Apple, Facebook and Amazon) companies eyeing one of their tools or practices tend to overlook this.) Smart text selection on Android uses ML to help users drop the infamous selection pins in the right place: an algorithm predicts a meaningful group of words on the user’s behalf. Another application saves smartphone battery for the “Now Playing” feature, which runs in the background so that a song can be identified at any time. It was interesting to hear how Jeremiah’s team disseminates ML knowledge through an “ML ninja rotation” programme across various products at Google. They also organize TensorFlow classes as well as basic machine learning courses, both inside and outside of Google. On the tooling front, the team assists with practical questions such as “how many layers should I use in my neural network?” or “what dropout rate should I pick?” He elaborated on a technique called “Wide & Deep”, which they use to help engineers come up with an effective structure. The technique involves finding a balance between a shallow perceptron-style model (the wide part) and a deeper model with fully connected layers (the deep part).
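To make the “Wide & Deep” idea a little more concrete, here is a minimal sketch of the forward pass only, written in plain Python rather than TensorFlow. It is not Google’s implementation: the class name, layer sizes and random (untrained) weights are assumptions chosen purely for illustration. The point it shows is that the wide (linear) part and the deep (fully connected) part each produce a logit, and the two are summed before the final sigmoid.

```python
import math
import random


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative, untrained)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class WideAndDeep:
    """Hypothetical toy model: linear 'wide' part plus MLP 'deep' part."""

    def __init__(self, n_wide, n_deep, hidden_sizes, seed=0):
        rng = random.Random(seed)
        # Wide part: one weight per (typically sparse or crossed) feature.
        self.wide_w = [rng.uniform(-0.1, 0.1) for _ in range(n_wide)]
        # Deep part: a stack of fully connected hidden layers.
        sizes = [n_deep] + list(hidden_sizes)
        self.layers = [init_layer(n_in, n_out, rng)
                       for n_in, n_out in zip(sizes, sizes[1:])]
        # Final deep layer maps the last hidden layer to a single logit.
        self.out_w, self.out_b = init_layer(sizes[-1], 1, rng)
        self.bias = 0.0

    def predict(self, wide_x, deep_x):
        """Return a probability from the summed wide and deep logits."""
        wide_logit = sum(w * x for w, x in zip(self.wide_w, wide_x))
        hidden = deep_x
        for weights, biases in self.layers:
            hidden = relu(dense(hidden, weights, biases))
        deep_logit = dense(hidden, self.out_w, self.out_b)[0]
        logit = wide_logit + deep_logit + self.bias
        return 1.0 / (1.0 + math.exp(-logit))
```

Calling `WideAndDeep(3, 4, [8, 4]).predict([1, 0, 1], [0.5, -0.2, 0.1, 0.9])` returns a probability between 0 and 1. In a real setting both parts would be trained jointly, and choosing how much capacity to give each side is the balancing act described above.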

GAFAs are clearly pushing innovation in ML research and tools. Facebook’s Soumith Chintala presented advances in PyTorch, one of the leading deep learning frameworks, to accommodate recent trends in dynamic deep learning. However, the event showed how other organizations are also finding ways to make use of these techniques. Swisscom, Switzerland’s leading mobile operator, presented applications of NLP (Natural Language Processing) and natural language understanding for business applications. Where Google has data coming out of its ears, other organizations commonly struggle to gather enough data for ML to succeed, as Swisscom showed. To compensate for this lack of data, Claudiu Musat established a cooperation with academia and works with students to devise clever techniques for various portions of the pipeline. See for example EmbedRank or unsupervised aspect term extraction with B-LSTM and CRF.

Cisco’s approach to promoting ML across the enterprise resonates very well with Ness’s own view of imagining futures through user-centric thinking. The organization recognizes the need to identify benefits to digital transformation efforts and considers three key elements in delivering solutions which leverage ML:

  • Scale in order to have meaningful impact
  • Build bridges across the multiple stakeholders of such solutions
  • Ask the hard questions early

If you are familiar with Ness’s Connected approach, this should sound familiar.

Alison Michan of Bühler Group talked about the uses of ML in optical sorting machines for the food industry. Among the many solutions provided by the Swiss leader in equipment for food processing and advanced materials manufacturing, optical sorting ejects impurities, for example in rice production, by using cameras and precise air compressors. The image processing algorithms that perform the sorting support color-based as well as shape-based sorting. Such machines are complex and require delicate calibration, which is often manual. This can lead to overfitting, and ML techniques can help attenuate this effect while reducing the setup time. As we see with other makers of very complex machines, Bühler Group also applies ML to predictive maintenance of its equipment, in cooperation with the Swiss Data Science Center. The use of ML in manufacturing is very exciting for Ness as well because it helps more partners to build bridges between domain experts, IT and data science teams. On that note, I want to mention that Daniel Whitenack of Pachyderm.io ran a workshop over the weekend and also attended the event. As organizations continue to buy into the potential of ML, they will recognize the need to make Python/Scala notebooks and training/dev sets available to more and more teams. In a way, notebooks are the “new Excel”, but they require a more advanced infrastructure, which we address in our discovery and envision workshops.

Applied ML Days is also historically tied to crowdAI contests. The platform is comparable to Kaggle but open, and last year it hosted reinforcement learning competitions in which agents are trained to walk and run. Agents are represented by a musculoskeletal model inside an imposed physics-based environment. The winner of Learn how to Walk presented his approach as well as a project born from his effort in tuning model hyperparameters: eschernode. It’s an online solution which helps explore the solution space in a more visual and convenient manner. The winners of Learn how to Run presented their approach, based on proximal policy optimization, for solving the perception problem (arXiv paper on various DRL techniques). Perception is a complex problem which ties in somewhat with Ashby’s Law of Requisite Variety: the agent’s model of its environment must be balanced with the control architecture it has to affect behaviour and change in that environment. The Learn how to Run team involved six people and 240K CPU-hours over a period of 3 months. Not for the faint-hearted!
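For readers unfamiliar with proximal policy optimization, the sketch below shows the clipped surrogate objective at its core, as published in the original PPO paper. It is not the winning team’s code, whose details were not covered here; the function name, argument names and the default clipping value of 0.2 are assumptions for illustration.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages,
                          clip_eps=0.2):
    """Average clipped surrogate objective over a batch of actions.

    new_log_probs / old_log_probs: log-probability of each taken action
    under the current policy and under the policy that collected the data.
    advantages: estimated advantage of each action.
    """
    terms = []
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the two estimates, which removes
        # the incentive to move the policy too far in a single update.
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)
```

In training, this quantity is maximized with respect to the policy parameters. When the new and old policies agree, the ratio is 1 and the objective reduces to the mean advantage.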

While last year’s event put more emphasis on deep learning, which is one of the ML families that has attracted the most talent in the past years, it was refreshing to hear Christopher Bishop of Microsoft remind us of the power of Probabilistic Graphical Models (PGMs). PGMs are powerful mathematical tools for expressing knowledge about the world in the form of graphs of random variables, which are typically extended to include the parameters that govern the distributions of those variables. Chris reminded us how PGM folks can formulate problems as graphs expressing the relationships between variables and [hidden] parameters, and from there arrive at common methods such as Principal Component Analysis (a common dimensionality reduction technique) or the Kalman Filter (used e.g. for active safety in automotive). It was also an opportunity for Chris to promote his most recent book, Model-Based ML Book with Thomas Diethe, which can be found online at http://mbmlbook.com/ (work-in-progress).
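As a small illustration of one of the methods mentioned above, here is a sketch of a one-dimensional Kalman filter. It assumes the simplest possible model: a hidden value that drifts as a random walk and is observed with Gaussian noise. The function name, parameter names and default values are illustrative and do not come from the talk.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Filter noisy scalar measurements of a slowly drifting hidden value.

    process_var: variance of the random-walk drift between two steps.
    measurement_var: variance of the sensor noise.
    x0, p0: initial state estimate and its variance.
    Returns the list of state estimates, one per measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed unchanged, but uncertainty grows.
        p = p + process_var
        # Update: the Kalman gain weighs the prediction against the sensor.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates
```

Feeding the filter a constant reading, for example `kalman_1d([5.0] * 50, process_var=1e-4, measurement_var=1.0)`, yields estimates that move steadily from the initial guess towards 5. In graphical-model terms, this is inference in a chain of hidden states, each linked to one observation.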

SwissTech Convention Center (Source: EPFL Events)

I will finish with a few words on Raia Hadsell’s talk on deep reinforcement learning at DeepMind and on the panel discussion. You have perhaps watched the AlphaGo documentary on Netflix. Reinforcement learning differs from the more common supervised learning techniques in that it tackles problems where there are no labelled training sets. In such problems, an action leads to a reward in a changing context, and there is no label known in advance against which to optimize actions. This leads to different cost functions. DeepMind has applied this to playing Atari games, to teaching an agent to win at chess without databases of openings or endgames, and to winning at Go. Raia came to talk about the use of end-to-end DRL in robotics. Robots present a more challenging setting to the DeepMind team, with tight feedback loops between a machine with multiple degrees of freedom and its environment. They are specifically exploring how to deal with multiple tasks, how to learn efficiently without having to go through hundreds of millions of moves, how to learn from real data and how to deal with continuous control.

From there, the panel discussion with Marcel Salathe, Chris Bishop, Raia Hadsell, Joanna Bryson and Martin Vetterli (president of the EPFL) touched on fascinating topics. I will simply highlight a few observations. Chris advocated promoting the benefits of ML to balance the FUD that somewhat permeates the wider public discourse. Joanna Bryson (whom I have not mentioned yet; she works on extremely interesting ethical aspects of ML, e.g. The Legal Lacuna of Synthetic Persons) argued for the importance of regulation and pointed out the pitfalls of disintermediating humans in certain decision loops. Raia pointed to the mismatch between societal problems and those that ML talent gets to work on. These are key remarks as Ness expands its partnerships around the way people and machines interact, as well as how people do business together.

ML is bound to find its way into our solutions, and the onus is on us to find net-positive uses of these powerful techniques for the people involved.