I attended the first edition of the “Applied Machine Learning Days”, which took place at the École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland.
The event comprised two days of talks and tutorials on machine learning and artificial intelligence, organized by Prof. Marcel Salathé and his team at the Digital Epidemiology Lab. It gathered 450 academics and practitioners around a variety of topics, including healthcare, satellite image processing, social sciences, telecommunications, and online media.
It was also an opportunity to announce the winners of challenges hosted on crowdAI, an exciting open platform for machine learning competitions.
I am sharing some interesting highlights from this dense and well-organized event.
Pete Warden of Google presented TensorFlow for Poets. His project, inspired by a tutorial on EC2, allows non-experts in deep learning to get up and running quickly with one of the leading frameworks for building neural networks, such as convolutional and recurrent networks. During his presentation, Pete highlighted the benefits of transfer learning. One popular application of TensorFlow is image recognition, and with transfer learning, users can start from a pre-trained network that already knows about general image features and retrain only its final layer for their own categories. He leverages Docker, a popular container technology we frequently come across in our software labs.
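To make the transfer-learning idea concrete, here is a minimal NumPy sketch (not the actual TensorFlow for Poets code): a frozen "pretrained" feature extractor stands in for a network like Inception, and only a new logistic-regression head is trained on top of it. The data and weights are synthetic, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained network: a frozen projection mapping raw
# inputs to feature vectors. In TensorFlow for Poets this role is played
# by a large image network minus its final classification layer.
W_frozen = rng.standard_normal((64, 8))

def features(x):
    # Frozen feature extractor: these weights are never updated.
    return np.maximum(x @ W_frozen, 0.0)  # ReLU features

# Tiny synthetic two-class dataset (hypothetical).
X = rng.standard_normal((200, 64))
y = (X[:, 0] > 0).astype(float)

# Transfer learning: train ONLY the new logistic-regression head
# on top of the frozen features, with plain gradient descent.
w = np.zeros(8)
b = 0.0
F = features(X)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))   # sigmoid predictions
    w -= 0.5 * (F.T @ (p - y) / len(y))      # gradient step on head only
    b -= 0.5 * np.mean(p - y)

preds = (1.0 / (1.0 + np.exp(-(F @ w + b)))) > 0.5
accuracy = np.mean(preds == y)
```

Because only the small head is trained, this converges quickly even with little data, which is exactly why the approach suits non-experts.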
Swisscom, the leading mobile operator in Switzerland, presented its incident-detection toolbox, which caters to its various business units, including the core network and IPTV. It provides an integrated set of tools around notification (what is the signal?), incident routing (classification), and incident prediction, helping, for example, with agent onboarding by finding similar tickets, or supporting IPTV services with sentiment analysis.
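Finding similar tickets can be sketched very simply; the following is my own illustration (not Swisscom's system), using bag-of-words vectors and cosine similarity over made-up incident tickets.

```python
from collections import Counter
import math

def cosine_similarity(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def most_similar(query: str, tickets: list[str]) -> str:
    # Return the historical ticket closest to the new incident text.
    q = Counter(query.lower().split())
    return max(tickets,
               key=lambda t: cosine_similarity(q, Counter(t.lower().split())))

# Hypothetical incident tickets, for illustration only.
tickets = [
    "IPTV stream freezes on channel change",
    "core network latency spike in region west",
    "customer cannot log in to portal",
]
match = most_similar("IPTV freezes after changing channel", tickets)
```

A production system would use richer features (TF-IDF, embeddings) and metadata, but the retrieval principle is the same.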
An interesting effort, developed by Prof. Salathé’s lab in cooperation with Wikimedia, focuses on recommending missing content on Wikipedia across languages. The team is experimenting with GapFinder, a web application that lets you find topics typically not covered in non-English versions of the online encyclopedia. It is still a prototype, but it will help contributors add missing pages to this fabulous resource.
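At its core, finding content gaps starts from comparing which topics each language edition covers. The sketch below is a deliberately naive illustration with hypothetical titles (GapFinder itself works on real Wikipedia data and ranks candidates for each reader):

```python
# Hypothetical article titles per language edition (illustration only).
articles = {
    "en": {"Machine learning", "Deep learning", "Epidemiology", "Lausanne"},
    "fr": {"Machine learning", "Lausanne"},
    "de": {"Machine learning", "Epidemiology"},
}

def missing_in(target: str, source: str = "en") -> set[str]:
    # Topics covered in the source edition but absent from the target one.
    return articles[source] - articles[target]

gaps_fr = sorted(missing_in("fr"))
```

Here `gaps_fr` lists the English-edition topics that the French edition lacks, which is the raw material a recommender can then rank for contributors.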
Juergen Schmidhuber, a pioneer in neural network research and applications, presented some milestones taken from his paper on the history of deep learning. He started with Long Short-Term Memory (LSTM), the architecture at work when you summon your Android phone with “OK Google”. He also paid tribute to early pioneers of self-driving technology in the 1980s, such as Ernst Dickmanns.
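For readers curious what an LSTM actually computes, here is one time step of the standard cell in NumPy, with gates that decide what to remember, forget, and emit (a textbook formulation; Google's speech models are of course far larger).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, b):
    """One LSTM time step. W stacks the weights of the input, forget and
    output gates and the cell candidate, shape (4*n, n + d); b is (4*n,)."""
    n = h.shape[0]
    z = W @ np.concatenate([h, x]) + b
    i = sigmoid(z[0 * n:1 * n])      # input gate: what to write
    f = sigmoid(z[1 * n:2 * n])      # forget gate: what to keep
    o = sigmoid(z[2 * n:3 * n])      # output gate: what to emit
    g = np.tanh(z[3 * n:4 * n])      # candidate cell content
    c_new = f * c + i * g            # cell state carries long-term memory
    h_new = o * np.tanh(c_new)       # hidden state / output
    return h_new, c_new

# Run a short sequence through the cell with small random weights.
rng = np.random.default_rng(1)
n, d = 4, 3
W = rng.standard_normal((4 * n, n + d)) * 0.1
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
for _ in range(5):
    h, c = lstm_step(rng.standard_normal(d), h, c, W, b)
```

The additive cell-state update `f * c + i * g` is what lets gradients flow over long sequences, the key insight behind LSTM's success in speech recognition.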
I was also very excited to hear probabilistic graphical models addressed by Prof. Marloes Maathuis in her work on estimating causal effects from observational data. Probabilistic graphical models form a rich toolset combining data structures from computer science, such as directed acyclic graphs, with probability theory, in particular conditional probabilities, the chain rule, and Bayes' rule. This line of work, started by the computer scientist and philosopher Judea Pearl in the eighties, finds applications in machine learning, including virtual assistants, guidance systems, and image processing.
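The chain rule and Bayes' rule on a directed acyclic graph can be shown in a few lines. Here is a tiny two-parent network (Rain, Sprinkler → WetGrass) with made-up probabilities, computing a posterior by enumeration:

```python
# Made-up parameters for a toy Bayesian network.
P_rain = 0.2
P_sprinkler = 0.3
# P(WetGrass = true | Rain, Sprinkler)
P_wet = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}

def joint(rain: bool, sprinkler: bool, wet: bool) -> float:
    # Chain rule along the DAG: P(R, S, W) = P(R) * P(S) * P(W | R, S).
    pr = P_rain if rain else 1 - P_rain
    ps = P_sprinkler if sprinkler else 1 - P_sprinkler
    p_w_true = P_wet[(rain, sprinkler)]
    pw = p_w_true if wet else 1 - p_w_true
    return pr * ps * pw

# Bayes' rule by enumeration: P(Rain = true | WetGrass = true).
num = sum(joint(True, s, True) for s in (True, False))
den = sum(joint(r, s, True) for r in (True, False) for s in (True, False))
posterior = num / den
```

Observing wet grass raises the belief in rain from the prior 0.2 to roughly 0.49, which is exactly the kind of evidence propagation these models formalize.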
The social network giant Facebook also presented fastText, a key library for understanding text. Facebook's data scientists are working on pre-trained word representations (in the spirit of word2vec), on scaling the library to large amounts of text, and on compressing fastText's data structures to small footprints suitable for phones and microcontrollers.
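A distinctive idea in fastText is representing a word by its character n-grams, so rare and misspelled words still get useful vectors. A small sketch of the n-gram extraction (my own simplified version, not the library's C++ code):

```python
def char_ngrams(word: str, n_min: int = 3, n_max: int = 6) -> list[str]:
    """Subword character n-grams in the fastText style: the word is padded
    with boundary markers '<' and '>', all n-grams of length n_min..n_max
    are extracted, and the whole padded word is added as well."""
    padded = f"<{word}>"
    grams = [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]
    return grams + [padded]

# For "where" with n = 3..4 this yields <wh, whe, her, ere, re>,
# the 4-grams, and the special whole-word token <where>.
grams = char_ngrams("where", 3, 4)
```

A word's vector is then the sum of its n-gram vectors, which is also what makes the model amenable to aggressive compression.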
Google talked about the research agenda of its largest R&D lab outside of the US, based in Switzerland. Google is working hard on natural language understanding, constructing a model of the world along with prior beliefs that allows the computer to notice that a cow on top of an airplane is not realistic and probably meant as a joke.
Additionally, Animashree Anandkumar of AWS, a long-time advocate of tensor mathematics in machine learning, presented MXNet and its use on P2 instances in Amazon's cloud.
In general, there is a strong desire in the machine learning community to make these approaches more accessible to all. One of the talks also addressed the challenges of distributed computing, proposing an alternative to problems typically solved via gradient descent that is better suited to large clusters of parallel workers. The CoCoA project (communication-efficient distributed coordinate ascent) has also caught the attention of the Apache Flink community, with an SVM implementation based on it.
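The core intuition behind communication-efficient methods is that workers should do substantial local work between (expensive) communication rounds. The simulation below illustrates that idea with local gradient steps and model averaging on a least-squares problem; note this is a simplified stand-in, not CoCoA's actual dual-subproblem formulation.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n_workers, n_per = 5, 4, 50
w_true = rng.standard_normal(d)

# Each worker holds its own noiseless data shard (synthetic, for illustration).
shards = []
for _ in range(n_workers):
    X = rng.standard_normal((n_per, d))
    shards.append((X, X @ w_true))

w = np.zeros(d)
for _ in range(20):                      # communication rounds
    local_models = []
    for X, y in shards:
        w_local = w.copy()
        for _ in range(10):              # cheap local steps, no communication
            grad = X.T @ (X @ w_local - y) / n_per
            w_local -= 0.1 * grad
        local_models.append(w_local)
    w = np.mean(local_models, axis=0)    # one communication: average models

error = np.linalg.norm(w - w_true)
```

Compared with communicating after every single gradient step, this trades a little per-round progress for a large reduction in network traffic, which is the regime CoCoA targets on large clusters.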
This was a terrific event! I am glad some of the talks touched on the data plumbing and infrastructure aspects of machine learning, which can be a huge impediment for scientists eager to experiment and iterate rapidly in a continuous-delivery mode, something that remains top of mind for us at Ness. Next year's event has already been announced, and I already look forward to attending!