
Future-proof Your AI at the Edge with AWS

In the rapidly evolving field of IoT across the manufacturing and transportation domains, machine learning features create significant value. By applying machine learning at the edge, manufacturers can gather insights faster, identify trends and patterns, and detect anomalies, all resulting in enhanced security, safety, and cost savings.


Bringing the Edge to Life: 3 Key Elements of Continuous Machine Learning in the IoT

The Internet of Things (IoT) has revolutionized how we interact with technology, connecting countless devices and generating massive amounts of data. To unlock the full potential of IoT, continuous machine learning (ML) on the edge has emerged as a game-changer. Organizations can leverage real-time insights and make more informed decisions by bringing intelligence closer to the data source. This blog will explore three key elements that drive continuous ML on the edge of the IoT ecosystem.

The IoT Edge seamlessly integrates with ML frameworks, enabling developers to leverage predictive capabilities at the network’s edge. Artificial intelligence computing can be incorporated into the IoT Edge for real-time analysis and decision-making, reducing latency and optimizing network resource utilization. It expands possibilities for intelligent and adaptive IoT applications, unlocking the full potential of continuous machine learning in the IoT ecosystem.

Let’s look at the three key elements of implementing machine learning on the IoT Edge:

Machine Learning Model Design

At the heart of continuous machine learning on the edge is the ability to analyze vast amounts of data and design the most suitable ML model. With the diverse range of IoT devices and the heterogeneity of the data they generate, effective data analysis becomes crucial. It involves preprocessing, feature engineering, and data visualization to gain meaningful insights.

Once the data is understood, selecting the right ML model, such as decision trees, support vector machines, or neural networks, becomes essential. The utilization of edge artificial intelligence in the context of the IoT opens a wide range of possibilities and use cases:

  • Real-time anomaly detection – ML models deployed on edge devices can continuously monitor sensor data and identify abnormal patterns or behaviors. This enables rapid detection of anomalies, such as equipment malfunctions or security breaches, and timely interventions.
  • Predictive maintenance – AI algorithms running on the edge can analyze sensor data to predict equipment failures or maintenance needs. By identifying potential issues in advance, organizations can optimize maintenance schedules, reduce downtime, and improve operational efficiency.
  • Image recognition – Real-time analysis and identification of objects, scenes, or patterns enables instant decision-making with privacy preservation. This makes it ideal for applications such as autonomous vehicles, surveillance systems, or industrial quality control.
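To make the anomaly-detection idea concrete, here is a minimal sketch of a rolling-statistics detector that an edge device might run over a sensor stream. The window size, threshold, and sample readings are illustrative assumptions, not a production design:

```python
from collections import deque
import math

def detect_anomalies(readings, window=5, threshold=3.0):
    """Flag readings that deviate from the rolling mean of the last
    `window` values by more than `threshold` standard deviations."""
    history = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(readings):
        if len(history) == window:
            mean = sum(history) / window
            var = sum((x - mean) ** 2 for x in history) / window
            std = math.sqrt(var)
            if std > 0 and abs(value - mean) / std > threshold:
                anomalies.append(i)
        history.append(value)
    return anomalies

# A stable temperature signal with one sudden spike.
sensor_data = [20.0, 20.1, 19.9, 20.2, 20.0, 20.1, 35.0, 20.0]
spikes = detect_anomalies(sensor_data)
```

A real deployment would tune the window and threshold per sensor, but the shape of the computation stays the same: cheap local statistics, no round trip to the cloud.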

Once the sensory data has been analyzed, the next step is ML model training, which is critical in teaching the model to recognize patterns and make predictions based on input data. In continuous machine learning for IoT, the training process typically begins with a dataset comprising input features and corresponding target values. The basic steps of ML model training include:

  • Splitting the dataset into training and validation sets, allowing evaluation of the model’s performance on unseen data.
  • Selecting an appropriate ML algorithm or model architecture for the specific problem.
  • Starting the training process with random parameters, then iteratively updating them to minimize a predefined loss function, which measures the discrepancy between the model’s predictions and the true target values.
  • Repeating the training process for a defined number of iterations or until a convergence criterion is met.
  • Evaluating the trained ML model on the validation set to assess its generalization performance. If the model meets the desired performance criteria, it can be deployed for inference and prediction tasks on new, unseen data.
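The steps above can be sketched end to end with a toy model. This is a minimal illustration only, assuming plain gradient descent on a one-feature linear model and synthetic data, not a production training pipeline:

```python
import random

def train_linear_model(xs, ys, lr=0.05, epochs=500, val_fraction=0.25):
    """Fit y = w*x + b by gradient descent on a mean-squared-error loss,
    holding out a fraction of the data for validation."""
    data = list(zip(xs, ys))
    random.seed(0)                 # deterministic for the example
    random.shuffle(data)
    split = int(len(data) * (1 - val_fraction))
    train, val = data[:split], data[split:]   # step 1: train/validation split

    w, b = random.random(), random.random()   # step 3: random initial parameters
    for _ in range(epochs):                   # step 4: iterate until done
        grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
        grad_b = sum(2 * (w * x + b - y) for x, y in train) / len(train)
        w -= lr * grad_w
        b -= lr * grad_b

    # Step 5: evaluate generalization on the held-out validation set.
    val_loss = sum((w * x + b - y) ** 2 for x, y in val) / len(val)
    return w, b, val_loss

# Synthetic data following y = 2x + 1
xs = [i / 10 for i in range(20)]
ys = [2 * x + 1 for x in xs]
w, b, val_loss = train_linear_model(xs, ys)
```

Real edge workloads would use a framework and a richer model (step 2 above), but the split/initialize/iterate/evaluate loop is the same.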


Figure 1: Inference on the Edge


Lightweight ML inference framework

Efficient real-time ML inference on edge devices requires a lightweight model inference framework. Advances in techniques like model compression, quantization, and knowledge distillation have enabled the development of optimized models for ML edge deployment. These lightweight models balance complexity and performance, offering low latency, reduced memory footprint, and energy-efficient ML inference at the edge.
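As a rough illustration of what quantization buys, the sketch below maps float weights to int8 values using a single per-tensor scale factor. Real frameworks use more sophisticated schemes (per-channel scales, calibration data), so treat this as a conceptual sketch only:

```python
def quantize_int8(weights):
    """Symmetric post-training quantization: map float weights to int8
    using one per-tensor scale factor (4x smaller than float32)."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # fall back for all-zero tensors
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.9, -0.07]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Each restored value lies within half a quantization step of the original.
```

The accuracy cost is bounded by the quantization step, which is why int8 models can stay close to float accuracy while cutting memory and energy use on edge hardware.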

Several popular frameworks, specifically designed to facilitate efficient deployment and execution of machine learning models on edge devices, are available. The main ones include:

  • TensorFlow Lite – A lightweight version of the popular TensorFlow framework optimized for mobile and edge devices. It provides tools and libraries for deploying and running ML models on resource-constrained edge devices.
  • Amazon SageMaker Neo – Optimizes and compiles machine learning models to run efficiently on various edge devices. By leveraging Neo, developers can achieve high-performance inference with a reduced memory footprint and lower latency.
  • ONNX Runtime – The Open Neural Network Exchange (ONNX) Runtime is a high-performance inference engine that supports running ML models exported from various frameworks. It offers efficient execution on edge devices and supports multiple hardware platforms.
  • Apache MXNet – An open-source deep learning framework that supports edge deployment through MXNet Model Server. It allows running MXNet models on edge devices and provides a scalable and efficient inference solution.

These frameworks provide developers with the tools and libraries to optimize, deploy, and execute ML models on edge devices, enabling efficient, real-time inference at the edge of the IoT network.

Automated model retraining and distribution

Automated model retraining and distribution are crucial for keeping edge AI models up to date and adaptable. MLOps automates the entire ML lifecycle, including data collection, deployment, and monitoring. Because data streams evolve, retraining IoT Edge ML models is essential in the dynamic IoT environment, and automated pipelines ensure models evolve with the changing data landscape. Efficient model distribution enables seamless deployment across edge devices, providing consistent performance and scalability.



Figure 2: ML Model Retraining


MLOps manages data flow between ML edge devices and the cloud, enabling continuous model improvement. Edge device inference data is collected, aggregated, compressed, and transmitted to the cloud for analysis and retraining. Data scientists strategize data utilization, identifying patterns and determining retraining needs to keep ML models up to date and effective in the dynamic IoT environment.

Verification, approval, and distribution are integral to the continuous ML process. New model versions undergo rigorous verification, ensuring compliance and alignment with objectives. Approved models are securely distributed to ML edge devices, maintaining reliability and adhering to quality standards for continuous ML operations at the edge.
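As an illustration of the collect-aggregate-compress step, the sketch below summarizes a batch of hypothetical inference records and gzips the JSON payload before upload. The record fields (`label`, `confidence`) are invented for the example:

```python
import gzip
import json
import statistics

def package_inference_batch(records):
    """Aggregate raw edge-inference records into summary statistics and
    gzip the JSON payload before transmitting it to the cloud."""
    confidences = [r["confidence"] for r in records]
    summary = {
        "count": len(records),
        "mean_confidence": statistics.mean(confidences),
        "min_confidence": min(confidences),
        "labels": sorted({r["label"] for r in records}),
    }
    payload = gzip.compress(json.dumps(summary).encode("utf-8"))
    return summary, payload

records = [
    {"label": "ok", "confidence": 0.97},
    {"label": "ok", "confidence": 0.91},
    {"label": "defect", "confidence": 0.62},
]
summary, payload = package_inference_batch(records)
```

Shipping summaries rather than raw streams keeps bandwidth costs low; when the cloud side flags drift (e.g., falling mean confidence), it can request full samples for retraining.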


Continuous machine learning on the edge brings unprecedented possibilities to the IoT landscape. By focusing on data analysis and designing the most suitable ML model, implementing MLOps for automated model retraining and distribution, and leveraging lightweight model inference frameworks, organizations can unlock the full potential of the IoT ecosystem. These three elements drive real-time insights, enhanced decision-making, and improved operational efficiency. As IoT evolves, continuous machine learning on the edge will play an increasingly vital role in shaping a smarter and more connected world.

How Generative AI Will Improve Incident Response Systems in the Manufacturing Industry

Incident Response (IR) systems in the manufacturing industry have evolved significantly over the decades. In the past, these systems were manual and reactive, relying on human observation and intervention. Incident detection was inefficient, as it depended on physical inspections and manual reporting, and responses were often delayed due to the time it took to identify, report, and communicate an incident.

In contrast, modern IR systems are proactive, automated, and data-driven. They leverage advanced technologies such as sensors, IoT devices, and machine learning to detect incidents as they happen and even predict potential incidents based on patterns and trends. Proactive, instantaneous communication of incidents with automated alerts and notifications improves speed of resolution and customer satisfaction.

The most significant differences between legacy and modern systems are their ability to prevent incidents before they occur, their effectiveness, and their user-friendliness. These aspects can be further improved by leveraging cutting-edge Generative AI to transform IR systems into interactive copilots that enable personnel to quickly address new incidents, learn from past ones, proactively predict issues, and create powerful knowledge bases.

Retrieval Augmented Generation and Data Privacy

Generative AI can create human-like text by learning patterns from existing data. Models such as GPT-3 by OpenAI can write essays, answer questions, and summarize text by predicting the next word in a sentence based on the context provided by preceding terms.

Vector databases, on the other hand, are designed to store and query unstructured data (such as images, audio, and text) as high-dimensional vectors. This is achieved by embedding data into a vector representation that machines can work with. Vector databases are particularly useful in machine learning and AI applications, where they can be used to find similarities between different pieces of data. Given a query vector, the database finds the stored vectors closest to it in the embedding space, where similarity is typically based on a distance metric such as Euclidean distance or cosine similarity.
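The similarity search just described can be sketched in a few lines. The three-dimensional "embeddings" below are made up for illustration, whereas real embedding vectors have hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, vectors):
    """Return the key of the stored vector most similar to the query."""
    return max(vectors, key=lambda k: cosine_similarity(query, vectors[k]))

# Toy embeddings keyed by the document they represent.
vectors = {
    "incident report": [0.9, 0.1, 0.0],
    "maintenance log": [0.1, 0.9, 0.2],
    "sensor spec":     [0.0, 0.2, 0.9],
}
best_match = nearest([0.8, 0.2, 0.1], vectors)
```

A production vector database replaces this linear scan with approximate nearest-neighbor indexes so the search stays fast over millions of vectors, but the distance computation is the same idea.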

Retrieval Augmented Generation (RAG) is a way to augment LLMs with additional data coming from a vector database. This can lead to the creation of powerful knowledge bases:

  1. The text is extracted from private data sources (GitHub, YouTube, PowerPoint presentations, text files, chats, etc.), split into chunks, and each chunk is converted into an embedding vector by a model such as those behind ChatGPT. This is shown graphically in the figure below:

  2. When it’s time to retrieve data from the knowledge base, the input prompt is embedded by the same model, the vector database retrieves similar vectors, and the results are refined by the LLM, which has access to conversation memory and takes into account what was previously asked. This is shown below:
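The chunk-embed-retrieve loop of steps 1 and 2 can be sketched with a toy bag-of-words embedding standing in for a real embedding model. The document, vocabulary, and chunk size here are illustrative assumptions:

```python
def embed(text, vocab):
    """Toy bag-of-words embedding; a real system would call an embedding model."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def chunk(text, size=8):
    """Step 1: split a document into fixed-size word chunks for indexing."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(query, chunks, vocab, top_k=1):
    """Step 2: return the top_k chunks whose embeddings best match the query's."""
    q = embed(query, vocab)
    def score(c):
        e = embed(c, vocab)
        return sum(a * b for a, b in zip(q, e))
    return sorted(chunks, key=score, reverse=True)[:top_k]

doc = ("the pump overheated during the night shift "
       "operators replaced the coolant valve and restarted the line")
vocab = sorted(set(doc.lower().split()))
chunks = chunk(doc)
best = retrieve("why did the pump overheat", chunks, vocab)
```

In a real RAG pipeline the retrieved chunks are then pasted into the LLM prompt as context; the retrieval step shown here is what the vector database accelerates.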

However, leveraging this kind of system in the enterprise world was nearly impossible due to data privacy concerns, with multiple instances of confidential data being leaked. Recently, solutions addressing these concerns have been announced:

  1. Open-source models can be run locally via C++ inference libraries. A big downside, however, is their subpar performance compared to ChatGPT, as well as the cost of the hardware needed to run them smoothly, especially at scale.
  2. Azure has incorporated OpenAI’s ChatGPT into its cloud offerings through the Azure OpenAI Service, providing data privacy out of the box while keeping a pay-as-you-go pricing model.

AI Copilot Systems

With cutting-edge technologies like Large Language Models (LLMs) and Vector Databases, it is possible to have an AI Copilot system, an intelligent assistant designed to aid and recommend actions in performing tasks and making decisions. Just like a copilot in an aircraft assists the pilot, an AI Copilot system assists users in navigating and completing complex tasks.

Microsoft is a clear leader on this front, standardizing the architecture for AI Copilots and incorporating the technology into its products, including GitHub and MS Office 365. We aim to include AI Copilot systems in our clients’ internal products to improve their offerings, increase productivity, and speed up the flow of information throughout their organizations.

Generative AI in Incident Response Systems

One of our clients, a leading product and service provider in the manufacturing industry, is involved in complex and potentially hazardous operations. A top-of-the-class IR system will bring significant benefits to the organization, such as improved safety, data-driven decision-making, and better operational practices through reviewing and learning from past incidents and their resolution.

In addition to a state-of-the-art IR system, an AI Copilot enables a conversational experience with the current and past state of the system, streamlining day-to-day processes for engineers. Technologies like LangChain provide powerful ways to connect LLMs to various data sources, including SQL and NoSQL databases, and to create powerful knowledge bases.

By leveraging Generative AI, our client’s IR system can recommend actions to engineers to resolve an incident based on responses to past incidents, predict the likely outcomes from different actions, and help them prioritize responses to multiple incidents. The IR system will include interactive dashboards that provide real-time information about incidents and will be voice-command powered, thus making them more user-friendly and efficient.

The IR system will be able to analyze data from the various IoT devices used in the railway industry, providing a comprehensive picture of an incident. In addition, it will personalize the user experience by learning individual users’ preferences and adapting its interaction style accordingly.

By using image-to-text models to describe the content of images, paired with Computer Vision models for in-depth analysis of incident images, we will be able to connect complex data sources to knowledge bases that LLMs can query. Similarly, we plan to use video classification models to detect issues and accidents in real time, paired with transcription models to extract text from videos into vector databases.

Generative AI is more than a buzzword; it provides powerful tools to transform IR systems. With the integration of GenAI, engineers will streamline their response to minor incidents and major accidents alike and learn from past occurrences. These tools will turn IR systems into interactive, personalized platforms where engineers can find relevant information instantly and receive data-driven insights and recommendations on resolving incidents.