Machine learning is excellent for uncovering trends in data
Published on: Friday 01-01-2021
Seth DeLand, Application Manager at MathWorks for Data Analytics.

More and more businesses are talking about using machine learning, but what is the present status of applications across business verticals?
The status of applications varies significantly across verticals. While machine learning has matured in applications related to marketing and sales, in places like R&D and manufacturing operations most companies are just beginning to realise the return on investment they can get from machine learning. One of the critical applications that has emerged in these areas is predictive maintenance: the use of machine learning to predict when a piece of equipment needs maintenance. But the successful deployment of predictive maintenance takes time. You need to collect sensor data from the equipment, use that data to build and validate predictive models, then test those models alongside existing operations to gain confidence that they work properly. This takes much more time than building a machine learning model that can predict whether or not someone will click on an advertisement on a website, but we see more and more companies beginning projects to incorporate predictive maintenance into their equipment and operations.
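As a rough illustration of the build-and-validate step described above, the sketch below fits a straight-line trend to vibration readings, checks it against held-out readings, and extrapolates to a maintenance threshold. The readings, the threshold value, and the function name `fit_line` are all hypothetical, and a linear trend is only an assumption for illustration, not a recommended model for real equipment.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical sensor log: operating hours and vibration level.
hours = [0, 10, 20, 30, 40, 50, 60, 70]
vibration = [1.02, 1.18, 1.41, 1.61, 1.79, 2.01, 2.19, 2.42]

# Build the model on the earlier readings only.
slope, intercept = fit_line(hours[:6], vibration[:6])

# Validate on the readings the model has not seen.
validation_error = max(
    abs(slope * h + intercept - v) for h, v in zip(hours[6:], vibration[6:])
)

# Assumed maintenance threshold (hypothetical value).
threshold = 3.0
hours_to_threshold = (threshold - intercept) / slope

print(f"worst validation error: {validation_error:.3f}")
print(f"estimated hours until threshold: {hours_to_threshold:.0f}")
```

In practice the model would also run alongside existing maintenance procedures for some time before anyone relies on its predictions.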
What are the prerequisites to start the process of machine learning in a business?
From a technical perspective, the only requirements are data and software that can be used to build machine learning models. But two non-technical requirements are just as crucial to a successful machine learning project:
i. A clear problem statement that guides the application of machine learning. Yes, machine learning is excellent for uncovering trends in data, but it cannot tell you which problem you should be solving. A concrete problem statement will guide the collection of data and provide current benchmarks so that the performance of the machine learning model can be assessed.
ii. Domain expertise about the problem being solved. Data scientists are skilled at applying machine learning techniques to data, but without domain knowledge of the problem being solved, it is impossible to distinguish correlation from causation or signal from noise. This can be addressed either by pairing the data scientists with someone who has an intimate understanding of the system (such as a process engineer or R&D engineer) or by empowering those with the domain expertise to apply machine learning themselves. The latter has been our primary focus with MATLAB: enabling engineers to apply machine learning using point-and-click apps designed for those with little to no machine learning expertise.
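To make the idea of a benchmark in the first point concrete, here is a minimal sketch that compares a model's predictions with the current practice on the same historical cases. All numbers are invented for illustration, and mean absolute error is just one assumed choice of metric; the problem statement would determine the right one.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute gap between observed and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical historical outcomes (for example, days until a part failed).
actual = [10.0, 12.0, 9.0, 11.0]

# Current practice: one fixed estimate for every case (assumed benchmark).
benchmark_predictions = [10.5, 10.5, 10.5, 10.5]

# Predictions from a candidate machine learning model (made-up values).
model_predictions = [10.2, 11.5, 9.4, 10.9]

benchmark_mae = mean_absolute_error(actual, benchmark_predictions)
model_mae = mean_absolute_error(actual, model_predictions)
model_beats_benchmark = model_mae < benchmark_mae

print(f"benchmark MAE: {benchmark_mae:.2f}, model MAE: {model_mae:.2f}")
```

Without the benchmark number there is no way to say whether the model's error is good or bad for the business.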
How reliable are present methods for turning raw data into useful information, which is crucial for ML?
One of the significant challenges with machine learning is that there are many different types of models, and it is impossible to know in advance which one will be best suited to a particular problem. So there may be a method that is quite reliable for a given dataset, but finding that method can be nontrivial. This can be quite intimidating for a new project, as initial results may not be representative of the true potential of machine learning. In my experience, the best thing to do is to look for similar successful applications and use those as a starting point. Rather than starting with a blog post about how machine learning is used to recommend what movie someone should watch next, you might instead look for examples of machine learning applied to sensor data from industrial machinery. In support of that, we have developed reference examples in MATLAB for predictive maintenance that show how to apply machine learning methods to various pieces of equipment such as bearings, gearboxes, pumps, motors, and batteries.
What are the different machine learning techniques and processes, and the criteria to be followed in selecting the most appropriate for a given business?
Large investments in machine learning research have resulted in a wide variety of techniques tailored to different types of problems. At the highest level, there are three different types of machine learning techniques:
1. Supervised Learning: Involves training a model on input and output data, where the goal is to create a model that can predict future output values. This is perhaps the most mature area of machine learning and includes popular applications like forecasting, image classification, and predictive maintenance.
2. Unsupervised Learning: Involves training a model on historical data to infer patterns in that data. The most popular application of unsupervised learning is clustering, where machine learning techniques are used to identify distinct clusters or groups in historical data. This is commonly used in applications like marketing, where customers are segmented into different sets based on similar attributes.
3. Reinforcement Learning: Involves training a model that can make real-time decisions for controlling a dynamic system. In this case, models are trained using data generated dynamically from simulation models. A typical application of reinforcement learning is robotics, where models are used for complex tasks such as trajectory planning and for teaching behaviours such as locomotion.
The critical criteria for selecting the most appropriate approach are the application and the available data. For systems where there is an input/output relationship and the aim is to predict the output, supervised learning techniques make sense. When the sensor data are unlabelled and the goal is to identify different operating modes or anomalies, unsupervised learning techniques are the right choice. If the goal is to create a model that can control a dynamic system, then reinforcement learning is an excellent place to start.
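The unsupervised case can be sketched in a few lines. The example below applies plain k-means clustering to unlabelled scalar sensor readings to separate two operating modes. The readings, the choice of two modes, and the function name `find_operating_modes` are assumptions made for illustration; real sensor data would usually have many features and call for an established library.

```python
from statistics import mean


def find_operating_modes(readings, k=2, max_iter=100):
    """Group scalar readings into k clusters with basic k-means."""
    lo, hi = min(readings), max(readings)
    # Deterministic start: spread the initial centres across the data range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(readings)
    for _ in range(max_iter):
        # Assign each reading to its nearest centre.
        labels = [
            min(range(k), key=lambda i: abs(r - centers[i])) for r in readings
        ]
        # Move each centre to the mean of its assigned readings.
        updated = []
        for i in range(k):
            members = [r for r, lab in zip(readings, labels) if lab == i]
            updated.append(mean(members) if members else centers[i])
        if updated == centers:
            break
        centers = updated
    return centers, labels


# Hypothetical unlabelled readings from a machine that idles and runs under load.
readings = [1.0, 1.1, 0.9, 1.05, 5.0, 5.2, 4.9, 5.1, 1.0, 4.95]
centers, labels = find_operating_modes(readings)

print("mode centres:", centers)
print("mode per reading:", labels)
```

No labels were supplied here: the two groups come out of the data alone, and it still takes someone who knows the equipment to say which group is idling and which is running under load.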
The manufacturing sector still has inefficiencies, even as machines are becoming smarter. How can machine learning help in such cases?
One of the side effects of machines becoming smarter is that there are more sensors and telemetry collecting data from the devices. This data, when stored in an organised manner, creates an opportunity for machine learning to be applied to improve product quality, improve yield, or prevent downtime. Without this data, machine learning isn’t useful.
The manufacturing sector also needs to make sure that it applies machine learning in places where it stands to have a significant impact on these inefficiencies. The return on investment from a machine learning project should be considered when prioritising which projects should move forward. It makes sense to prioritise applications such as predictive maintenance, where there are already many stories of successfully reducing downtime.
What are the variables – that crucial ‘X’ – on which the success of machine learning depends?
In my experience, one of the most critical, but often overlooked, variables is having relevant domain expertise about the system or process that machine learning is being applied to. Without this domain expertise, companies too often overlook important details and end up with a model that is meaningless or not useful. People with domain expertise can look at the raw data and quickly identify whether something interesting is going on, or whether it is just a normal part of the process. What looks like an exciting finding to a data scientist might be quickly identified as a change in operating mode by a domain expert. As mentioned earlier, this domain expertise is commonly found in the engineering teams who build and maintain such systems. Empowering those engineers to combine their domain expertise with machine learning is the key to success.