Connect IT/OT Using Advanced Analytics for Machine Learning Success
Published on: Wednesday 05-01-2022
Advanced analytics incorporating machine learning helps users create insights for better and faster decision making, say Ashwin Venkat and Mark Derbecker.

Process manufacturers are increasingly making machine learning a top priority—and for good reason. With machine learning models, these organisations can simulate processes and predict outcomes, drive continuous improvement cycles, and enable proactive equipment maintenance and optimisation. That’s something to write home about, especially when the savings and efficiencies realised by applying these models can amount to millions of dollars in just a short amount of time.
However, to enable effective machine learning, manufacturers must do more than simply hire data scientists or find the right algorithm. Applying machine learning in an industrial setting requires participation from engineers and plant operations personnel because these are the team members who understand the organisation’s processes and associated sensor data, both of which are critical to the machine learning effort. These are the people who need to understand and trust the predictions the model creates, and they need to carefully vet changes and verify results before the model is deployed.
When faced with these scenarios, information technology (IT) personnel, such as system administrators and data scientists, bring their algorithm and software knowledge to the table, which often includes programming skills. Operational technology (OT) personnel, such as operations managers and chemical engineers, provide real-world manufacturing and process knowledge. To bring these distinct skillsets together, process manufacturers must create an environment where the two groups can collaborate effectively. Without this, promising machine learning initiatives will likely end as frustrating failures.
To combat these frustrations, process manufacturers should look to advanced analytics applications that enable the IT/OT teamwork needed to drive successful machine learning efforts.
Eliminating the IT/OT divide

Process manufacturers know that the gap between IT and OT departments is nothing new. Data is often trapped in OT silos that IT employees have difficulty accessing. And while OT data is a rich source for machine learning models, machine learning algorithms usually fall outside plant and central group engineers’ area of expertise. Instead, these engineers prefer no-code solutions or tools with intuitive interfaces that can easily be plugged into their existing workflows.
In contrast, data scientists are comfortable with coding and algorithm development, but many are relatively new to the manufacturing sector. This means they can create algorithms but lack manufacturing domain expertise, and consequently don’t understand all the data and its context. This lack of understanding is particularly troubling when it comes to process data, which is often dirty and lacking in context and can therefore lead to incorrect insights.
It’s clear that both sides need each other to complete successful machine learning efforts. Data science teams frequently feel their models are challenging to explain to operators and engineers, yet without collaboration these plant employees distrust the models. Even when data science outputs are accurate, plant employees are still sceptical because they are the ones responsible for explaining results and associated actions to operators, managers, and other stakeholders. This often leads to solutions developed by data science groups stagnating in the prototype phase due to a lack of the buy-in required to move forward.
Choosing the right solution is key for machine learning success
Advanced analytics applications that integrate the IT and OT worlds give machine learning initiatives a much higher success rate. These applications, such as Seeq, can plug into both the established OT world of process management, and the emerging world of data science and cloud technologies from the IT world.
When considering an advanced analytics application to help drive machine learning success, process manufacturers should first consider how it connects to underlying data sources, such as historians, MES, LIMS, and maintenance systems. Streamlined access to a combination of relevant data will enable a rich contextual foundation to perform analyses (Figure 1).
Next, they should consider how the application addresses a key problem: industrial data preparation. Everyone is familiar with the “garbage in/garbage out” axiom, but due to a lack of operations knowledge, some data scientists may find it difficult to determine if data is bad. Additionally, the relevance of data can change in a matter of minutes.
Therefore, knowledge capture from engineers who understand the plant and its processes is critical to confirming the integrity and relevance of the data being modelled. Applications that enable engineers to access data, cleanse and contextualise it—and then collaborate with data scientists in multiple ways—ensure optimal outcomes by empowering process manufacturers to perform data preparation quickly and easily.
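As a rough illustration of what capturing engineering knowledge for data preparation can look like, the sketch below encodes three rules an engineer might supply for one sensor: a plausible range, a limit on how long a reading may stay frozen, and a flag saying whether the unit was actually running. The `CleansingRule` structure, the `cleanse` function, and all thresholds are hypothetical names and values chosen for this example; they are not part of Seeq or any other product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CleansingRule:
    """Engineer-supplied knowledge about one sensor (hypothetical structure)."""
    low: float         # lowest physically plausible reading
    high: float        # highest physically plausible reading
    max_flatline: int  # identical consecutive readings tolerated before the sensor counts as stuck


def cleanse(values: List[float], running: List[bool],
            rule: CleansingRule) -> List[Optional[float]]:
    """Replace samples that fail any rule with None; keep the rest unchanged."""
    cleaned: List[Optional[float]] = []
    run_length = 0
    for i, (value, is_running) in enumerate(zip(values, running)):
        # Count how many times in a row the exact same reading has appeared.
        if i > 0 and value == values[i - 1]:
            run_length += 1
        else:
            run_length = 1
        out_of_range = not (rule.low <= value <= rule.high)
        stuck = run_length > rule.max_flatline
        if out_of_range or stuck or not is_running:
            cleaned.append(None)
        else:
            cleaned.append(value)
    return cleaned


# Illustrative data only: one spike, one frozen stretch, one sample taken during a shutdown.
rule = CleansingRule(low=0.0, high=150.0, max_flatline=3)
values = [50.0, 51.0, 999.0, 52.0, 52.0, 52.0, 52.0, 53.0, 20.0]
running = [True] * 8 + [False]
print(cleanse(values, running, rule))
# [50.0, 51.0, None, 52.0, 52.0, 52.0, None, 53.0, None]
```

The point of the sketch is that every threshold comes from someone who knows the process. A data scientist looking at the raw numbers alone would have little basis for deciding that 20.0 is invalid because the unit was shut down.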
Lastly, process manufacturers must consider how the application integrates with other tools. As more algorithms become available for use in a manufacturing context from a variety of sources, some data science teams will want to use their tools of choice to develop and use their algorithms, such as Azure Machine Learning Studio (Figure 2), AWS SageMaker, or Anaconda.
Others will turn to third-party services, including Amazon Lookout for Equipment and Azure AutoML, for algorithms that address vertical and asset-specific issues associated with distinct areas of the value chain.
There is also the growing open-source ecosystem of libraries available on GitHub and other repositories, including algorithms from software companies making their IP available for end users to review and tune to address their specific needs. Many university research groups are also embracing the open-source model to enable the wider community of OT users to easily access, influence, and generate value from newly developed technology. Recent examples include SysID from Brigham Young University and Stiction Analyzer from HAW Hamburg, both available as open-source Seeq Add-ons (https://seeq12.github.io/gallery/).
Some advanced analytics applications, like Seeq, allow data scientists to create and use a portfolio of algorithms from any of these sources. This provides engineering teams with easy access to review, critique and influence data science developments, and enhances collaboration between data science and engineering teams. Over time and through an iterative set of joint efforts and experiments, the plausibility and accuracy of these models improve. Ultimately, trust and rapport are built between these groups as both sides come to understand that they must rely on each other to ensure machine learning success.
Driving better outcomes with IT/OT collaboration

Advanced analytics applications that enable increased IT/OT collaboration can drive increased machine learning value at an industry level. The level of collaboration among experts that comes from the freedom to deploy proprietary, third-party, or open-source algorithms opens new opportunities to overcome common challenges.
Take petrochemical companies, for example, all of which are required to meet EPA emissions standards. These types of applications help solve this industry-wide challenge by giving companies access to off-the-shelf algorithms that can be easily deployed, while preserving the flexibility to keep proprietary measurements and calculations confidential. With collaborative capabilities, expertise and algorithms can come from anywhere.
Here are a few use case examples illustrating collaborative IT/OT efforts.
Use cases
A large oil and gas company used Seeq to develop a solution for monitoring a critical piece of equipment. From the OT side, the company’s operations team uses the application to understand the operating ranges that improve performance output while maximising the lifetime of the asset. From the IT side, the application enables the company’s data scientists to operationalise and distribute their internally-developed solution for predictive maintenance and what-if analysis.
A pharmaceutical company’s data science team had been using an internally developed unsupervised learning algorithm, written in Python, to proactively detect sensor drift in sensitive batch processes. However, the standard workflow required multiple time-consuming steps: uploading the data, manually cleansing it, exporting the file to the data science team, processing the data through the Python script, and then sending the data back to the team at the site. The company took the same algorithm created by the data science team and plugged it into Seeq to operationalise the model and enable more efficient collaboration between its engineers and the data science team. Over time, the two-way interaction has built trust between the teams, helping them sustain and improve the model.
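The article does not describe the company's drift-detection algorithm, so the sketch below substitutes a well-known statistical technique, a two-sided CUSUM (cumulative sum) check, purely to show the kind of label-free logic such a Python script might contain. The function name `cusum_drift_index`, the `slack` and `threshold` defaults, and the sample readings are all assumptions made for this example.

```python
from statistics import mean, stdev
from typing import List, Optional


def cusum_drift_index(readings: List[float], baseline_size: int,
                      slack: float = 0.5, threshold: float = 5.0) -> Optional[int]:
    """Return the index of the first sample where a two-sided CUSUM signals drift.

    The first `baseline_size` readings are assumed to come from a healthy sensor.
    `slack` and `threshold` are in units of baseline standard deviations and are
    illustrative defaults, not tuned values.
    """
    baseline = readings[:baseline_size]
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has no variation; cannot standardise readings")

    upward = downward = 0.0
    for i in range(baseline_size, len(readings)):
        z = (readings[i] - mu) / sigma
        # Accumulate only deviations larger than the slack, in each direction.
        upward = max(0.0, upward + z - slack)
        downward = max(0.0, downward - z - slack)
        if upward > threshold or downward > threshold:
            return i
    return None


# Illustrative data only: a stable baseline followed by a slow upward ramp.
healthy = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 9.9]
ramp = [10.0 + 0.1 * k for k in range(1, 21)]
print(cusum_drift_index(healthy + ramp, baseline_size=len(healthy)))
print(cusum_drift_index(healthy * 3, baseline_size=len(healthy)))
```

No labelled examples of drift are needed; the method relies only on a period the site engineers agree was healthy. Choosing that baseline is where process knowledge enters.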
A specialty chemical manufacturer wanted to build an accurate forecast of product quality disposition, but it first needed to determine which measured and manipulated variables had the greatest impact on the target signal. By creating a correlation algorithm and deploying it as an add-on tool in Seeq, the company identified the input signals with the greatest effect on product quality (Figure 3).
Additionally, the algorithm automatically calculated the process dynamic lags between the upstream signals and the target variable. A reduced number of signals, with appropriate time delays, was pushed back to Seeq, where a predictive model was deployed. Engineers then validated the model against historical data and found it accurately predicted more than 90% of quality deviations. The manufacturer adopted this new model-based control scheme and now saves more than $500,000 per year that was previously lost to quality downgrades.
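The manufacturer's correlation algorithm is not published, but the general idea in this use case (rank upstream signals by how strongly they relate to the target, and estimate the delay at which each one acts) can be sketched with a plain lagged Pearson correlation. Everything below, including the function names `best_lag` and `rank_inputs`, the signal names, and the synthetic data, is a hypothetical illustration that assumes evenly sampled signals of equal length.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_lag(upstream: List[float], target: List[float],
             max_lag: int) -> Tuple[int, float]:
    """Find the delay (in samples) at which `upstream` correlates most strongly with `target`."""
    if len(upstream) != len(target):
        raise ValueError("signals must have the same length")
    if not 0 <= max_lag < len(target) - 1:
        raise ValueError("max_lag must leave at least two overlapping samples")
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        # Pair upstream[t] with target[t + lag]: the target responds `lag` samples later.
        r = pearson(upstream[:len(upstream) - lag], target[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


def rank_inputs(inputs: Dict[str, List[float]], target: List[float],
                max_lag: int) -> List[Tuple[str, int, float]]:
    """Return (name, lag, correlation) tuples, strongest absolute correlation first."""
    results = [(name, *best_lag(signal, target, max_lag)) for name, signal in inputs.items()]
    return sorted(results, key=lambda item: abs(item[2]), reverse=True)


# Synthetic example: "quality" is "feed_temp" delayed by three samples.
feed_temp = [float((i * 7) % 11) for i in range(40)]
quality = [feed_temp[0]] * 3 + feed_temp[:-3]
unrelated = [float((i * i) % 13) for i in range(40)]
for name, lag, r in rank_inputs({"unrelated": unrelated, "feed_temp": feed_temp}, quality, max_lag=5):
    print(f"{name}: lag={lag} samples, r={r:.2f}")
```

Linear correlation is only one way to score inputs, and it will miss nonlinear relationships. It is used here because it keeps the lag search easy to follow and easy for engineers to sanity-check against known process dead times.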
Conclusion
Machine learning is a critical innovation for process manufacturers looking to increase efficiencies, especially as they deal with more data and increased pressure to improve outcomes. In the past, efforts were often hindered by a lack of collaboration between data scientists and process experts, particularly when applying machine learning algorithms to solve process control problems.
To address this and other issues, an integrated approach to machine learning using advanced analytics applications enables operations and data science teams to work together. By providing access to all data sources, the ability to cleanse and contextualise data, and real-time collaboration capabilities, these applications allow process manufacturers to capitalise on all IT and OT team members’ strengths.
All figures courtesy of Seeq

Ashwin Venkat is the Senior Principal Scientist and Team Lead for the Advanced Analytics/Machine Learning (ML) group at Seeq. He has nearly 20 years of experience developing, implementing, and commercializing new analytics and ML technology. Prior to joining Seeq, he served as an Application Architect for a predictive analytics/ML startup. Venkat has also served in roles at Shell and Chevron, both focused on developing and implementing technology for improving the performance of manufacturing assets.

Mark Derbecker is the Chief Product Officer of Seeq. He previously served as VP of Engineering of Seeq, where he led the product development effort from 2013 to 2021. Derbecker has more than 20 years of experience with technology and startups. Prior to joining Seeq, Derbecker served as Director of Systems Development for aerospace startup Insitu, and before that was a software development engineer with Microsoft working on Windows and Xbox product development.