Originally published by Julide Oztap, Marketing Lead at Reality Analytics, Inc.
There are two main paradigms for solving classification and detection problems in sensor data: Model-driven, and Data-driven.
Model-Driven is the way everybody learned to do it in Engineering School. Start with a solid idea of how the physical system works – and by extension, how it can break. Consider the states or events you want to detect, and generate a hypothesis about which aspects of them might be detectable from the outside and what the target signal will look like. Then collect samples in the lab and try to confirm a correlation between what you record and what you are trying to detect. Finally, engineer a detector by hand to find those hard-won features out in the real world, automatically.
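To make the model-driven workflow concrete, here is a minimal sketch in Python. Everything specific in it is an assumption made for illustration, not something taken from a real system: the hypothesis that a fault shows up as extra vibration at 120 Hz, the 0.01 power threshold, and the helper names `goertzel_power`, `fault_detected` and `synth`. The detector is engineered by hand around that one hypothesis, using the Goertzel algorithm to measure signal power at a single frequency.

```python
import math

def goertzel_power(samples, sample_rate, target_hz):
    """Normalized signal power at one frequency, via the Goertzel algorithm."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)        # nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    magnitude_sq = s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2
    return magnitude_sq / (n * n)

def fault_detected(samples, sample_rate, fault_hz=120.0, threshold=0.01):
    """Hand-built rule: flag a fault when power at fault_hz exceeds threshold.

    fault_hz and threshold are assumed values that an engineer would derive
    from a physical model and confirm with lab samples.
    """
    return goertzel_power(samples, sample_rate, fault_hz) > threshold

def synth(sample_rate, n, fault_amplitude):
    """Synthetic test signal: a 50 Hz baseline plus an optional 120 Hz fault tone."""
    return [
        math.sin(2 * math.pi * 50 * t / sample_rate)
        + fault_amplitude * math.sin(2 * math.pi * 120 * t / sample_rate)
        for t in range(n)
    ]

if __name__ == "__main__":
    print(fault_detected(synth(1000, 1000, 0.0), 1000))  # healthy signal
    print(fault_detected(synth(1000, 1000, 0.5), 1000))  # signal with fault tone
```

The detector is only as good as the hypothesis behind it: if the real fault signature sits at a different frequency, or depends on several variables at once, this rule will miss it.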
Data-Driven is a new way of thinking, enabled by machine learning. Find an algorithm that can spot connections and correlations that you may not even know to suspect. Turn it loose on the data. Magic follows. But only if you do it right.
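As a rough illustration of the data-driven idea, the sketch below lets a very simple learner (a one-feature threshold rule, often called a decision stump) search labeled examples for whichever measurement best separates faults from normal operation. The feature names, the readings in `ROWS`, and the helpers `fit_stump` and `predict` are all hypothetical, invented for this example. Real machine learning tools search far richer feature spaces, but the principle is the same: the rule comes from the data rather than from a prior hypothesis.

```python
# Hypothetical labeled readings: [temperature_C, rms_vibration, current_draw_A]
ROWS = [
    [40, 0.30, 2.0], [55, 0.42, 2.1], [47, 0.35, 1.9], [60, 0.31, 2.2],  # normal
    [42, 0.33, 3.1], [58, 0.40, 3.4], [50, 0.36, 2.9], [45, 0.29, 3.0],  # fault
]
LABELS = [False] * 4 + [True] * 4   # True means "fault"

def fit_stump(rows, labels):
    """Try every feature and threshold; keep the rule with the fewest training errors.

    Returns (errors, feature_index, threshold, sign), where the rule predicts
    a fault when sign * (value - threshold) > 0.
    """
    best = None
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            for sign in (1, -1):
                preds = [sign * (r[j] - thr) > 0 for r in rows]
                errors = sum(p != y for p, y in zip(preds, labels))
                if best is None or errors < best[0]:
                    best = (errors, j, thr, sign)
    return best

def predict(stump, row):
    """Apply a fitted stump to one new reading."""
    _, j, thr, sign = stump
    return sign * (row[j] - thr) > 0

if __name__ == "__main__":
    stump = fit_stump(ROWS, LABELS)
    print(stump)                             # which feature and threshold were chosen
    print(predict(stump, [52, 0.34, 3.2]))   # classify a new reading
```

In this made-up data set the learner settles on current draw, a variable nobody had to suspect in advance. It can only find that pattern, however, because the examples contain it.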
Both approaches have their pluses and minuses:
1. Model-Driven approaches limit complexity
Figure 1 Model-driven method: Powerful but limited in complexity.
Model-driven approaches are powerful because they rely on a deep understanding of the system or process, and can benefit from scientifically established relationships.
Models can’t accommodate infinite complexity and generally must be simplified. They have trouble accounting for noisy data and for variables that were left out of the model. At some level they’re limited by the amount of complexity their inventors can hold in their heads.
2. Model-Driven is expensive and takes time
Who builds models? The engineers who understand the physical, mechanical, electronic, data flow, or other appropriate details of the complex system – in-house experts or consultants who work for a company and develop its products or operational machinery. These people are generally experienced, very busy, and both scarce and expensive. Furthermore, modeling takes time. It is inherently a trial-and-error approach, rooted in the old scientific method of theory-based hypothesis formation and experiment-based testing. Finding a suitable model and refining it until it produces the desired results is often a lengthy process.
3. Data-Driven is data hungry
Figure 2 Data-driven method: Decent if fed with a lot of data.
Data-Driven approaches based on machine learning require a good bit of data to get decent results. AI tools that discover features and train up classifiers learn from examples, and there need to be enough examples to cover the full range of expected variation and null cases.
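The need to cover the full range of variation can be shown with a small Python sketch. It assumes a hypothetical machine whose healthy vibration grows with load and whose faults add a fixed offset. The numbers and the helpers `nearest_label` and `make_examples` are invented for illustration. The classifier is a plain 1-nearest-neighbour rule, chosen only because it is easy to read.

```python
import math

def nearest_label(train, point):
    """1-nearest-neighbour: return the label of the closest training example."""
    return min(train, key=lambda example: math.dist(example[0], point))[1]

def make_examples(loads):
    """Build (load, vibration) examples for a hypothetical machine.

    Assumption: healthy vibration is 0.1 * load, and a fault adds 1.0.
    """
    examples = []
    for load in loads:
        examples.append(((load, 0.1 * load), "normal"))
        examples.append(((load, 0.1 * load + 1.0), "fault"))
    return examples

narrow = make_examples([1, 2, 3])        # training data from low loads only
full = make_examples(range(1, 11))       # training data across the whole load range

healthy_at_high_load = (9.5, 0.95)       # a healthy reading the narrow set never saw

if __name__ == "__main__":
    print(nearest_label(narrow, healthy_at_high_load))  # 'fault' (a false alarm)
    print(nearest_label(full, healthy_at_high_load))    # 'normal'
```

With training examples from low loads only, a perfectly healthy reading at high load looks more like a fault than like anything normal the classifier has seen. Once the examples span the full operating range, the same reading is classified correctly.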
Some tools (like our Reality AI) are powerful enough to generalize from limited training data and discover viable feature sets and decision criteria on their own, but many machine learning approaches require truly Big Data to get meaningful results and some demand their own type of experts to set them up.
About the author:
Julide Oztap works as Marketing Lead at Reality Analytics and is blogging on topics like AI, Machine Health, Internet of Things and Connected Devices. To learn more about the Reality AI data-driven methods, visit http://www.reality.ai/