Identifying a faulty turbine in a wind farm, which might require analyzing hundreds of signals and hundreds of thousands of data points, is like looking for a needle in a haystack.
Engineers often tackle this complex problem using deep learning models that can detect anomalies in the measurements taken repeatedly over time by each turbine (called time series data).
However, with hundreds of wind turbines recording dozens of signals every hour, training a deep learning model to analyze time series data is costly and cumbersome. To make matters worse, the model may need to be retrained after deployment, and wind farm operators may lack the necessary machine learning expertise.
In a new study, MIT researchers found that large language models (LLMs) have the potential to be more efficient anomaly detectors for time series data. Importantly, these pretrained models can be used right out of the box.
The researchers developed a framework called SigLLM, which includes a component that converts time series data into text-based inputs an LLM can process. A user can feed this prepared data into the model and ask it to start detecting anomalies. The LLM can also be used to forecast future time series data points as part of an anomaly detection pipeline.
While LLMs could not beat state-of-the-art deep learning models at anomaly detection, they performed as well as some other AI approaches. If researchers can improve the performance of LLMs, this framework could help engineers flag potential problems in equipment like heavy machinery or satellites before they occur, without the need to train an expensive deep learning model.
“Since this is just the first iteration, we didn’t expect to get there on the first try, but these results show that there is an opportunity here to leverage LLMs for complex anomaly detection tasks,” says Sarah Alnegheimish, a PhD student in Electrical Engineering and Computer Science (EECS) and lead author of a paper on SigLLM.
Her co-authors include Linh Nguyen, an EECS student; Laure Berti-Equille, a research director at the French National Research Institute for Sustainable Development; and senior author Kalyan Veeramachaneni, a senior scientist in the Laboratory for Information and Decision Systems. The research will be presented at the IEEE Conference on Data Science and Advanced Analytics.
An off-the-shelf solution
Large language models are autoregressive, meaning they can understand that the newest values in sequential data depend on previous values. For example, models like GPT-4 can predict the next word in a sentence based on the words that come before it.
Because time series data is sequential, the researchers thought the autoregressive nature of LLMs might make them well suited to detecting anomalies in this type of data.
However, they wanted to develop a technique that does not require fine-tuning, a process in which engineers retrain a general-purpose LLM on a small amount of task-specific data to make it an expert at one particular task. Instead, the researchers use a ready-made LLM, with no additional training steps.
But before they could use it, they had to convert time series data into text-based inputs the language model could process.
They accomplished this through a series of transformations that capture the most important parts of the time series while representing the data with the fewest number of tokens. Tokens are the basic inputs to an LLM, and more tokens require more computation.
“If you are not very careful with these steps, you might cut off a part of your data that matters and lose that information,” says Alnegheimish.
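As a rough illustration of the kind of transformation described above, the sketch below (in Python, with hypothetical function and parameter names) shifts a series so it is non-negative, rounds it to a fixed precision, and joins the values into a compact comma-separated string of digits. It is a minimal sketch of the general idea; the exact steps SigLLM uses may differ.

import numpy as np

def series_to_text(values, precision=2):
    # Illustrative only: shift the series so it is non-negative (no sign tokens),
    # round to a fixed precision, and drop the decimal point so each value
    # becomes a short run of digit characters.
    arr = np.asarray(values, dtype=float)
    arr = arr - arr.min()
    scaled = np.round(arr * (10 ** precision)).astype(int)
    return ",".join(str(v) for v in scaled)

# A short sensor trace becomes a token-efficient string.
print(series_to_text([0.731, 0.742, 0.739, 1.912, 0.745]))  # prints "0,1,1,118,1"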
Once the researchers figured out how to transform time series data, they developed two approaches to anomaly detection.
Approaches to anomaly detection
In the first approach, which they call Prompter, they feed the prepared data into the model and prompt it to locate anomalous values.
“We had to iterate a number of times to figure out the right prompts for a given time series. It is challenging to understand how these LLMs ingest and process the data,” Alnegheimish adds.
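A minimal sketch of what a Prompter-style query might look like is shown below. The wording is hypothetical, not the prompt used in the study, and as the quote above suggests, finding phrasing that works for a given series takes iteration.

def build_prompter_query(series_text):
    # Hypothetical prompt wording; the study's actual prompts are not reproduced here.
    return (
        "The following is a time series of sensor readings, given as "
        "comma-separated values:\n" + series_text + "\n"
        "List the zero-based indices of any values that look anomalous, "
        "as a comma-separated list. If none look anomalous, answer 'none'."
    )

# The resulting string would be sent to an off-the-shelf LLM through whatever
# chat or completion client is available, and the reply parsed back into indices.
prompt = build_prompter_query("0,1,1,118,1")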
In the second approach, called Detector, they use the LLM as a forecasting tool to predict the next value in a time series. The researchers compare the predicted value with the actual value. A large discrepancy suggests that the actual value is likely an anomaly.
With Detector, the LLM would be part of an anomaly detection pipeline, while Prompter would do the job on its own. In practice, Detector performed better than Prompter, which generated many false positives.
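The forecast-and-compare idea behind Detector can be sketched as follows; the error measure and threshold here are illustrative assumptions, not the pipeline's actual settings.

import numpy as np

def flag_anomalies(actual, predicted, threshold=2.0):
    # Flag points whose forecast error is far above typical: here, more than
    # `threshold` standard deviations above the mean absolute error.
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = np.abs(actual - predicted)
    cutoff = errors.mean() + threshold * errors.std()
    return np.where(errors > cutoff)[0]

# `predicted` would come from the LLM forecasting each next value in turn.
actual    = [0.73, 0.74, 0.73, 0.75, 1.91, 0.74, 0.73, 0.75]
predicted = [0.74, 0.73, 0.74, 0.74, 0.75, 0.75, 0.74, 0.74]
print(flag_anomalies(actual, predicted))  # flags index 4, the spike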
“I think with the Prompter approach, we imposed too many hurdles on the LLM. We gave it a harder problem to solve,” says Veeramachaneni.
When they compared both approaches to current techniques, Detector outperformed transformer-based AI models on seven of the eleven datasets evaluated, even though the LLM required no training or fine-tuning.
In the future, an LLM could also provide plain-language explanations for its predictions, so an operator could better understand why the LLM identified a particular data point as anomalous.
However, state-of-the-art deep learning models performed significantly better than LLMs, showing that there is still plenty of work to be done before an LLM can be used for anomaly detection.
“What needs to happen for it to perform as well as these state-of-the-art models? That’s the million-dollar question we’re asking right now. An LLM-based anomaly detector needs to be a game-changer for us to make this effort worthwhile,” says Veeramachaneni.
Going forward, the researchers want to see whether fine-tuning can improve performance, though this would require additional time, cost, and expertise for training.
Their LLM approaches take between 30 minutes and two hours to produce results, so increasing speed is a key area of future work. The researchers also want to probe LLMs to understand how they detect anomalies, in the hope of finding a way to improve their performance.
“When it comes to complex tasks like detecting anomalies in time series, LLMs are a real contender. Maybe LLMs can also be used to tackle other complex tasks?” says Alnegheimish.
This research was supported by SES S.A., Iberdrola and ScottishPower Renewables, and Hyundai Motor Company.