When I was a kid, Sesame Street was a big thing. One of the skits involved a group of kids. The kids were wearing solid-color shirts except for one kid, who was wearing a striped shirt. The name of the skit was, "Which one of these is not like the others?"
Anomaly detection does something similar, but on a much larger scale. It analyzes a dataset and looks for inconsistencies, outliers, and other data points that differ markedly from the rest. For example, elevators need to be constantly monitored to ensure safe operation. If sensors are installed that record data at regular intervals, an anomaly detection algorithm can anticipate potential problems by notifying maintenance when data points deviate from the norm. It's important to note that the data points should be recorded at regular intervals. In other words, the dataset is a time series.
Microsoft Azure Cognitive Services offers the Anomaly Detector service, which puts a pre-trained anomaly detection machine learning model behind a REST API. You don't have to know anything about anomaly detection to use it. All you do is point the API at your dataset, and it does the rest. It's one of the simpler products in Azure Cognitive Services. This guide will demonstrate how to get started using the C# programming language.
There are two ways to access the Anomaly Detector service. The first is a language-specific client library, available for languages such as C#, JavaScript, and Python. This is the method you'll see in this guide. There is also a language-agnostic REST API endpoint that you can call directly, but the client library makes things much more straightforward. In either case, you're going to need an API key.
To get an API key, create an instance of the Anomaly Detector resource in the Azure Portal. Search for "Anomaly Detector" in the Azure Marketplace. Click the blue Create button and fill out the following form. The only item specific to the Anomaly Detector is Pricing tier. For experiments, the Free F0 tier is plenty, but for production applications, you'll want to use the Standard tier. It's paid but comes with higher quotas and a better SLA.
Click the Keys and Endpoint link in the resource and make note of one of the API keys as well as the endpoint for the web service.
Using the client libraries is similar to other Azure Cognitive Services products. For a C# project, the Anomaly Detector client library lives in the Microsoft.Azure.CognitiveServices.AnomalyDetector NuGet package. For Python, there is a pip package, and for Node.js a package in the npm registry.
Create a new ApiKeyServiceClientCredentials and pass it the API key from the Azure Portal.
using Microsoft.Azure.CognitiveServices.AnomalyDetector;

// ...

var credentials = new ApiKeyServiceClientCredentials(API_KEY);
Then use the credentials to create an AnomalyDetectorClient.
var client = new AnomalyDetectorClient(credentials);
Finally, set the Endpoint of the client to the URL from the Azure Portal.
client.Endpoint = SERVICE_ENDPOINT;
When using the REST API directly, the API key would be sent in the HTTP headers with the request.
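For readers who want to see what the direct REST call might look like, here is a minimal sketch that builds (but does not send) a batch detection request. The header name Ocp-Apim-Subscription-Key is the one Azure Cognitive Services uses for API keys. The route constant is an assumption based on version 1.0 of the service and should be checked against the current API reference. RestRequestSketch and Build are hypothetical names used only for illustration.

```csharp
using System;
using System.Net.Http;
using System.Text;

public static class RestRequestSketch
{
    // Assumed route for batch detection in v1.0 of the REST API.
    // Verify it against the API reference for your resource.
    private const string EntireDetectPath =
        "/anomalydetector/v1.0/timeseries/entire/detect";

    // Builds a POST request carrying the JSON body and the API key header.
    public static HttpRequestMessage Build(string endpoint, string apiKey, string jsonBody)
    {
        var uri = new Uri(new Uri(endpoint), EntireDetectPath);
        var request = new HttpRequestMessage(HttpMethod.Post, uri)
        {
            Content = new StringContent(jsonBody, Encoding.UTF8, "application/json")
        };

        // The API key travels in an HTTP header, not in the body or URL.
        request.Headers.Add("Ocp-Apim-Subscription-Key", apiKey);
        return request;
    }
}
```

The resulting message could then be passed to HttpClient.SendAsync. The client library does the equivalent work for you, which is why the rest of this guide sticks with it.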
The Anomaly Detector service expects the dataset to be in a particular format. The client libraries include a class named Point that represents a single data point. The Point class contains two properties: a Timestamp and a floating-point Value. (Take care not to accidentally import the System.Drawing namespace, as it also has a class named Point.) Again, the data points must form a time series. The service accepts datasets of between 12 and 8640 data points inclusive. The example here extracts the timestamps and values from a JSON file.
var points = new List<Point>();

foreach (var item in data) {
    var p = new Point {
        // Assumes each item also carries a day field.
        Timestamp = new DateTime(item.year, item.month, item.day),
        Value = item.temperature
    };
    points.Add(p);
}
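The loop above reads from a data variable that the guide does not define. One way it could be produced is sketched below using System.Text.Json. The Reading class and ReadingLoader helper are hypothetical, and the JSON shape (an array of objects with year, month, day, and temperature fields) is an assumption inferred from the property names used in the loop.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical shape of one record in the JSON file.
public class Reading
{
    public int Year { get; set; }
    public int Month { get; set; }
    public int Day { get; set; }
    public double Temperature { get; set; }
}

public static class ReadingLoader
{
    // Parses a JSON array of readings; property names are matched
    // case-insensitively so "year" maps to Year, and so on.
    public static List<Reading> Parse(string json)
    {
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        return JsonSerializer.Deserialize<List<Reading>>(json, options)
               ?? new List<Reading>();
    }
}
```

With this sketch, the contents of the file could be read with File.ReadAllText and passed to ReadingLoader.Parse. The properties are PascalCase here, so the loop would read item.Year rather than item.year.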
The Point objects are contained in a Request that is sent to the service. Both the Point and Request classes can be found in the Microsoft.Azure.CognitiveServices.AnomalyDetector.Models namespace. The timestamps in the data points should occur at regular intervals, such as every day, every hour, or every minute, and must be in ascending order. The Granularity property of the Request is an enum that sets the interval. If the data points are spaced at a multiple of one of these intervals (for example, every other day), set the multiplier with the CustomInterval property. If the timestamps do not match the Granularity and CustomInterval properties, the application will throw an exception.
var rq = new Request {
    Series = points,
    Granularity = Granularity.Daily,
    CustomInterval = 2 // every other day
};
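Because a mismatched series only surfaces as an exception when the request is sent, it can help to check the data locally first. The following is a rough sketch of such a check. SeriesValidator is a hypothetical helper and not part of the client library, and it only handles fixed-length intervals that can be expressed as a TimeSpan (a daily granularity with a custom interval of 2 would be TimeSpan.FromDays(2)).

```csharp
using System;
using System.Collections.Generic;

public static class SeriesValidator
{
    // Limits stated for the service: 12 to 8640 points inclusive.
    public const int MinPoints = 12;
    public const int MaxPoints = 8640;

    // Returns true when the timestamps are within the size limits,
    // strictly ascending, and evenly spaced by the given interval.
    public static bool TryValidate(
        IReadOnlyList<DateTime> timestamps, TimeSpan interval, out string problem)
    {
        problem = "";

        if (timestamps.Count < MinPoints || timestamps.Count > MaxPoints)
        {
            problem = $"Series has {timestamps.Count} points; expected {MinPoints} to {MaxPoints}.";
            return false;
        }

        for (int i = 1; i < timestamps.Count; i++)
        {
            TimeSpan gap = timestamps[i] - timestamps[i - 1];
            if (gap <= TimeSpan.Zero)
            {
                problem = $"Timestamp at index {i} is not later than the previous one.";
                return false;
            }
            if (gap != interval)
            {
                problem = $"Gap before index {i} is {gap}; expected {interval}.";
                return false;
            }
        }

        return true;
    }
}
```

Calendar-based granularities such as monthly or yearly do not have a fixed length, so they would need a different comparison than the simple TimeSpan equality used here.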
The other property that is often used is the Sensitivity. This is an integer between 0 and 99 inclusive that specifies how readily the service should classify a data point as an anomaly. Higher values flag more data points as anomalies. As the Sensitivity decreases, the service widens its margin of tolerance and flags fewer.
rq.Sensitivity = 40;
To detect anomalies with the libraries, you call one of two methods on the client object. The first, EntireDetectAsync, is for batch detection. The method accepts a request and returns an EntireDetectResponse with all of the detected anomalies in the IsAnomaly property, which is a List of bool values. Each element in the List corresponds to a Point in the request. A value of true indicates a Point that was predicted to be an anomaly.
var entireResponse = await client.EntireDetectAsync(rq);
Console.WriteLine($"Azure found {entireResponse.IsAnomaly.Count(flag => flag)} anomaly/anomalies");
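Counting the anomalies is a start, but you will usually want to know which points were flagged. Since each element of IsAnomaly lines up with the Point at the same index in the request, the two lists can be walked together. The sketch below shows the idea with plain lists so it runs without the client library. AnomalyReport is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;

public static class AnomalyReport
{
    // Returns the timestamps whose matching flag is true.
    // Both lists must be the same length, index for index.
    public static List<DateTime> AnomalousTimestamps(
        IReadOnlyList<DateTime> timestamps, IReadOnlyList<bool> isAnomaly)
    {
        if (timestamps.Count != isAnomaly.Count)
            throw new ArgumentException("Timestamps and flags must have the same length.");

        var flagged = new List<DateTime>();
        for (int i = 0; i < timestamps.Count; i++)
        {
            if (isAnomaly[i])
                flagged.Add(timestamps[i]);
        }
        return flagged;
    }
}
```

In an application, the timestamps would come from the Point objects in the request and the flags from the IsAnomaly property of the response.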
The second method is LastDetectAsync. It also accepts a request but returns a LastDetectResponse object. The LastDetectAsync method examines the last data point in the dataset and predicts whether it is an anomaly. This is used for real-time anomaly detection in streamed data. The response object has an IsAnomaly property, but this time it is a single bool value, and true again indicates a predicted anomaly.
var lastResponse = await client.LastDetectAsync(rq);
Console.WriteLine($"The most recent data point is {(lastResponse.IsAnomaly ? "" : "not ")}an anomaly");
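With streamed data, the series sent to LastDetectAsync has to be rebuilt as each new reading arrives, while staying within the 12 to 8640 point limits mentioned earlier. One possible approach is a fixed-capacity window that drops the oldest reading when a new one pushes it over the limit. SlidingWindow below is a hypothetical helper, not part of the client library, and is only a sketch of that idea.

```csharp
using System;
using System.Collections.Generic;

public class SlidingWindow<T>
{
    // Limits stated for the service: 12 to 8640 points inclusive.
    public const int MinPoints = 12;
    public const int MaxPoints = 8640;

    private readonly Queue<T> _items = new Queue<T>();
    private readonly int _capacity;

    public SlidingWindow(int capacity = MaxPoints)
    {
        if (capacity < MinPoints || capacity > MaxPoints)
            throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    // Adds the newest reading and evicts the oldest one if the window is full.
    public void Add(T item)
    {
        _items.Enqueue(item);
        if (_items.Count > _capacity)
            _items.Dequeue();
    }

    // True once enough readings have arrived to send a request.
    public bool IsReady => _items.Count >= MinPoints;

    // Copies the current readings, oldest first, for use as a request series.
    public List<T> Snapshot() => new List<T>(_items);
}
```

Each time a reading arrives, the application would add it to the window and, once IsReady is true, use Snapshot to fill the Series of a new request.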
Consider this very simple dataset:
List<int> data = new List<int>()
{
    0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1
};
As you can see, the value 0 outnumbers the value 1, so a 1 would be considered out of place. The EntireDetectAsync method would detect four to five anomalies. There is also a 1 at the end of the dataset, so the LastDetectAsync method would detect the last data point as an anomaly. A real-world dataset would be much more complex, but this example makes the anomalies obvious.
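The list above contains only values, while the service needs a timestamp for every point. As a rough sketch, the values could be paired with daily timestamps as shown below. The start date is an arbitrary assumption, SampleSeries is a hypothetical helper, and tuples stand in for the library's Point class so the example runs on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SampleSeries
{
    // The same 18 values as the simple dataset in this guide.
    public static readonly int[] Values =
    {
        0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1
    };

    // Pairs each value with a timestamp one day after the previous one.
    public static List<(DateTime Timestamp, double Value)> Build(DateTime start)
    {
        return Values
            .Select((v, i) => (Timestamp: start.AddDays(i), Value: (double)v))
            .ToList();
    }
}
```

The 18 points comfortably clear the 12-point minimum, and the daily spacing matches a Granularity of Daily with no CustomInterval.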
The Azure Cognitive Services Anomaly Detector predicts whether values in a dataset are outliers or different from the rest of the data points. It is implemented as a REST API, so you don't have to know anything about anomaly detection. Whether you use the client libraries described in this guide or the API endpoints, you'll be able to detect anomalies in batches or from streaming data. And you can adjust the sensitivity of the algorithm to permit the right range of anomalies for your application. Of course, how the anomalies are interpreted is outside the scope of the service. Azure is very capable, but it can't read your mind.
Thanks for reading!