Anomaly detection in search engine logs

Assignment: A system capable of detecting anomalies in data produced by a web search engine.

Input data

A web search engine produces and stores a large amount of metadata describing each processed search request in detail. This information can be used to estimate the current state and load of the search engine itself and of the other components involved.

Anomalies in the trends of these data can indicate problems such as broken components, failures of external services, or failure of the logging itself.

Our solution 

After analyzing the data logs, we have identified important variables and designed statistical aggregations that highlight potential anomalies. 
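To illustrate what such an aggregation might look like, here is a minimal sketch that groups log records into one-minute buckets and computes a few statistics per bucket. The record fields (`timestamp`, `status`, `latency_ms`) and the chosen statistics are assumptions made for this example, not the actual variables used in the project.

```python
from collections import defaultdict


def aggregate_per_minute(records):
    """Group log records into one-minute buckets and compute simple statistics.

    Each record is assumed to be a dict with the hypothetical keys
    'timestamp' (seconds since epoch), 'status' and 'latency_ms'.
    """
    buckets = defaultdict(list)
    for rec in records:
        # Round the timestamp down to the start of its minute.
        minute = int(rec["timestamp"]) // 60 * 60
        buckets[minute].append(rec)

    series = {}
    for minute, recs in sorted(buckets.items()):
        n = len(recs)
        errors = sum(1 for r in recs if r["status"] >= 500)
        series[minute] = {
            "requests": n,
            "error_rate": errors / n,
            "mean_latency_ms": sum(r["latency_ms"] for r in recs) / n,
        }
    return series
```

Each statistic then forms its own time series. A sudden drop in `requests`, for example, could point to a logging failure, while a jump in `error_rate` could point to a broken component.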

We have implemented algorithms to remove seasonality and noise from the aggregated data and developed a set of detectors to find anomalies in the processed data series.
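The two steps can be sketched as follows. This is only an illustration under stated assumptions: seasonality is removed by subtracting the median of each seasonal slot, and anomalies are flagged with a robust z-score based on the median absolute deviation (MAD). Both techniques, the function names and the default threshold are choices made for this example. The project's actual algorithms are not described here.

```python
from statistics import median


def remove_seasonality(values, period):
    """Subtract the median of each seasonal slot (e.g. the same minute of day).

    Assumes len(values) >= period, so that every slot has at least one value.
    """
    slot_medians = [median(values[i::period]) for i in range(period)]
    return [v - slot_medians[i % period] for i, v in enumerate(values)]


def detect_anomalies(residuals, threshold=3.5):
    """Return indices whose robust z-score (based on MAD) exceeds the threshold."""
    center = median(residuals)
    mad = median([abs(r - center) for r in residuals])
    if mad == 0:
        # Degenerate case: more than half of the values equal the median,
        # so the scale is undefined. Flag every value that differs from it.
        return [i for i, r in enumerate(residuals) if r != center]
    # 0.6745 scales the MAD so the score is comparable to a standard z-score
    # for normally distributed data.
    return [
        i for i, r in enumerate(residuals)
        if 0.6745 * abs(r - center) / mad > threshold
    ]
```

Medians are used instead of means so that the anomalies themselves do not distort the seasonal profile or the noise estimate.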

Finally, we have tuned the detectors’ parameters to optimize the accuracy of the detection.
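One common way to tune such parameters is a grid search against a set of manually labelled anomalies. The sketch below assumes that labelled data is available and uses the F1 score as the quality measure. Both are assumptions for this example, as are the names `f1_score` and `tune_threshold`.

```python
def f1_score(predicted, actual):
    """F1 score of predicted anomaly indices against labelled ones."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


def tune_threshold(detector, series, labelled_anomalies, candidates):
    """Try each candidate threshold and return (best_threshold, best_score).

    `detector` is any callable taking (series, threshold) and returning
    a list of anomalous indices.
    """
    best_threshold, best_score = None, -1.0
    for threshold in candidates:
        score = f1_score(detector(series, threshold), labelled_anomalies)
        if score > best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score
```

A threshold that is too low floods the operators with false alarms, while one that is too high misses real incidents. The F1 score balances the two.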

The result

We have developed an anomaly detector that finds anomalies in real time and alerts the operators to possible problems.

About Seznam.cz

Seznam.cz is one of the top Czech IT and media companies and runs the second most used web search engine in the Czech Republic.
