Development of an event classification system

Artificial Intelligence & Machine Learning

Web and mobile applications that process large amounts of data gathered from many sources often need to verify that data in order to maintain content quality. As the volume of processed data grows, manual verification stops being scalable. Solutions based on Artificial Intelligence and Natural Language Processing can help.

Ordnance Survey Leisure

Established: 2009
Industry: Geospatial
Size: 50 - 100
Location: United Kingdom

Ordnance Survey Leisure is a mapping agency that provides digital map data and location-based products for business, government, and consumers. One of their applications encourages users to take part in outdoor activities and events. For this purpose, the underlying platform integrates with many data providers and aggregates information about events and activities, including their names and descriptions. However, the data providers stream a wide variety of events, and some of these fall outside the scope of the application. Such events need to be filtered out, especially those that do not take place outdoors. We leveraged state-of-the-art natural language processing methods to filter them.

We proposed a system that automatically selects the events and activities that should be kept in the application. We built the event classifier for the client’s problem by adapting pre-trained NLP models through transfer learning. Using a relatively small data sample, we achieved ~95% classification accuracy.
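As a rough illustration of the filtering step, here is a minimal sketch of how a classifier's score could decide which events stay in the application, and how classification accuracy is computed. It is not the production code. The names `Event`, `filter_events`, `keyword_stub_classifier`, and the `threshold` parameter are hypothetical, and the keyword-based scorer is only a stand-in for a fine-tuned model so that the example runs without any model download.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Event:
    """Hypothetical event record holding the two text fields named in the case study."""
    name: str
    description: str


# A classifier maps event text to a score in [0, 1]: how likely the event is in scope.
Classifier = Callable[[str], float]


def keyword_stub_classifier(text: str) -> float:
    """Stand-in scorer (assumption): a real system would call a fine-tuned model here."""
    outdoor_words = {"hike", "trail", "park", "cycling", "walk"}
    words = set(re.findall(r"[a-z]+", text.lower()))
    return 1.0 if words & outdoor_words else 0.0


def filter_events(events: Sequence[Event], classify: Classifier,
                  threshold: float = 0.5) -> List[Event]:
    """Keep only the events whose score reaches the threshold."""
    return [e for e in events if classify(f"{e.name}. {e.description}") >= threshold]


def accuracy(predictions: Sequence[bool], labels: Sequence[bool]) -> float:
    """Fraction of predictions that match the manually assigned labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and of equal length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


events = [
    Event("Coastal hike", "A guided walk along the cliff trail"),
    Event("Online webinar", "Learn spreadsheet tips from home"),
]
kept = filter_events(events, keyword_stub_classifier)
```

Passing the classifier in as a plain callable keeps the filtering logic independent of the model, so the stub could be swapped for a transformer-based scorer without changing `filter_events`.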

During the project, we cooperated closely with the client to develop a shared understanding of the business problem and the available data. Thanks to our collaboration at every step of the project lifecycle, we created a system that reflects the business needs of our client.

State-of-the-art NLP models allow us to create innovative solutions that automate decision-making processes. They can be leveraged in any industry that processes text data, even when the data available for training the model is limited.

Technologies used: Hugging Face Transformers, Google BERT, Azure Machine Learning Studio, Python, sklearn, PostgreSQL


Interested in collaborating with us?
Get in touch.
Tomasz Smolarczyk
Head of Artificial Intelligence