How to build a Natural Language Interface for your Database

Data-driven Business

In the last couple of years, there’s been a huge rise in the development of NLP technology, and for good reason. With Natural Language Processing, one can build interfaces that allow users to interact with data more intuitively, using natural language. By democratizing access to data, such interfaces can significantly increase the volume and value of insights in data analytics.

So how can one build a Natural Language Interface?

Query a Database in Natural Language

A Natural Language Interface should be able to understand input written in human language and translate it into a language that databases understand, in order to obtain the information you’re after.

There are several approaches to creating a parser that turns a plain-text message into a structured data type. For example, you can create a grammar-based parser and use an NLP algorithm to build that data structure. Another option, if you already have plenty of parsed messages and all of them are from one domain (like a ‘transportation enquiry system’), is to train a neural network and use it for further messages (these two approaches will be compared in our next article).
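To make the grammar-based option more concrete, here is a minimal sketch in Python. The toy grammar (an entity name, an optional "where", then FIELD OP VALUE conditions joined by "and") and the function names are assumptions made for illustration only; a real grammar would be far richer.

```python
import re

# Hypothetical toy grammar (an assumption for illustration):
#   query     := ENTITY ["where"] condition ("and" condition)*
#   condition := FIELD OP VALUE
TOKEN = re.compile(r"\s*(>=|<=|!=|=|>|<|\w+)")
OPERATORS = {">", "<", ">=", "<=", "=", "!="}


def tokenize(text):
    """Split a query into lowercase word and operator tokens."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        tokens.append(match.group(1).lower())
        pos = match.end()
    return tokens


def parse(text):
    """Parse a query into a structured object: an entity plus a list of conditions."""
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("empty query")
    entity, rest = tokens[0], tokens[1:]
    if rest and rest[0] == "where":  # the keyword is optional
        rest = rest[1:]
    conditions = []
    while rest:
        if len(rest) < 3 or rest[1] not in OPERATORS:
            raise ValueError(f"expected FIELD OP VALUE, got: {' '.join(rest)}")
        field, op, value = rest[:3]
        conditions.append({"field": field, "op": op, "value": value})
        rest = rest[3:]
        if rest:
            if rest[0] != "and" or len(rest) == 1:
                raise ValueError(f"expected 'and' followed by a condition, got: {' '.join(rest)}")
            rest = rest[1:]
    return {"entity": entity, "conditions": conditions}


print(parse("customers where age > 30 and city = Paris"))
print(parse("customers age>30"))  # a terser, keyword-style query also parses
```

Because the grammar is explicit, a query that does not fit raises an error instead of being silently misread, which is the main trade-off compared with a trained model.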

When designing a grammar-based parser for an NLP interface, one can use an open source natural language processing library. However, this approach has a number of drawbacks. The first is that the input query has to be a grammatically correct sentence, which is not always the case: users tend to simplify their queries and use symbols not typical for natural language, thus creating a mixture of natural grammar and formal query language.

We at FriendlyData have solved this problem by creating a parsing module based on proprietary formal grammar technology. The module outputs a tree, specially modelled for query analysis purposes, which provides full control over the structure and makes post-processing and query object translation easy and transparent. This enables FriendlyData to accept any query structure and parse both natural language and Google-like (semi-structured) queries, which drastically simplifies the requirements for search query formulation.
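As a rough illustration of what translating a query object can look like, the sketch below turns a small parsed structure into a parameterized SQL statement and runs it against an in-memory SQLite database. This is not FriendlyData's implementation: the shape of the query object, the `SCHEMA` whitelist, and the `to_sql` function are all assumptions chosen for the example.

```python
import sqlite3

ALLOWED_OPS = {"=", "!=", ">", "<", ">=", "<="}

# Hypothetical schema whitelist: table name -> allowed column names.
SCHEMA = {"customers": {"name", "age", "city"}}


def to_sql(query, schema):
    """Translate a parsed query object into a (sql, params) pair."""
    table = query["entity"]
    if table not in schema:
        raise ValueError(f"unknown table: {table}")
    clauses, params = [], []
    for cond in query["conditions"]:
        if cond["field"] not in schema[table]:
            raise ValueError(f"unknown column: {cond['field']}")
        if cond["op"] not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {cond['op']}")
        # Identifiers come from the whitelist; values are bound as parameters.
        clauses.append(f"{cond['field']} {cond['op']} ?")
        params.append(cond["value"])
    sql = f"SELECT * FROM {table}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


# A query object such as a parser might produce for
# "customers where age > 30 and city = Paris" (shape assumed for this sketch).
query = {
    "entity": "customers",
    "conditions": [
        {"field": "age", "op": ">", "value": 30},
        {"field": "city", "op": "=", "value": "Paris"},
    ],
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, age INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("Ann", 34, "Paris"), ("Bob", 25, "Paris"), ("Eve", 41, "Berlin")],
)

sql, params = to_sql(query, SCHEMA)
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

Checking table and column names against a whitelist and binding values as parameters keeps user-supplied text out of the SQL string itself.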

The next thing to take into consideration when creating a parser is that ready-to-use open source NLP libraries contain only popular synonyms. You should therefore ensure that the grammar format you create lets you add new synonyms later. This is where technology like FriendlyData comes in handy: it is based on a simple plain text grammar definition format that can easily be edited and compiled on the fly (adding new synonyms, data types and operations).
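The idea of keeping synonyms in an editable plain text definition can be sketched as follows. The rule format shown here ("canonical: synonym, synonym, ...") is a made-up example, not FriendlyData's actual grammar format, and `load_synonyms` and `normalize` are hypothetical helper names.

```python
import re

# Hypothetical plain text format: "canonical: synonym, synonym, ..."
SYNONYM_RULES = """
# comparison operators
>: over, above, more than, greater than
<: under, below, less than
=: is, equals
"""


def load_synonyms(text):
    """Build a synonym -> canonical mapping from the plain text rules."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        canonical, _, synonyms = line.partition(":")
        for phrase in synonyms.split(","):
            phrase = phrase.strip().lower()
            if phrase:
                mapping[phrase] = canonical.strip()
    return mapping


def normalize(query, mapping):
    """Replace every known synonym in the query with its canonical form."""
    if not mapping:
        return query
    # Try longer phrases first so multi-word synonyms win over shorter ones.
    phrases = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: mapping[m.group(1).lower()], query)


mapping = load_synonyms(SYNONYM_RULES)
print(normalize("orders with total more than 100", mapping))

# Adding a synonym later only requires editing the text and reloading it.
mapping = load_synonyms(SYNONYM_RULES + ">: exceeding\n")
print(normalize("orders with total exceeding 100", mapping))
```

Since the rules live in plain text rather than in code, a new synonym can be added without touching the parser itself.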

Last but not least, don’t forget about accuracy, which plays a critical role when it comes to using data for decision-making. Open source NLP parsers are not 100% accurate, and their accuracy drops dramatically on longer sentences, e.g. when the query contains a chain of 4 or more conditions. FriendlyData’s parsing module is specially modelled for query analysis purposes and is twice as accurate as its open source alternatives.

In general, building an in-house NLP interface is a challenging task: it requires a lot of time and investment, as well as experienced and talented data specialists. But the good news is that there is a quicker and easier way to build an advanced Natural Language Interface directly into your product or internal systems. This alternative is a ready-to-use solution: the FriendlyData API. Integration with FriendlyData is over 100 times faster than building an in-house NLP interface. All you need to do is make a request to a single endpoint and then display the obtained data.

Take advantage of FriendlyData and get your data to work for you!