August 23, 2020

Data augmentation with transformer models for named entity recognition

Language model based pre-trained models such as BERT have provided significant gains across different NLP tasks. For many NLP tasks, labeled training data is scarce and acquiring them is a expensive and demanding task. Data augmentation can help increasing the data efficiency by artificially perturbing the labeled training samples to increase the absolute number of available data points. In NLP this is commonly achieved by replacing words by synonyms based on dictionaries or translating to a different language and back1. Read more

June 8, 2019

Interpretable named entity recognition with keras and LIME

In the previous posts, we saw how to build strong and versatile named entity recognition systems and how to properly evaluate them. But often you want to understand your model beyond the metrics. So in this tutorial I will show you how you can build an explainable and interpretable NER system with keras and the LIME algorithm. What does explainable mean? Deep neural networks are quite successful in many use-cases, but these models can be hard to debug and to understand what’s going on. Read more

December 26, 2018

Text analysis with named entity recognition

This is the second post of my series about understanding text datasets. If you read my blog regularly, you probably noticed quite some posts about named entity recognition. In this posts, we focused on finding the named entities and explored different techniques to do this. This time we use the named entities to get some information about our data set. We use the “Quora Insincere Questions Classification” dataset from kaggle. In this competition, Kagglers will develop models that identify and flag insincere questions. Read more

December 10, 2018

Named entity recognition with Bert

In 2018 we saw the rise of pretraining and finetuning in natural language processing. Large neural networks have been trained on general tasks like language modeling and then fine-tuned for classification tasks. One of the latest milestones in this development is the release of BERT. BERT is a model that broke several records for how well models can handle language-based tasks. If you want more details about the model and the pre-training, you find some resources at the end of this post. Read more

August 24, 2018

Evaluate sequence models in python

An important part of every machine learning project is the proper evaluation of the performance of the system. In this post we will talk about evaluation of token-based sequence models. This is especially tricky because: some entity types occur more often then others entities can span multiple tokens. The first problem is solved by picking the right metric. We will see what to kinds of metrics are suitable for this. Read more

July 1, 2018

State-of-the-art named entity recognition with residual LSTM and ELMo

This is the sixth post in my series about named entity recognition. This time I’m going to show you some cutting edge stuff. We will use a residual LSTM network together with ELMo embeddings, developed at Allen NLP. You will learn how to wrap a tensorflow hub pre-trained model to work with keras. The resulting model with give you state-of-the-art performance on the named entity recognition task.

April 15, 2018

Enhancing LSTMs with character embeddings for Named entity recognition

This is the fifth post in my series about named entity recognition. If you haven’t seen the last four, have a look now. The last time we used a CRF-LSTM to model the sequence structure of our sentences. We used the LSTM on word level and applied word embeddings. While this approach is straight forward and often yields strong results there are some potential shortcomings. If we haven’t seen a word a prediction time, we have to encode it as unknown and have to infer it’s meaning by it’s surrounding words. Read more

November 27, 2017

Sequence tagging with LSTM-CRFs

This is the fourth post in my series about named entity recognition. If you haven’t seen the last three, have a look now. The last time we used a recurrent neural network to model the sequence structure of our sentences. Now we use a hybrid approach combining a bidirectional LSTM model and a CRF model. This is a state-of-the-art approach to named entity recognition. Let’s recall the situation from the article about conditional random fields. Read more

October 22, 2017

Guide to sequence tagging with neural networks

This is the third post in my series about named entity recognition. If you haven’t seen the last two, have a look now. The last time we used a conditional random field to model the sequence structure of our sentences. This time we use a LSTM model to do the tagging. At the end of this guide, you will know how to use neural networks to tag sequences of words. For more details on neural nets and LSTM in particular, I suggest to read this excellent post. Read more

September 10, 2017

Named entity recognition with conditional random fields in python

This is the second post in my series about named entity recognition. If you haven’t seen the first one, have a look now. Last time we started by memorizing entities for words and then used a simple classification model to improve the results a bit. This model also used context properties and the structure of the word in question. But the results where not overwhelmingly good, so now we’re going to look into a more sophisticated algorithm, a so called conditional random field (CRF). Read more

August 26, 2017

Introduction to named entity recognition in python

In this post, I will introduce you to something called Named Entity Recognition (NER). NER is a part of natural language processing (NLP) and information retrieval (IR). The task in NER is to find the entity-type of words. Entities can, for example, be locations, time expressions or names. If you want to run the tutorial yourself, you can find the dataset here. Now we load it and peak at a few examples. Read more

Privacy Imprint

© depends-on-the-definition 2017-2020