Project Stage I - Named Entity Recognition
Dataset
A dataset of news articles from the BBC was used for the project. About 386 documents from the Entertainment section were extracted from the above link, and 330 of these were annotated and used for this project.
The raw dataset and the annotated dataset can be found below along with the dev and test splits:
Source Code
Results
On the test set, our model achieved the following results:
Precision | Recall | F1 Score |
---|---|---|
0.9140 | 0.8878 | 0.9007 |
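
As a quick sanity check, the F1 score in the table can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The snippet below is a minimal sketch and not part of the project's source code; the helper name `f1_score` is hypothetical and used only for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (hypothetical helper, not from the project)."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the results table above.
precision = 0.9140
recall = 0.8878

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.4f}")  # prints "F1 = 0.9007"
```

The recomputed value matches the reported F1 score to four decimal places.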