Project Stage 1

Project Stage I - Named Entity Recognition

Dataset

A dataset of BBC news articles was used for the project. About 386 documents from the Entertainment section were extracted from the above link; 330 of these were annotated and used for this project.

The raw dataset and the annotated dataset can be found below along with the dev and test splits:

Raw
- Set I - Dev Set
- Set J - Test Set
Annotated
- Set I - Dev Set
- Set J - Test Set

Source Code

Link

Results

On the test set, our model achieved the following results:

Precision	Recall	F1 Score
0.9140	0.8878	0.9007
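
As a sanity check on the table above, the F1 score can be recomputed from the reported precision and recall. This is a minimal sketch, assuming F1 here is the standard harmonic mean of precision and recall; the function name `f1_score` is illustrative and is not taken from the project's source code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (assumed F1 definition)."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the results table; not recomputed from the dataset.
reported_precision = 0.9140
reported_recall = 0.8878

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.4f}")  # prints "F1 = 0.9007"
```

Under that assumption, the recomputed value agrees with the reported F1 score of 0.9007 to four decimal places.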
