Eurovision Song Analysis: Topic Analysis and NER
This project examines the songs that have competed in Eurovision. The aim is to identify the differences between winning and losing songs and to offer suggestions accordingly. You can find the full code in the GitHub repo.
Project steps:
- Data Preprocessing
- Topic Modelling
- Post-Processing
- Sentiment Analysis
- Named Entity Recognition
- Results
Data Preprocessing
Data preprocessing is an important step in NLP, as in other machine learning projects. As data scientists, the cleaner we can make the data, the better the results we get. Since we work with unstructured text, the preprocessing steps differ from those used for structured data. You can find the article in which I explain the data preprocessing steps for NLP here.
Data Cleaning: Data preprocessing involves cleaning up unnecessary or noisy data. Removing special characters, numbers, and punctuation marks, correcting typos, and fixing irregularities all improve the quality of the data.
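As a rough illustration of the cleaning step described above, here is a minimal sketch using only the Python standard library. The function name `clean_lyrics` and the exact rules (lowercasing, dropping apostrophes, replacing digits and special characters with spaces) are my assumptions for this example, not necessarily what the project code does. Typo correction is left out, since it usually needs a dedicated spell-checking library.

```python
import re


def clean_lyrics(text: str) -> str:
    """Hypothetical helper: normalise one song's lyrics for later NLP steps."""
    # Lowercase so "Love" and "love" count as the same token.
    text = text.lower()
    # Drop apostrophes without adding a space, so "don't" becomes "dont".
    text = re.sub(r"['’]", "", text)
    # Replace punctuation, special characters, digits and underscores with spaces.
    # \w is Unicode-aware in Python 3, so accented letters are kept.
    text = re.sub(r"[^\w\s]|[\d_]", " ", text)
    # Collapse runs of whitespace (including newlines) into single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    return text


if __name__ == "__main__":
    print(clean_lyrics("Rise like a Phoenix!! 2014 #winner"))
```

Running the script prints `rise like a phoenix winner`. Keeping accented letters is a deliberate choice here, because Eurovision lyrics are not always in English.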