Analyzing IMDB movie review data for sentiment analysis

This is my first neural network project. I used preprocessed data from IMDB to classify the sentiment (positive or negative) of movie reviews, working with a dataset of 50,000 reviews (25,000 for training and 25,000 for testing). I tried to detail every step and decision I made while creating the model. In the end, the neural network classified the test data with an accuracy of 88.1%, that is, it misclassified 11.9% of it (around 3,000 movie reviews). This is a high error rate, considering that an acceptable error should be between 3% and 5%, but the model helped me overall and gave me clues for developing a new version. Along the way, I also got a first introduction to Natural Language Processing, a topic new to me.

1. The dataset: IMDB (Internet Movie Database)

References:

Maas, A., Daly, R., Pham, P., Huang, D., Ng, A., & Potts, C. (2011). Learning Word Vectors for Sentiment Analysis.

IMDB movie review sentiment classification dataset ...
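To put the error rates from the introduction in concrete terms, here is a minimal sketch that converts an error rate into a number of misclassified reviews on the 25,000-review test set. The function name `misclassified_count` and the constant `TEST_SET_SIZE` are my own illustration and are not taken from the original notebook.

```python
TEST_SET_SIZE = 25_000  # number of test reviews, as stated in the introduction


def misclassified_count(error_rate: float, n_reviews: int = TEST_SET_SIZE) -> int:
    """Return how many reviews are misclassified at the given error rate."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be a fraction between 0 and 1")
    # round() avoids off-by-one results caused by floating-point error
    return round(error_rate * n_reviews)


if __name__ == "__main__":
    # Error rate reported for this model
    print(misclassified_count(0.119))  # 2975, i.e. "around 3000" reviews
    # Target range of 3% to 5% error
    print(misclassified_count(0.03))   # 750
    print(misclassified_count(0.05))   # 1250
```

Seen this way, reaching the 3% to 5% range would mean going from roughly 3,000 wrong predictions to somewhere between 750 and 1,250.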