{"id":46662,"date":"2021-09-01T00:00:00","date_gmt":"2021-09-01T07:00:00","guid":{"rendered":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/blog\/movie-review-classification-using-nlp-griddb-and-python\/"},"modified":"2025-11-13T12:55:33","modified_gmt":"2025-11-13T20:55:33","slug":"movie-review-classification-using-nlp-griddb-and-python","status":"publish","type":"post","link":"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/","title":{"rendered":"Movie Review Classification Using NLP, GridDB, and Python"},"content":{"rendered":"<h2>Introduction<\/h2>\n<p>In this tutorial, we will classify movie reviews by sentiment using an NLP model. This is an application-based tutorial in which we will use a pre-trained LSTM model from the Allen NLP library. The outline of the tutorial is as follows:<\/p>\n<ol>\n<li>Setting up the environment<\/li>\n<li>All about the Dataset<\/li>\n<li>Data Preprocessing<\/li>\n<li>Loading the Allen NLP model<\/li>\n<li>Making predictions<\/li>\n<li>Evaluating the results<\/li>\n<\/ol>\n<p>The full Jupyter notebook is available on our <a href=\"https:\/\/github.com\/griddbnet\/Blogs\/blob\/main\/Movie%20Review%20Classification%20Using%20NLP%2C%20GridDB%2C%20and%20Python\/Movie%20Review%20classification%20using%20NLP%2C%20GridDB%20and%20Python.ipynb\">GitHub page<\/a>.<\/p>\n<h2>Setting up the environment<\/h2>\n<p>This tutorial was carried out in Jupyter Notebook (Anaconda version 4.8.3) with Python 3.8 on the Windows 10 operating system. 
The following packages need to be installed before you continue with the code:<\/p>\n<ol>\n<li><a href=\"https:\/\/pandas.pydata.org\/docs\/getting_started\/install.html\">Pandas<\/a><\/li>\n<li><a href=\"https:\/\/pypi.org\/project\/allennlp\/\">allennlp<\/a><\/li>\n<li><a href=\"https:\/\/pypi.org\/project\/allennlp-models\/\">allennlp-models<\/a><\/li>\n<li><a href=\"https:\/\/pypi.org\/project\/nltk\/\">nltk<\/a><\/li>\n<li><a href=\"https:\/\/pypi.org\/project\/scikit-learn\/\">scikit-learn<\/a><\/li>\n<\/ol>\n<p>You can install these packages using <code>pip<\/code> or <code>conda<\/code>. Simply type <code>pip install package-name<\/code> or <code>conda install package-name<\/code> in the command line.<\/p>\n<p>To access <a href=\"https:\/\/github.com\/griddb\/python_client\">GridDB&#8217;s database through Python<\/a>, the following packages are also required:<\/p>\n<ol>\n<li>GridDB C-client<\/li>\n<li>SWIG (Simplified Wrapper and Interface Generator)<\/li>\n<li>GridDB Python-client<\/li>\n<\/ol>\n<h2>All About the Dataset<\/h2>\n<p>We are using the IMDB Sentiment Analysis Dataset, which is publicly available on <a href=\"https:\/\/www.kaggle.com\/columbine\/imdb-dataset-sentiment-analysis-in-csv-format\/version\/1\">Kaggle<\/a>. The format of the dataset is simple &#8211; it has 2 attributes:<\/p>\n<ol>\n<li>Movie Review (string)<\/li>\n<li>Sentiment Label (int) &#8211; Binary<\/li>\n<\/ol>\n<p>A label &#8216;0&#8217; represents a negative movie review, whereas &#8216;1&#8217; represents a positive movie review. Since we will be using a pre-trained model, there is no need to download the training and validation datasets. We will use only the test dataset, which has 5000 instances. 
Once you download the dataset, put it in your working directory.<\/p>\n<p>Now let&#8217;s go ahead and load the dataset into our Python environment.<\/p>\n<h3>Loading the Data<\/h3>\n<p>GridDB makes it easy to work with data: we can query the database directly through its Python client and load the result as a pandas DataFrame.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">import griddb_python as griddb\nimport pandas as pd\n\nsql_statement = ('SELECT * FROM movie_review_test')\nmovie_review_test = pd.read_sql_query(sql_statement, cont)<\/code><\/pre>\n<\/div>\n<p>The <code>cont<\/code> variable holds the information for the container in which your data is stored. A detailed <a href=\"https:\/\/griddb.net\/en\/blog\/using-pandas-dataframes-with-griddb\/\">tutorial on reading and writing to GridDB using Pandas<\/a> is available on the blog.<\/p>\n<p>Alternatively, if you have the CSV file, you can use the <code>read_csv()<\/code> function of pandas. The outcome will be the same in both scenarios.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">import pandas as pd\n\nmovie_review_test = pd.read_csv(\"movie_review_test.csv\")<\/code><\/pre>\n<\/div>\n<p>Let&#8217;s print out the first five rows to get a sneak peek at our data.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">movie_review_test.head()<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          text\n        <\/th>\n<th>\n          label\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          I always wrote this series off as being a comp&#8230;\n        
<\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          1st watched 12\/7\/2002 &#8211; 3 out of 10(Dir-Steve &#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          2\n        <\/th>\n<td>\n          This movie was so poorly written and directed &#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          3\n        <\/th>\n<td>\n          The most interesting thing about Miryang (Secr&#8230;\n        <\/td>\n<td>\n          1\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          4\n        <\/th>\n<td>\n          when i first read about &#8220;berlin am meer&#8221; i did&#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">len(movie_review_test)<\/code><\/pre>\n<\/div>\n<pre><code>5000\n<\/code><\/pre>\n<h2>Data Preprocessing<\/h2>\n<p>Data Preprocessing is an important step to avoid getting any unexpected behaviour from the machine learning model. Null values or missing values tend to mess with the overall results if not dealt with properly. Let&#8217;s see if our data contains any null values.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">movie_review_test.isna().sum()<\/code><\/pre>\n<\/div>\n<pre><code>text     0\nlabel    0\ndtype: int64\n<\/code><\/pre>\n<p>Great! Fortunately, we have zero null\/missing values in our test dataset. However, if you do encounter null values, consider dropping them or replacing them before moving further.<\/p>\n<h3>Removing Punctuation and Stop Words<\/h3>\n<p>Punctuation and stop words only increase the total word limit of a text. They do not contribute to model learning and serve majorly as noise. It is, therefore, important to remove those before the training step. 
In our case, although there is no training step, we still want to make sure that the input we&#8217;re providing is valid and appropriate. You can extend this step to the training dataset as well.<\/p>\n<p>Various libraries provide a list of stop words. We&#8217;ll be using the nltk library for this task. Note that the list of stop words varies from package to package. You might get a slightly different result if you&#8217;re using some other library, say spaCy.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">from nltk.corpus import stopwords\nimport nltk\n# Fetch the stop word corpus if it is not already available locally\nnltk.download('stopwords')<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">stop = stopwords.words('english')<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">len(stop)<\/code><\/pre>\n<\/div>\n<pre><code>179\n<\/code><\/pre>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">type(stop)<\/code><\/pre>\n<\/div>\n<pre><code>list\n<\/code><\/pre>\n<p>We now have a list of 179 stop words. You can add custom words to the list as well. In fact, let&#8217;s go ahead and add a couple of words to the list.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">extra_words = ['Yeah', 'Okay']\nfor word in extra_words:\n    if word not in stop:\n        stop.append(word)<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">len(stop)<\/code><\/pre>\n<\/div>\n<pre><code>181\n<\/code><\/pre>\n<p>Alternatively, you can use <code>extend()<\/code> to append all the items of the list at once. 
The <code>if<\/code> condition inside the <code>for<\/code> loop just makes sure we&#8217;re not adding the same word twice.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">movie_review_test['text'] = movie_review_test['text'].apply(lambda words: ' '.join(word for word in words.split() if word not in stop))<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">movie_review_test.head()<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          text\n        <\/th>\n<th>\n          label\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          I always wrote series complete stink-fest Jim &#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          1st watched 12\/7\/2002 &#8211; 3 10(Dir-Steve Purcell&#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          2\n        <\/th>\n<td>\n          This movie poorly written directed I fell asle&#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          3\n        <\/th>\n<td>\n          The interesting thing Miryang (Secret Sunshine&#8230;\n        <\/td>\n<td>\n          1\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          4\n        <\/th>\n<td>\n          first read &#8220;berlin meer&#8221; expect much. thought &#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>As we can see, stop words such as &#8216;this&#8217;, &#8216;off&#8217;, and &#8216;as&#8217; have been removed. Note that the comparison is case-sensitive, so capitalized tokens such as &#8216;I&#8217; and &#8216;This&#8217; are still present at this stage. 
Let&#8217;s go ahead and lowercase the text and remove the punctuation as well.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\"># Lowercase the text so that casing differences do not matter\nmovie_review_test['text'] = movie_review_test['text'].str.lower()\n# Drop every character that is neither a word character nor whitespace\nmovie_review_test['text'] = movie_review_test['text'].str.replace(r'[^\\w\\s]', '', regex=True)<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">movie_review_test.head()<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          text\n        <\/th>\n<th>\n          label\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          i always wrote series complete stinkfest jim b&#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          1st watched 1272002 3 10dirsteve purcell typi&#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          2\n        <\/th>\n<td>\n          this movie poorly written directed i fell asle&#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          3\n        <\/th>\n<td>\n          the interesting thing miryang secret sunshine &#8230;\n        <\/td>\n<td>\n          1\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          4\n        <\/th>\n<td>\n          first read berlin meer expect much thought rig&#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>Now that our data is ready to be used, let&#8217;s load up our model and start making some predictions!<\/p>\n<h2>Loading the Allen NLP Model<\/h2>\n<p>Allen NLP has made available a lot of machine learning models targeting 
different problem statements. We will be using the <a href=\"https:\/\/paperswithcode.com\/model\/glove-lstm\">GloVe-LSTM binary classifier<\/a> for our movie review dataset. According to the official documentation, the model achieved an overall accuracy of 87% on the <a href=\"https:\/\/nlp.stanford.edu\/sentiment\/treebank.html\">Stanford Sentiment Treebank<\/a>. A <a href=\"https:\/\/demo.allennlp.org\/sentiment-analysis\/glove-sentiment-analysis\">live demo<\/a> of the model is available on the official Allen NLP website.<\/p>\n<p>Let&#8217;s go ahead and load our predictor.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">from allennlp.predictors.predictor import Predictor\nimport allennlp_models.tagging<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">predictor = Predictor.from_path(\"https:\/\/storage.googleapis.com\/allennlp-public-models\/basic_stanford_sentiment_treebank-2020.06.09.tar.gz\")<\/code><\/pre>\n<\/div>\n<pre><code>error loading _jsonnet (this is expected on Windows), treating C:\\Users\\SHRIPR~2\\AppData\\Local\\Temp\\tmpfjmtd8u3\\config.json as plain json\n<\/code><\/pre>\n<p>Note that these models can be heavy. If you have a GPU-enabled system, simply pass the argument <code>cuda_device=0<\/code> to the <code>Predictor.from_path()<\/code> call above.<\/p>\n<p>To check that the predictor works, let&#8217;s pass a sample review and see what kind of output we get.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">sample_review = \"This movie was so great. I laughed and cried, a lot!\"<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">predictor.predict(sample_review)<\/code><\/pre>\n<\/div>\n<pre><code>'0'\n<\/code><\/pre>\n<p>The predictor returns a dictionary with 5 keys &#8211; <code>logits<\/code>, <code>probs<\/code>, <code>token_ids<\/code>, <code>label<\/code>, and <code>tokens<\/code>. 
Since we know the sample review is a positive one, we can say that the model correctly returned a <code>label '1'<\/code>.<\/p>\n<p>In addition to the label, the <code>probs<\/code> list tells us the confidence score, or probability, of each label (0 or 1 in our case). The first item of the <code>probs<\/code> list, i.e. the probability of label &#8216;1&#8217;, is 0.98 (or 98%), which implies that the model was 98% confident that the review was positive.<\/p>\n<p>Now that we know the predictor is working, it is time to make some predictions.<\/p>\n<h2>Making Predictions<\/h2>\n<p>We&#8217;ll define a predict function that takes a movie review and returns the label as an integer. Note that the original labels are of type <code>int<\/code>. It&#8217;ll be easier to compare the actual and predicted values if they&#8217;re of the same data type.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">def predict_review(movie_review):\n    # The predictor returns the label as a string, so cast it to int to match the dataset labels\n    return int(predictor.predict(movie_review)['label'])<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">movie_review_test['predicted_label'] = movie_review_test['text'].apply(predict_review)<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">movie_review_test.head()<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          text\n        <\/th>\n<th>\n          label\n        <\/th>\n<th>\n          predicted_label\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          I always wrote this series off as being a comp&#8230;\n        <\/td>\n<td>\n      
    0\n        <\/td>\n<td>\n          1\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          1st watched 12\/7\/2002 &#8211; 3 out of 10(Dir-Steve &#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          2\n        <\/th>\n<td>\n          This movie was so poorly written and directed &#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          3\n        <\/th>\n<td>\n          The most interesting thing about Miryang (Secr&#8230;\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          1\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          4\n        <\/th>\n<td>\n          when i first read about &#8220;berlin am meer&#8221; i did&#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          1\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>Now we simply need to calculate the accuracy of our model. The prediction cell took 6 minutes to execute for 5000 instances because it was running on a CPU and these models can be heavy. If you plan to run the code on larger datasets, consider using a GPU.<\/p>\n<h2>Evaluating the Results<\/h2>\n<p>Allen NLP has its own set of metrics for evaluation. For the sake of simplicity, we&#8217;ll be using the scikit-learn library. 
You can find more information on Allen NLP metrics <a href=\"http:\/\/docs.allennlp.org\/v0.9.0\/api\/allennlp.training.metrics.html\">here<\/a>.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">from sklearn.metrics import accuracy_score<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">actual = movie_review_test['label']\npredicted = movie_review_test['predicted_label']<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">accuracy = accuracy_score(actual, predicted)<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">accuracy<\/code><\/pre>\n<\/div>\n<pre><code>0.7208\n<\/code><\/pre>\n<p>Our model has an overall accuracy of 72% on the test dataset. That&#8217;s decent for starters, right? You can save the predictions to a CSV file by calling the DataFrame&#8217;s <code>to_csv()<\/code> method, i.e. <code>movie_review_test.to_csv(file_path)<\/code>. Go ahead and try the code for yourself.<\/p>\n<p>Happy coding!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction In this tutorial, we will be classifying movie reviews based on sentimental analysis using an NLP Model. This is an application-based tutorial where we will be using a pre-trained LSTM model from the Allen NLP library. 
The outline of the tutorial is as follows: Setting up the environment All about the Dataset Data Preprocessing [&hellip;]<\/p>\n","protected":false},"author":41,"featured_media":27741,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[121],"tags":[],"class_list":["post-46662","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.1.1 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Movie Review Classification Using NLP, GridDB, and Python | GridDB: Open Source Time Series Database for IoT<\/title>\n<meta name=\"description\" content=\"Introduction In this tutorial, we will be classifying movie reviews based on sentimental analysis using an NLP Model. This is an application-based\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Movie Review Classification Using NLP, GridDB, and Python | GridDB: Open Source Time Series Database for IoT\" \/>\n<meta property=\"og:description\" content=\"Introduction In this tutorial, we will be classifying movie reviews based on sentimental analysis using an NLP Model. 
This is an application-based\" \/>\n<meta property=\"og:url\" content=\"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/\" \/>\n<meta property=\"og:site_name\" content=\"GridDB: Open Source Time Series Database for IoT\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/griddbcommunity\/\" \/>\n<meta property=\"article:published_time\" content=\"2021-09-01T07:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-11-13T20:55:33+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/griddb.net\/wp-content\/uploads\/2021\/08\/architecture.jpeg\" \/>\n\t<meta property=\"og:image:width\" content=\"1999\" \/>\n\t<meta property=\"og:image:height\" content=\"1333\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"griddb-admin\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@GridDBCommunity\" \/>\n<meta name=\"twitter:site\" content=\"@GridDBCommunity\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"griddb-admin\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/\"},\"author\":{\"name\":\"griddb-admin\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233\"},\"headline\":\"Movie Review Classification Using NLP, GridDB, and Python\",\"datePublished\":\"2021-09-01T07:00:00+00:00\",\"dateModified\":\"2025-11-13T20:55:33+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/\"},\"wordCount\":1214,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/griddb.net\/en\/#organization\"},\"image\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/#primaryimage\"},\"thumbnailUrl\":\"\/wp-content\/uploads\/2021\/08\/architecture.jpeg\",\"articleSection\":[\"Blog\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/\",\"url\":\"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/\",\"name\":\"Movie Review Classification Using NLP, GridDB, and Python | GridDB: Open Source Time Series Database for 
IoT\",\"isPartOf\":{\"@id\":\"https:\/\/griddb.net\/en\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/#primaryimage\"},\"thumbnailUrl\":\"\/wp-content\/uploads\/2021\/08\/architecture.jpeg\",\"datePublished\":\"2021-09-01T07:00:00+00:00\",\"dateModified\":\"2025-11-13T20:55:33+00:00\",\"description\":\"Introduction In this tutorial, we will be classifying movie reviews based on sentimental analysis using an NLP Model. This is an application-based\",\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/#primaryimage\",\"url\":\"\/wp-content\/uploads\/2021\/08\/architecture.jpeg\",\"contentUrl\":\"\/wp-content\/uploads\/2021\/08\/architecture.jpeg\",\"width\":1999,\"height\":1333},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/griddb.net\/en\/#website\",\"url\":\"https:\/\/griddb.net\/en\/\",\"name\":\"GridDB: Open Source Time Series Database for IoT\",\"description\":\"GridDB is an open source time-series database with the performance of NoSQL and convenience of 
SQL\",\"publisher\":{\"@id\":\"https:\/\/griddb.net\/en\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/griddb.net\/en\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/griddb.net\/en\/#organization\",\"name\":\"Fixstars\",\"url\":\"https:\/\/griddb.net\/en\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png\",\"contentUrl\":\"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png\",\"width\":200,\"height\":83,\"caption\":\"Fixstars\"},\"image\":{\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/griddbcommunity\/\",\"https:\/\/x.com\/GridDBCommunity\",\"https:\/\/www.linkedin.com\/company\/griddb-by-toshiba\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233\",\"name\":\"griddb-admin\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g\",\"caption\":\"griddb-admin\"},\"url\":\"https:\/\/griddb.net\/en\/author\/griddb-admin\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Movie Review Classification Using NLP, GridDB, and Python | GridDB: Open Source Time Series Database for IoT","description":"Introduction In this tutorial, we will be classifying movie reviews based on sentimental analysis using an NLP Model. This is an application-based","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/","og_locale":"en_US","og_type":"article","og_title":"Movie Review Classification Using NLP, GridDB, and Python | GridDB: Open Source Time Series Database for IoT","og_description":"Introduction In this tutorial, we will be classifying movie reviews based on sentimental analysis using an NLP Model. This is an application-based","og_url":"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/","og_site_name":"GridDB: Open Source Time Series Database for IoT","article_publisher":"https:\/\/www.facebook.com\/griddbcommunity\/","article_published_time":"2021-09-01T07:00:00+00:00","article_modified_time":"2025-11-13T20:55:33+00:00","og_image":[{"width":1999,"height":1333,"url":"https:\/\/griddb.net\/wp-content\/uploads\/2021\/08\/architecture.jpeg","type":"image\/jpeg"}],"author":"griddb-admin","twitter_card":"summary_large_image","twitter_creator":"@GridDBCommunity","twitter_site":"@GridDBCommunity","twitter_misc":{"Written by":"griddb-admin","Est. 
reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/#article","isPartOf":{"@id":"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/"},"author":{"name":"griddb-admin","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233"},"headline":"Movie Review Classification Using NLP, GridDB, and Python","datePublished":"2021-09-01T07:00:00+00:00","dateModified":"2025-11-13T20:55:33+00:00","mainEntityOfPage":{"@id":"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/"},"wordCount":1214,"commentCount":0,"publisher":{"@id":"https:\/\/griddb.net\/en\/#organization"},"image":{"@id":"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2021\/08\/architecture.jpeg","articleSection":["Blog"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/","url":"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/","name":"Movie Review Classification Using NLP, GridDB, and Python | GridDB: Open Source Time Series Database for 
IoT","isPartOf":{"@id":"https:\/\/griddb.net\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/#primaryimage"},"image":{"@id":"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2021\/08\/architecture.jpeg","datePublished":"2021-09-01T07:00:00+00:00","dateModified":"2025-11-13T20:55:33+00:00","description":"Introduction In this tutorial, we will be classifying movie reviews based on sentimental analysis using an NLP Model. This is an application-based","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/blog\/movie-review-classification-using-nlp-griddb-and-python\/#primaryimage","url":"\/wp-content\/uploads\/2021\/08\/architecture.jpeg","contentUrl":"\/wp-content\/uploads\/2021\/08\/architecture.jpeg","width":1999,"height":1333},{"@type":"WebSite","@id":"https:\/\/griddb.net\/en\/#website","url":"https:\/\/griddb.net\/en\/","name":"GridDB: Open Source Time Series Database for IoT","description":"GridDB is an open source time-series database with the performance of NoSQL and convenience of 
SQL","publisher":{"@id":"https:\/\/griddb.net\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/griddb.net\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/griddb.net\/en\/#organization","name":"Fixstars","url":"https:\/\/griddb.net\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/","url":"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png","contentUrl":"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png","width":200,"height":83,"caption":"Fixstars"},"image":{"@id":"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/griddbcommunity\/","https:\/\/x.com\/GridDBCommunity","https:\/\/www.linkedin.com\/company\/griddb-by-toshiba"]},{"@type":"Person","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233","name":"griddb-admin","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g","caption":"griddb-admin"},"url":"https:\/\/griddb.net\/en\/author\/griddb-admin\/"}]}},"_links":{"self":[{"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/posts\/46662","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/users\/41"}],"replies":[{"embeddable
":true,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/comments?post=46662"}],"version-history":[{"count":1,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/posts\/46662\/revisions"}],"predecessor-version":[{"id":51337,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/posts\/46662\/revisions\/51337"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/media\/27741"}],"wp:attachment":[{"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/media?parent=46662"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/categories?post=46662"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/tags?post=46662"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}