{"id":46692,"date":"2022-03-23T00:00:00","date_gmt":"2022-03-23T07:00:00","guid":{"rendered":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/"},"modified":"2025-11-13T12:55:52","modified_gmt":"2025-11-13T20:55:52","slug":"modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb","status":"publish","type":"post","link":"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/","title":{"rendered":"Modelling S&#038;P 500 Index Price Based on U.S. Economic Indicators using Python and GridDB"},"content":{"rendered":"<p>Economic indicators have been used in numerous studies to forecast stock prices using well-known statistical methods. The rising power of computers since 2010 has driven the development of new machine learning methods, and it is interesting to ask whether these algorithms can predict the stock market. The purpose of this article is to model the monthly price of the S&amp;P 500 index based on U.S. economic indicators, using GridDB to extract the data, followed by performing the statistical tests and finally building the machine learning model.<\/p>\n<p>The outline of the tutorial is as follows:<\/p>\n<ol>\n<li>Prerequisites and Environment setup<\/li>\n<li>Dataset overview<\/li>\n<li>Importing required libraries<\/li>\n<li>Loading the dataset<\/li>\n<li>Exploratory Data Analysis &amp; Feature Selection<\/li>\n<li>Building and Training a Machine Learning Model<\/li>\n<li>Conclusion<\/li>\n<\/ol>\n<h1>Prerequisites and Environment setup<\/h1>\n<p>This tutorial is carried out in Anaconda Navigator (Python version \u2013 3.8.3) on the Windows Operating System. 
The following packages need to be installed before you continue with the tutorial \u2013<\/p>\n<ol>\n<li>\n<p>Pandas<\/p>\n<\/li>\n<li>\n<p>NumPy<\/p>\n<\/li>\n<li>\n<p>Scikit-learn<\/p>\n<\/li>\n<li>\n<p>Matplotlib<\/p>\n<\/li>\n<li>\n<p>Statsmodels<\/p>\n<\/li>\n<li>\n<p>griddb_python<\/p>\n<\/li>\n<\/ol>\n<p>You can install these packages in Conda\u2019s virtual environment using <code>conda install package-name<\/code>. In case you are using Python directly via the terminal\/command prompt, <code>pip install package-name<\/code> will do the job.<\/p>\n<h3>GridDB installation<\/h3>\n<p>While loading the dataset, this tutorial will cover two methods \u2013 using GridDB as well as using Pandas. To access GridDB using Python, the following packages also need to be installed beforehand:<\/p>\n<ol>\n<li><a href=\"https:\/\/github.com\/griddb\/c_client\">GridDB C-client<\/a><\/li>\n<li>SWIG (Simplified Wrapper and Interface Generator)<\/li>\n<li><a href=\"https:\/\/github.com\/griddb\/python_client\">GridDB Python Client<\/a><\/li>\n<\/ol>\n<h1>1&#46; Dataset Overview<\/h1>\n<p>The indicators that are recognised to have the most significant impact on stock market returns in general, and on the S&amp;P 500 in particular, can be attributed to the following categories: general macroeconomic indicators, labour market indicators (unemployment rate and jobs reports), real estate indicators, credit market indicators, monetary supply indicators, consumer (household) financial behaviour indicators and commodity market indicators.<\/p>\n<p>The S&amp;P 500 index close price is modelled in this tutorial. The analysis period runs from 1970-01-01 to 2018-04-01, and both the S&amp;P 500 index close price and the U.S. economic indicators have a monthly frequency.<\/p>\n<p>Feature Description:<\/p>\n<p>1) <code>SP500<\/code> &#8211; Price of the S&amp;P 500 index in the respective month (Units: U.S. 
Dollars)<\/p>\n<p>2) <code>INTDSRUSM193N<\/code> &#8211; Interest Rates, Discount Rate for United States (Units: Percent per Annum)<\/p>\n<p>3) <code>BUSLOANS<\/code> &#8211; Commercial and Industrial Loans, All Commercial Banks (Units: Billions of U.S. Dollars)<\/p>\n<p>4) <code>MPRIME<\/code> &#8211; Bank Prime Loan Rate (Units: Percent)<\/p>\n<p>5) <code>FEDFUNDS<\/code> &#8211; Federal Funds Effective Rate (The federal funds rate is the interest rate at which depository institutions trade federal funds with each other overnight.) (Units: Percent)<\/p>\n<p>6) <code>CURRCIR<\/code> &#8211; Currency in Circulation (Units: Billions of Dollars)<\/p>\n<p>7) <code>PSAVERT<\/code> &#8211; Personal Saving Rate (Personal saving as a percentage of disposable personal income (DPI), frequently referred to as &#8220;the personal saving rate,&#8221; is calculated as the ratio of personal saving to DPI.) (Units: Percent)<\/p>\n<p>8) <code>PERMIT<\/code> &#8211; New Privately-Owned Housing Units Authorized in Permit-Issuing Places: Total Units (Units: Thousands of Units)<\/p>\n<p>9) <code>INDPRO<\/code> &#8211; Industrial Production: Total Index (The Industrial Production Index (INDPRO) is an economic indicator that measures real output for all facilities located in the United States: manufacturing, mining, and electric and gas utilities) (Units: Index 2017=100)<\/p>\n<p>10) <code>PMSAVE<\/code> &#8211; Personal Saving (Units: Billions of Dollars)<\/p>\n<p>11) <code>DAUTOSAAR<\/code> &#8211; Motor Vehicle Retail Sales: Domestic Autos (Units: Millions of Units)<\/p>\n<p>12) <code>UNEMPLOY<\/code> &#8211; Unemployment Level (Units: Thousands of Persons)<\/p>\n<p>13) <code>CPIAUCSL<\/code> &#8211; Consumer Price Index for All Urban Consumers: All Items in U.S. City Average (Units: Index 1982-1984=100)<\/p>\n<p>The dataset is available publicly and can be downloaded from the <code>finance.yahoo.com<\/code> and <code>fred.stlouisfed.org<\/code> websites. 
The obtained data were combined into one data set, which was checked for missing values.<\/p>\n<h1>2&#46; Importing Required Libraries<\/h1>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">import griddb_python as griddb\n\nimport numpy as np\nimport pandas as pd\nimport matplotlib.pyplot as plt\nimport datetime\n\nimport statsmodels.api as sm\nfrom statsmodels.stats.outliers_influence import variance_inflation_factor\n\nfrom sklearn.model_selection import train_test_split\nfrom sklearn.linear_model import LinearRegression\nfrom sklearn.ensemble import RandomForestRegressor\nfrom sklearn.metrics import r2_score\nfrom sklearn.metrics import mean_squared_error\n\nimport warnings\nwarnings.filterwarnings('ignore')\n%matplotlib inline<\/code><\/pre>\n<\/div>\n<h1>3&#46; Loading the Dataset<\/h1>\n<p>Let\u2019s proceed and load the dataset into our notebook.<\/p>\n<h2>3&#46;a Using GridDB<\/h2>\n<p>Toshiba GridDB\u2122 is a highly scalable NoSQL database best suited for IoT and Big Data. The foundation of GridDB\u2019s principles is based upon offering a versatile data store that is optimized for IoT, provides high scalability, is tuned for high performance, and ensures high reliability.<\/p>\n<p>For storing large amounts of data, a CSV file can be cumbersome. GridDB serves as a perfect alternative, as it is an open-source, highly scalable database. GridDB is a scalable, in-memory NoSQL database that makes it easy to store large amounts of data. 
If you are new to GridDB, a tutorial on <a href=\"https:\/\/griddb.net\/en\/blog\/using-pandas-dataframes-with-griddb\/\">reading and writing to GridDB<\/a> can be useful.<\/p>\n<p>Assuming that you have already set up your database, we will now write the SQL query in Python to load our dataset.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">sql_statement = ('SELECT * FROM us_economic_data')\ndataset = pd.read_sql_query(sql_statement, cont)<\/code><\/pre>\n<\/div>\n<p>Note that the <code>cont<\/code> variable has the container information where our data is stored. Replace <code>us_economic_data<\/code> with the name of your container. More info can be found in this tutorial on <a href=\"https:\/\/griddb.net\/en\/blog\/using-pandas-dataframes-with-griddb\/\">reading and writing to GridDB<\/a>.<\/p>\n<p>When it comes to IoT and Big Data use cases, GridDB clearly stands out among other databases in the Relational and NoSQL space. Overall, GridDB offers multiple reliability features for mission-critical applications that require high availability and data retention.<\/p>\n<h2>3&#46;b Using Pandas<\/h2>\n<p>We can also use Pandas&#8217; <code>read_csv<\/code> function to load our data. Both methods lead to the same output, as either way the data is loaded in the form of a pandas dataframe.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">us_economic_data = pd.read_csv('us_economic_data.csv')<\/code><\/pre>\n<\/div>\n<h1>4&#46; Exploratory Data Analysis &amp; Feature Selection<\/h1>\n<p>Once the dataset is loaded, let us now explore the dataset. 
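The date handling performed later (renaming the unnamed first column, converting it to datetime, and setting it as the index) can optionally be folded into the load itself. A minimal sketch, assuming the CSV's first column holds the dates; the two-row inline CSV here is purely illustrative:

```python
import io
import pandas as pd

# Hypothetical CSV snippet mirroring the dataset's layout: an unnamed
# first column with the dates, followed by indicator columns.
csv_text = """,SP500,FEDFUNDS
1970-07-01,75.72,7.21
1970-08-01,77.92,6.62
"""

# index_col + parse_dates fold the rename / to_datetime / set_index
# steps into a single read_csv call.
df_loaded = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=[0])
df_loaded.index.name = "date"
print(df_loaded.loc["1970-07-01", "SP500"])  # → 75.72
```

With a `DatetimeIndex` in place, date-string lookups such as `df_loaded.loc["1970-07-01"]` work directly.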
We&#8217;ll print the first five rows of this dataset using the <code>head()<\/code> function.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">us_economic_data.head()<\/code><\/pre>\n<\/div>\n<div style=\"overflow-y: hidden;white-space: nowrap\">\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n        vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          date\n        <\/th>\n<th>\n          SP500\n        <\/th>\n<th>\n          INTDSRUSM193N\n        <\/th>\n<th>\n          BUSLOANS\n        <\/th>\n<th>\n          MPRIME\n        <\/th>\n<th>\n          FEDFUNDS\n        <\/th>\n<th>\n          CURRCIR\n        <\/th>\n<th>\n          PSAVERT\n        <\/th>\n<th>\n          PMSAVE\n        <\/th>\n<th>\n          DAUTOSAAR\n        <\/th>\n<th>\n          UNEMPLOY\n        <\/th>\n<th>\n          INDPRO\n        <\/th>\n<th>\n          PERMIT\n        <\/th>\n<th>\n          CPIAUCSL\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          1970-07-01\n        <\/td>\n<td>\n          75.72\n        <\/td>\n<td>\n          6.00\n        <\/td>\n<td>\n          107.6770\n        <\/td>\n<td>\n          8.00\n        <\/td>\n<td>\n          7.21\n        <\/td>\n<td>\n          54.699\n        <\/td>\n<td>\n          13.5\n        <\/td>\n<td>\n          104.0\n        <\/td>\n<td>\n          7.720\n        <\/td>\n<td>\n          4175\n        <\/td>\n<td>\n          37.8753\n        <\/td>\n<td>\n          1324.0\n        <\/td>\n<td>\n          38.9\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          1970-08-01\n        <\/td>\n<td>\n          77.92\n        <\/td>\n<td>\n          6.00\n        <\/td>\n<td>\n          
108.5407\n        <\/td>\n<td>\n          8.00\n        <\/td>\n<td>\n          6.62\n        <\/td>\n<td>\n          54.766\n        <\/td>\n<td>\n          13.4\n        <\/td>\n<td>\n          103.9\n        <\/td>\n<td>\n          7.595\n        <\/td>\n<td>\n          4256\n        <\/td>\n<td>\n          37.8077\n        <\/td>\n<td>\n          1394.0\n        <\/td>\n<td>\n          39.0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          2\n        <\/th>\n<td>\n          1970-09-01\n        <\/td>\n<td>\n          82.58\n        <\/td>\n<td>\n          6.00\n        <\/td>\n<td>\n          109.5289\n        <\/td>\n<td>\n          7.83\n        <\/td>\n<td>\n          6.29\n        <\/td>\n<td>\n          54.931\n        <\/td>\n<td>\n          12.9\n        <\/td>\n<td>\n          100.3\n        <\/td>\n<td>\n          7.763\n        <\/td>\n<td>\n          4456\n        <\/td>\n<td>\n          37.5471\n        <\/td>\n<td>\n          1426.0\n        <\/td>\n<td>\n          39.2\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          3\n        <\/th>\n<td>\n          1970-10-01\n        <\/td>\n<td>\n          84.37\n        <\/td>\n<td>\n          6.00\n        <\/td>\n<td>\n          109.7740\n        <\/td>\n<td>\n          7.50\n        <\/td>\n<td>\n          6.20\n        <\/td>\n<td>\n          55.063\n        <\/td>\n<td>\n          13.1\n        <\/td>\n<td>\n          102.3\n        <\/td>\n<td>\n          5.981\n        <\/td>\n<td>\n          4591\n        <\/td>\n<td>\n          36.7960\n        <\/td>\n<td>\n          1564.0\n        <\/td>\n<td>\n          39.4\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          4\n        <\/th>\n<td>\n          1970-11-01\n        <\/td>\n<td>\n          84.28\n        <\/td>\n<td>\n          5.85\n        <\/td>\n<td>\n          110.1744\n        <\/td>\n<td>\n          7.28\n        <\/td>\n<td>\n          5.60\n        <\/td>\n<td>\n          55.865\n        <\/td>\n<td>\n          13.6\n        <\/td>\n<td>\n          
105.8\n        <\/td>\n<td>\n          4.944\n        <\/td>\n<td>\n          4898\n        <\/td>\n<td>\n          36.5732\n        <\/td>\n<td>\n          1502.0\n        <\/td>\n<td>\n          39.6\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">df = us_economic_data<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\"># Changing the column name and then setting the date as an index of the dataframe\ndf.rename(columns = {'Unnamed: 0':'date'}, inplace = True)\ndf['date'] = pd.to_datetime(df['date'])\ndf = df.set_index('date')<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">df.head()<\/code><\/pre>\n<\/div>\n<div style=\"overflow-y: hidden;white-space: nowrap\">\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n        vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          SP500\n        <\/th>\n<th>\n          INTDSRUSM193N\n        <\/th>\n<th>\n          BUSLOANS\n        <\/th>\n<th>\n          MPRIME\n        <\/th>\n<th>\n          FEDFUNDS\n        <\/th>\n<th>\n          CURRCIR\n        <\/th>\n<th>\n          PSAVERT\n        <\/th>\n<th>\n          PMSAVE\n        <\/th>\n<th>\n          DAUTOSAAR\n        <\/th>\n<th>\n          UNEMPLOY\n        <\/th>\n<th>\n          INDPRO\n        <\/th>\n<th>\n          PERMIT\n        <\/th>\n<th>\n          CPIAUCSL\n        <\/th>\n<\/tr>\n<tr>\n<th>\n          date\n        <\/th>\n<th>\n        <\/th>\n<th>\n        <\/th>\n<th>\n        <\/th>\n<th>\n        <\/th>\n<th>\n        <\/th>\n<th>\n        <\/th>\n<th>\n        <\/th>\n<th>\n        <\/th>\n<th>\n        <\/th>\n<th>\n        
<\/th>\n<th>\n        <\/th>\n<th>\n        <\/th>\n<th>\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          1970-07-01\n        <\/th>\n<td>\n          75.72\n        <\/td>\n<td>\n          6.00\n        <\/td>\n<td>\n          107.6770\n        <\/td>\n<td>\n          8.00\n        <\/td>\n<td>\n          7.21\n        <\/td>\n<td>\n          54.699\n        <\/td>\n<td>\n          13.5\n        <\/td>\n<td>\n          104.0\n        <\/td>\n<td>\n          7.720\n        <\/td>\n<td>\n          4175\n        <\/td>\n<td>\n          37.8753\n        <\/td>\n<td>\n          1324.0\n        <\/td>\n<td>\n          38.9\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1970-08-01\n        <\/th>\n<td>\n          77.92\n        <\/td>\n<td>\n          6.00\n        <\/td>\n<td>\n          108.5407\n        <\/td>\n<td>\n          8.00\n        <\/td>\n<td>\n          6.62\n        <\/td>\n<td>\n          54.766\n        <\/td>\n<td>\n          13.4\n        <\/td>\n<td>\n          103.9\n        <\/td>\n<td>\n          7.595\n        <\/td>\n<td>\n          4256\n        <\/td>\n<td>\n          37.8077\n        <\/td>\n<td>\n          1394.0\n        <\/td>\n<td>\n          39.0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1970-09-01\n        <\/th>\n<td>\n          82.58\n        <\/td>\n<td>\n          6.00\n        <\/td>\n<td>\n          109.5289\n        <\/td>\n<td>\n          7.83\n        <\/td>\n<td>\n          6.29\n        <\/td>\n<td>\n          54.931\n        <\/td>\n<td>\n          12.9\n        <\/td>\n<td>\n          100.3\n        <\/td>\n<td>\n          7.763\n        <\/td>\n<td>\n          4456\n        <\/td>\n<td>\n          37.5471\n        <\/td>\n<td>\n          1426.0\n        <\/td>\n<td>\n          39.2\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1970-10-01\n        <\/th>\n<td>\n          84.37\n        <\/td>\n<td>\n          6.00\n        <\/td>\n<td>\n          109.7740\n        <\/td>\n<td>\n          7.50\n        
<\/td>\n<td>\n          6.20\n        <\/td>\n<td>\n          55.063\n        <\/td>\n<td>\n          13.1\n        <\/td>\n<td>\n          102.3\n        <\/td>\n<td>\n          5.981\n        <\/td>\n<td>\n          4591\n        <\/td>\n<td>\n          36.7960\n        <\/td>\n<td>\n          1564.0\n        <\/td>\n<td>\n          39.4\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1970-11-01\n        <\/th>\n<td>\n          84.28\n        <\/td>\n<td>\n          5.85\n        <\/td>\n<td>\n          110.1744\n        <\/td>\n<td>\n          7.28\n        <\/td>\n<td>\n          5.60\n        <\/td>\n<td>\n          55.865\n        <\/td>\n<td>\n          13.6\n        <\/td>\n<td>\n          105.8\n        <\/td>\n<td>\n          4.944\n        <\/td>\n<td>\n          4898\n        <\/td>\n<td>\n          36.5732\n        <\/td>\n<td>\n          1502.0\n        <\/td>\n<td>\n          39.6\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>We performed the Augmented Dickey-Fuller test to determine whether the S&amp;P 500 close price is stationary. We then plotted the closing price against each independent variable to check for a linear relationship, so that only the variables showing one would be selected.<\/p>\n<p>Next, we checked whether the independent variables correlate with each other using the VIF test. 
Finally, because the indicators are measured on different scales, the data were normalized using the z-score criterion to produce an accurate and reliable model.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">def tsplot(y, figsize=(12, 7), style='bmh'):\n    \n    if not isinstance(y, pd.Series):\n        y = pd.Series(y)\n        \n    with plt.style.context(style=style):\n        fig = plt.figure(figsize=figsize)\n        layout = (2,1)\n        ts_ax = plt.subplot2grid(layout, (0,0), colspan=2)\n        y.plot(ax=ts_ax)\n        p_value = sm.tsa.stattools.adfuller(y)[1]\n        ts_ax.set_title('Time Series Analysis Plots\\n Dickey-Fuller: p={0:.5f}'.format(p_value))\n\n        plt.tight_layout()\n        \ntsplot(df['SP500'])\n\n# first difference of the close price\ndata_diff = df['SP500'] - df['SP500'].shift(1)\n\ntsplot(data_diff[1:])<\/code><\/pre>\n<\/div>\n<p><a href=\"https:\/\/griddb.net\/wp-content\/uploads\/2022\/03\/output_25_0.png\"><img fetchpriority=\"high\" decoding=\"async\" src=\"https:\/\/griddb.net\/wp-content\/uploads\/2022\/03\/output_25_0.png\" alt=\"\" width=\"856\" height=\"282\" class=\"aligncenter size-full wp-image-28084\" srcset=\"\/wp-content\/uploads\/2022\/03\/output_25_0.png 856w, \/wp-content\/uploads\/2022\/03\/output_25_0-300x99.png 300w, \/wp-content\/uploads\/2022\/03\/output_25_0-768x253.png 768w, \/wp-content\/uploads\/2022\/03\/output_25_0-600x198.png 600w\" sizes=\"(max-width: 856px) 100vw, 856px\" \/><\/a><\/p>\n<p><a href=\"https:\/\/griddb.net\/wp-content\/uploads\/2022\/03\/output_25_1.png\"><img decoding=\"async\" src=\"https:\/\/griddb.net\/wp-content\/uploads\/2022\/03\/output_25_1.png\" alt=\"\" width=\"856\" height=\"282\" class=\"aligncenter size-full wp-image-28084\" \/><\/a><\/p>\n<p>Taking 5% as the significance level for the Dickey-Fuller test, we can see that the p-value is much greater than 0.05, so the S&amp;P 500 series is not stationary.<\/p>\n<p>To make the data stationary, it was necessary to create a new dependent variable, 
which was calculated as the current monthly price minus the previous month\u2019s price. The tests mentioned earlier were then recalculated, and the differenced data were found to be stationary, since the p-value comes out to be approximately 0.<\/p>\n<p>The next step is to verify that the independent variables do not correlate with each other. The VIF test was chosen for this purpose. The Variance Inflation Factor (VIF) is used to test for the presence of multicollinearity in a regression model.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">df.columns<\/code><\/pre>\n<\/div>\n<pre><code>Index(['SP500', 'INTDSRUSM193N', 'BUSLOANS', 'MPRIME', 'FEDFUNDS', 'CURRCIR',\n       'PSAVERT', 'PMSAVE', 'DAUTOSAAR', 'UNEMPLOY', 'INDPRO', 'PERMIT',\n       'CPIAUCSL'],\n      dtype='object')\n<\/code><\/pre>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\"># the independent variables set\nX = df[['INTDSRUSM193N', 'BUSLOANS', 'MPRIME', 'FEDFUNDS', 'CURRCIR',\n       'PSAVERT', 'PMSAVE', 'DAUTOSAAR', 'UNEMPLOY', 'INDPRO', 'PERMIT',\n       'CPIAUCSL']]\n  \n# VIF dataframe\nvif_data = pd.DataFrame()\nvif_data[\"feature\"] = X.columns\n  \n# calculating VIF for each feature\nvif_data[\"VIF\"] = [variance_inflation_factor(X.values, i)\n                          for i in range(len(X.columns))]\n  \nprint(vif_data)<\/code><\/pre>\n<\/div>\n<pre><code>          feature         VIF\n0   INTDSRUSM193N  130.622837\n1        BUSLOANS  147.284028\n2          MPRIME  399.916868\n3        FEDFUNDS  145.507021\n4         CURRCIR  112.882897\n5         PSAVERT   63.436200\n6          PMSAVE   68.862740\n7       DAUTOSAAR   60.502647\n8        UNEMPLOY   54.757359\n9          INDPRO  464.500794\n10         PERMIT   34.696174\n11       CPIAUCSL  897.072035\n<\/code><\/pre>\n<p>From the table it can be seen that the VIF values for each variable are well above 10. In this case, the indicator with the highest VIF value is removed and the whole set is recalculated. 
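This remove-and-recalculate procedure can be automated with a small helper. A minimal sketch, assuming a VIF threshold of 10; the `drop_high_vif` function and the synthetic demo data are illustrative and not part of the original notebook:

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor


def drop_high_vif(X, threshold=10.0):
    """Repeatedly drop the feature with the highest VIF until all VIFs <= threshold."""
    X = X.copy()
    while X.shape[1] > 1:
        vifs = pd.Series(
            [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
            index=X.columns,
        )
        if vifs.max() <= threshold:
            break
        X = X.drop(columns=[vifs.idxmax()])  # remove the worst offender, then recompute
    return X


# Synthetic demo: 'a' and 'b' are nearly collinear, 'c' is independent,
# so the loop should discard one of 'a'/'b' and keep the rest.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
demo = pd.DataFrame({"a": a, "b": 2 * a + rng.normal(scale=0.01, size=200),
                     "c": rng.normal(size=200)})
print(drop_high_vif(demo).columns.tolist())
```

The same helper applied to the tutorial's twelve-column `X` would reproduce the manual elimination shown in the next step.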
This is repeated until only variables with VIF values below 10 remain.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\"># the independent variables set\nX = df[['FEDFUNDS', 'CURRCIR', 'PSAVERT', 'PERMIT']]\n  \n# VIF dataframe\nvif_data = pd.DataFrame()\nvif_data[\"feature\"] = X.columns\n  \n# calculating VIF for each feature\nvif_data[\"VIF\"] = [variance_inflation_factor(X.values, i)\n                          for i in range(len(X.columns))]\n  \nprint(vif_data)<\/code><\/pre>\n<\/div>\n<pre><code>    feature       VIF\n0  FEDFUNDS  5.228717\n1   CURRCIR  2.532886\n2   PSAVERT  8.150866\n3    PERMIT  7.039288\n<\/code><\/pre>\n<p>In this way, eight of the twelve indicators were eliminated, leaving the four shown above.<\/p>\n<p>The last step of our second stage was to take these four indicators together with the S&amp;P 500 index price data and normalize them using the z-score criterion.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">df_new = df[['SP500','FEDFUNDS','CURRCIR','PSAVERT','PERMIT']]<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">df_z_scaled = df_new.copy()\n  \n# apply normalization techniques\nfor column in df_z_scaled.columns:\n    df_z_scaled[column] = (df_z_scaled[column] -\n                           df_z_scaled[column].mean()) \/ df_z_scaled[column].std()    \n  \n# view normalized data   \ndf_z_scaled.head()<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n        vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          SP500\n        <\/th>\n<th>\n          FEDFUNDS\n        <\/th>\n<th>\n          CURRCIR\n        <\/th>\n<th>\n          PSAVERT\n        
<\/th>\n<th>\n          PERMIT\n        <\/th>\n<\/tr>\n<tr>\n<th>\n          date\n        <\/th>\n<th>\n        <\/th>\n<th>\n        <\/th>\n<th>\n        <\/th>\n<th>\n        <\/th>\n<th>\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          1970-07-01\n        <\/th>\n<td>\n          -1.019219\n        <\/td>\n<td>\n          0.500923\n        <\/td>\n<td>\n          -1.065075\n        <\/td>\n<td>\n          1.792466\n        <\/td>\n<td>\n          -0.165538\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1970-08-01\n        <\/th>\n<td>\n          -1.015924\n        <\/td>\n<td>\n          0.351568\n        <\/td>\n<td>\n          -1.064918\n        <\/td>\n<td>\n          1.758546\n        <\/td>\n<td>\n          0.002858\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1970-09-01\n        <\/th>\n<td>\n          -1.008946\n        <\/td>\n<td>\n          0.268030\n        <\/td>\n<td>\n          -1.064531\n        <\/td>\n<td>\n          1.588943\n        <\/td>\n<td>\n          0.079839\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1970-10-01\n        <\/th>\n<td>\n          -1.006266\n        <\/td>\n<td>\n          0.245247\n        <\/td>\n<td>\n          -1.064222\n        <\/td>\n<td>\n          1.656784\n        <\/td>\n<td>\n          0.411821\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1970-11-01\n        <\/th>\n<td>\n          -1.006401\n        <\/td>\n<td>\n          0.093359\n        <\/td>\n<td>\n          -1.062341\n        <\/td>\n<td>\n          1.826387\n        <\/td>\n<td>\n          0.262670\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<h1>5&#46; Machine Learning Model Building<\/h1>\n<p>Now, let&#8217;s proceed to building and evaluating machine learning models on our dataset. We&#8217;ll first create <code>features<\/code> and <code>labels<\/code> for our model and split them into train and test samples. 
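Because the observations form a monthly time series, note that `train_test_split` shuffles rows by default, which mixes future and past months; passing `shuffle=False` gives a chronological split instead. A minimal sketch on toy data (the column names merely echo the tutorial's):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-ins for the tutorial's normalized features/labels
idx = pd.date_range("1970-07-01", periods=10, freq="MS")
features = pd.DataFrame({"FEDFUNDS": range(10)}, index=idx)
labels = pd.DataFrame({"SP500": range(10)}, index=idx)

# Chronological split: the last 20% of months form the test set
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, shuffle=False
)
print(X_train.index.max() < X_test.index.min())  # → True: no future months in training
```

The tutorial below uses the default shuffled split with a fixed `random_state`, which is fine for reproducing its numbers; the chronological variant is the stricter choice when forecasting is the goal.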
Test size has been kept as 20% of the total dataset size.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">features = df_z_scaled.drop(columns = ['SP500'], axis = 1)\nlabels = df_z_scaled[['SP500']]\n\nX_train, X_test, y_train, y_test = train_test_split(features, labels, test_size = 0.2, random_state = 0)<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">print(f\"Shape of training data: {X_train.shape}\")\nprint(f\"Shape of the training target data: {y_train.shape}\")\n\nprint(f\"Shape of test data: {X_test.shape}\")\nprint(f\"Shape of the test target data: {y_test.shape}\")<\/code><\/pre>\n<\/div>\n<pre><code>Shape of training data: (459, 4)\nShape of the training target data: (459, 1)\nShape of test data: (115, 4)\nShape of the test target data: (115, 1)\n<\/code><\/pre>\n<p>We started the third stage of the study by constructing a statistical linear regression model, which is our baseline.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">regr = LinearRegression()\n  \nregr.fit(X_train, y_train)\nprint(regr.score(X_test, y_test))<\/code><\/pre>\n<\/div>\n<pre><code>0.9351836569209164\n<\/code><\/pre>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">coefficients = pd.concat([pd.DataFrame(X.columns),pd.DataFrame(np.transpose(regr.coef_))], axis = 1)\ncoefficients<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n        vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          0\n        <\/th>\n<th>\n          0\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          FEDFUNDS\n        <\/td>\n<td>\n          0.083735\n        
<\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          CURRCIR\n        <\/td>\n<td>\n          1.003270\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          2\n        <\/th>\n<td>\n          PSAVERT\n        <\/td>\n<td>\n          -0.087568\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          3\n        <\/th>\n<td>\n          PERMIT\n        <\/td>\n<td>\n          0.116120\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p><code>Currency in Circulation<\/code> has the highest coefficient, which means it has the highest impact on the model.<\/p>\n<p>After the model has been fit on our training data, we can proceed to predicting on our test set in order to evaluate the model performance. Let&#8217;s store our predictions in <code>y_pred<\/code>.<\/p>\n<p><code>Evaluation Metrics<\/code>: 1) Coefficient of determination (R^2). This is the most important indicator of how well the model fits the data, and it is reported for virtually all regression models.<\/p>\n<p>2) Mean squared error (MSE). This value measures the average squared difference between the actual and predicted values; its square root (RMSE) is on the same scale as the target.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">y_pred = regr.predict(X_test)\nmse = mean_squared_error(y_test, y_pred)\nprint(\"MSE: \", mse)\nprint(\"RMSE: \", round(mse ** 0.5, 4))\n\nr2 = r2_score(y_test, y_pred)\nprint('r2 score for the Linear Regression model is', r2)<\/code><\/pre>\n<\/div>\n<pre><code>MSE:  0.06275016232171571\nRMSE:  0.2505\nr2 score for the Linear Regression model is 0.9351836569209164\n<\/code><\/pre>\n<p><code>Random forest regression<\/code> is a tree-ensemble learning technique. Its predictions are more accurate because they average the predictions of many individual trees. 
These algorithms are also more stable, because a change in the dataset can affect one tree but rarely the whole forest.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">rfr = RandomForestRegressor()\n\nrfr.fit(X_train, y_train)\n\n# note: this is the R-squared on the training data\nscore = rfr.score(X_train, y_train)\nprint(\"R-squared:\", score)<\/code><\/pre>\n<\/div>\n<pre><code>R-squared: 0.9993703842202256\n<\/code><\/pre>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">ypred = rfr.predict(X_test)\n\nmse = mean_squared_error(y_test, ypred)\nprint(\"MSE: \", mse)\nprint(\"RMSE: \", round(mse ** 0.5, 4))\n\nr2 = r2_score(y_test, ypred)\nprint('r2 score for the Random Forest model is', r2)<\/code><\/pre>\n<\/div>\n<pre><code>MSE:  0.0031918233911353254\nRMSE:  0.0565\nr2 score for the Random Forest model is 0.9967030791266005\n<\/code><\/pre>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">plt.figure(figsize=(5,5))\nplt.scatter(y_test['SP500'].values, ypred, c='crimson')\n\np1 = max(max(ypred), max(y_test['SP500'].values))\np2 = min(min(ypred), min(y_test['SP500'].values))\nplt.plot([p1, p2], [p1, p2], 'b-')\nplt.xlabel('True Values', fontsize=15)\nplt.ylabel('Predictions', fontsize=15)\nplt.axis('equal')\nplt.show()<\/code><\/pre>\n<\/div>\n<p><a href=\"https:\/\/griddb.net\/wp-content\/uploads\/2022\/03\/output_47_0.png\"><img decoding=\"async\" src=\"https:\/\/griddb.net\/wp-content\/uploads\/2022\/03\/output_47_0.png\" alt=\"\" width=\"856\" height=\"282\" class=\"aligncenter size-full wp-image-28084\" \/><\/a><\/p>\n<h3>Comparison between baseline and Random Forest Model<\/h3>\n<p>The Random Forest model was found to be the better machine learning model because it provided the highest R2 and the lowest error. 
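Such a comparison can also be tabulated directly. A minimal sketch on synthetic data (the data and model settings here are illustrative; the metrics mirror those used above, and on this linear toy data the baseline may well win):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Illustrative data: four features with a mostly linear relationship to y
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = X @ np.array([0.1, 1.0, -0.1, 0.1]) + rng.normal(scale=0.1, size=300)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

rows = []
for name, model in [("Linear Regression", LinearRegression()),
                    ("Random Forest", RandomForestRegressor(random_state=0))]:
    pred = model.fit(X_train, y_train).predict(X_test)
    rows.append({"model": name,
                 "R2": r2_score(y_test, pred),
                 "RMSE": mean_squared_error(y_test, pred) ** 0.5})

# One row per model, same metrics side by side
print(pd.DataFrame(rows).round(4))
```

Applied to the tutorial's `X_test`/`y_test`, the same loop would reproduce the comparison summarised below.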
Compared with the statistical linear regression baseline, R2 improved by about 6% and the MSE was reduced roughly 20-fold.<\/p>\n<h1>6&#46; Conclusion<\/h1>\n<p>In this tutorial we examined 12 U.S. economic indicators that can have a significant impact on stock markets, and on the S&amp;P 500 index in particular. We examined two ways to import our data, using (1) GridDB and (2) Pandas. For large datasets, GridDB provides an excellent alternative for importing data into your notebook, as it is open-source and highly scalable. <a href=\"https:\/\/griddb.net\/en\/downloads\/\">Download GridDB<\/a> today!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Economic indicators have been used in numerous studies to forecast stock prices using well-known statistical methods. The rising power of computers since 2010 has driven the development of new machine learning methods, and it is interesting to ask whether these algorithms can predict the stock market. [&hellip;]<\/p>\n","protected":false},"author":41,"featured_media":28152,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[121],"tags":[],"class_list":["post-46692","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.1.1 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Modelling S&amp;P 500 Index Price Based on U.S. Economic Indicators using Python and GridDB | GridDB: Open Source Time Series Database for IoT<\/title>\n<meta name=\"description\" content=\"Economic indicators have been used in numerous studies to forecast stock prices using well-known statistical methods. 
The rising power of computers since\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Modelling S&amp;P 500 Index Price Based on U.S. Economic Indicators using Python and GridDB | GridDB: Open Source Time Series Database for IoT\" \/>\n<meta property=\"og:description\" content=\"Economic indicators have been used in numerous studies to forecast stock prices using well-known statistical methods. The rising power of computers since\" \/>\n<meta property=\"og:url\" content=\"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/\" \/>\n<meta property=\"og:site_name\" content=\"GridDB: Open Source Time Series Database for IoT\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/griddbcommunity\/\" \/>\n<meta property=\"article:published_time\" content=\"2022-03-23T07:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-11-13T20:55:52+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/griddb.net\/wp-content\/uploads\/2022\/03\/macbook-air-iphone-038-stocks-scaled.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"2560\" \/>\n\t<meta property=\"og:image:height\" content=\"1700\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"griddb-admin\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@GridDBCommunity\" \/>\n<meta name=\"twitter:site\" content=\"@GridDBCommunity\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" 
content=\"griddb-admin\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"12 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/\"},\"author\":{\"name\":\"griddb-admin\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233\"},\"headline\":\"Modelling S&#038;P 500 Index Price Based on U.S. Economic Indicators using Python and GridDB\",\"datePublished\":\"2022-03-23T07:00:00+00:00\",\"dateModified\":\"2025-11-13T20:55:52+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/\"},\"wordCount\":1583,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/griddb.net\/en\/#organization\"},\"image\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/#primaryimage\"},\"thumbnailUrl\":\"\/wp-content\/uploads\/2022\/03\/macbook-air-iphone-038-stocks-scaled.jpg\",\"articleSection\":[\"Blog\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/\",\"url\":\"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/\",\"name\"
:\"Modelling S&P 500 Index Price Based on U.S. Economic Indicators using Python and GridDB | GridDB: Open Source Time Series Database for IoT\",\"isPartOf\":{\"@id\":\"https:\/\/griddb.net\/en\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/#primaryimage\"},\"thumbnailUrl\":\"\/wp-content\/uploads\/2022\/03\/macbook-air-iphone-038-stocks-scaled.jpg\",\"datePublished\":\"2022-03-23T07:00:00+00:00\",\"dateModified\":\"2025-11-13T20:55:52+00:00\",\"description\":\"Economic indicators have been used in numerous studies to forecast stock prices using well-known statistical methods. The rising power of computers since\",\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/#primaryimage\",\"url\":\"\/wp-content\/uploads\/2022\/03\/macbook-air-iphone-038-stocks-scaled.jpg\",\"contentUrl\":\"\/wp-content\/uploads\/2022\/03\/macbook-air-iphone-038-stocks-scaled.jpg\",\"width\":2560,\"height\":1700},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/griddb.net\/en\/#website\",\"url\":\"https:\/\/griddb.net\/en\/\",\"name\":\"GridDB: Open Source Time Series Database for IoT\",\"description\":\"GridDB is an open source time-series database with the performance of NoSQL and convenience of 
SQL\",\"publisher\":{\"@id\":\"https:\/\/griddb.net\/en\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/griddb.net\/en\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/griddb.net\/en\/#organization\",\"name\":\"Fixstars\",\"url\":\"https:\/\/griddb.net\/en\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png\",\"contentUrl\":\"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png\",\"width\":200,\"height\":83,\"caption\":\"Fixstars\"},\"image\":{\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/griddbcommunity\/\",\"https:\/\/x.com\/GridDBCommunity\",\"https:\/\/www.linkedin.com\/company\/griddb-by-toshiba\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233\",\"name\":\"griddb-admin\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g\",\"caption\":\"griddb-admin\"},\"url\":\"https:\/\/griddb.net\/en\/author\/griddb-admin\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Modelling S&P 500 Index Price Based on U.S. 
Economic Indicators using Python and GridDB | GridDB: Open Source Time Series Database for IoT","description":"Economic indicators have been used in numerous studies to forecast stock prices using well-known statistical methods. The rising power of computers since","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/","og_locale":"en_US","og_type":"article","og_title":"Modelling S&P 500 Index Price Based on U.S. Economic Indicators using Python and GridDB | GridDB: Open Source Time Series Database for IoT","og_description":"Economic indicators have been used in numerous studies to forecast stock prices using well-known statistical methods. The rising power of computers since","og_url":"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/","og_site_name":"GridDB: Open Source Time Series Database for IoT","article_publisher":"https:\/\/www.facebook.com\/griddbcommunity\/","article_published_time":"2022-03-23T07:00:00+00:00","article_modified_time":"2025-11-13T20:55:52+00:00","og_image":[{"width":2560,"height":1700,"url":"https:\/\/griddb.net\/wp-content\/uploads\/2022\/03\/macbook-air-iphone-038-stocks-scaled.jpg","type":"image\/jpeg"}],"author":"griddb-admin","twitter_card":"summary_large_image","twitter_creator":"@GridDBCommunity","twitter_site":"@GridDBCommunity","twitter_misc":{"Written by":"griddb-admin","Est. 
reading time":"12 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/#article","isPartOf":{"@id":"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/"},"author":{"name":"griddb-admin","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233"},"headline":"Modelling S&#038;P 500 Index Price Based on U.S. Economic Indicators using Python and GridDB","datePublished":"2022-03-23T07:00:00+00:00","dateModified":"2025-11-13T20:55:52+00:00","mainEntityOfPage":{"@id":"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/"},"wordCount":1583,"commentCount":0,"publisher":{"@id":"https:\/\/griddb.net\/en\/#organization"},"image":{"@id":"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2022\/03\/macbook-air-iphone-038-stocks-scaled.jpg","articleSection":["Blog"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/","url":"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/","name":"Modelling S&P 500 Index Price Based on U.S. 
Economic Indicators using Python and GridDB | GridDB: Open Source Time Series Database for IoT","isPartOf":{"@id":"https:\/\/griddb.net\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/#primaryimage"},"image":{"@id":"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2022\/03\/macbook-air-iphone-038-stocks-scaled.jpg","datePublished":"2022-03-23T07:00:00+00:00","dateModified":"2025-11-13T20:55:52+00:00","description":"Economic indicators have been used in numerous studies to forecast stock prices using well-known statistical methods. The rising power of computers since","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/blog\/modelling-sp-500-index-price-based-on-u-s-economic-indicators-using-python-and-griddb\/#primaryimage","url":"\/wp-content\/uploads\/2022\/03\/macbook-air-iphone-038-stocks-scaled.jpg","contentUrl":"\/wp-content\/uploads\/2022\/03\/macbook-air-iphone-038-stocks-scaled.jpg","width":2560,"height":1700},{"@type":"WebSite","@id":"https:\/\/griddb.net\/en\/#website","url":"https:\/\/griddb.net\/en\/","name":"GridDB: Open Source Time Series Database for IoT","description":"GridDB is an open source time-series database with the performance of NoSQL and convenience of 
SQL","publisher":{"@id":"https:\/\/griddb.net\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/griddb.net\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/griddb.net\/en\/#organization","name":"Fixstars","url":"https:\/\/griddb.net\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/","url":"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png","contentUrl":"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png","width":200,"height":83,"caption":"Fixstars"},"image":{"@id":"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/griddbcommunity\/","https:\/\/x.com\/GridDBCommunity","https:\/\/www.linkedin.com\/company\/griddb-by-toshiba"]},{"@type":"Person","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233","name":"griddb-admin","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g","caption":"griddb-admin"},"url":"https:\/\/griddb.net\/en\/author\/griddb-admin\/"}]}},"_links":{"self":[{"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/posts\/46692","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/users\/41"}],"replies":[{"embeddable
":true,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/comments?post=46692"}],"version-history":[{"count":1,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/posts\/46692\/revisions"}],"predecessor-version":[{"id":51366,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/posts\/46692\/revisions\/51366"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/media\/28152"}],"wp:attachment":[{"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/media?parent=46692"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/categories?post=46692"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/tags?post=46692"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}