Blog

Stress Detection using Machine Learning & GridDB

Stress significantly affects individuals’ well-being, productivity, and overall quality of life. Understanding and predicting stress levels can help take proactive measures to mitigate its adverse effects. This article demonstrates how to develop a stress detection system using machine learning and deep learning techniques with the GridDB database. We will begin by retrieving a stress detection dataset from Kaggle, storing it in a GridDB container, and utilizing this data to train predictive models capable of estimating users’ perceived stress scores.

GridDB, a high-performance NoSQL database, is particularly suited for managing complex and dynamic datasets. Its efficient in-memory processing and flexible data storage capabilities make it an ideal choice for real-time applications.

Note: The code for this tutorial is in my GridDB Blogs GitHub repository.

Prerequisites

You need to install the following libraries to run the code in this article:

GridDB C Client
GridDB Python client

To install these libraries, follow the installation instructions on the GridDB Python Package Index (PyPI). The code is executed in Google Colab, so you do not need to install any other libraries. Run the following script to import the required libraries into your Python application.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.ensemble import RandomForestRegressor
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import load_model
import griddb_python as griddb

Inserting Stress Detection Dataset into GridDB

We will begin by inserting the stress detection dataset from Kaggle into GridDB. In a later section, we will retrieve the data from GridDB and train our machine learning algorithms for user stress prediction.

Downloading and Importing the Stress Detection Dataset from Kaggle

You can download the stress detection dataset from Kaggle and import it into your Python application.

# Dataset download link
# https://www.kaggle.com/datasets/swadeshi/stress-detection-dataset?resource=download

dataset = pd.read_csv("stress_detection.csv")
print(f"The dataset consists of {dataset.shape[0]} rows and {dataset.shape[1]} columns")
dataset.head()

Output:

The dataset consists of 3000 records belonging to 100 users. For each user, 30 days of data are recorded for various attributes such as openness, sleep duration, screen time, mobility distance, and number of calls. The PSS_score column contains the perceived stress score, which ranges from 10 to 40. A higher score corresponds to a higher stress level. The following script displays various statistics for the PSS_score column.

dataset["PSS_score"].describe()

Output:

count    3000.000000
mean       24.701000
std         8.615781
min        10.000000
25%        17.000000
50%        25.000000
75%        32.000000
max        39.000000
Name: PSS_score, dtype: float64
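The summary statistics give a numeric picture of the spread; a quick histogram makes the distribution easier to see at a glance. Below is a minimal sketch that reuses the seaborn and matplotlib imports from above and the dataset DataFrame; the bin count is an arbitrary choice.

# Sketch: visualize the PSS_score distribution (assumes `dataset` from above)
plt.figure(figsize=(8, 4))
sns.histplot(dataset["PSS_score"], bins=30, kde=True)
plt.title("Distribution of Perceived Stress Scores")
plt.xlabel("PSS_score")
plt.ylabel("Count")
plt.show()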
Next, we will insert the user stress dataset into GridDB.

Connect to GridDB

You need to connect to a GridDB instance before inserting data into it. To do so, you must create a GridDB factory instance using the griddb.StoreFactory.get_instance() method. Next, you have to call the get_store() method on the factory instance and pass the database host URL, cluster name, user name, and password. The get_store() method returns a grid store object that you can use to create containers in GridDB. To test whether the connection is successful, we retrieve a dummy container, container1, as shown in the script below.

# GridDB connection details
DB_HOST = "127.0.0.1:10001"
DB_CLUSTER = "myCluster"
DB_USER = "admin"
DB_PASS = "admin"

# creating a connection
factory = griddb.StoreFactory.get_instance()

try:
    gridstore = factory.get_store(
        notification_member = DB_HOST,
        cluster_name = DB_CLUSTER,
        username = DB_USER,
        password = DB_PASS
    )

    container1 = gridstore.get_container("container1")
    if container1 == None:
        print("Container does not exist")
    print("Successfully connected to GridDB")

except griddb.GSException as e:
    for i in range(e.get_error_stack_size()):
        print("[", i, "]")
        print(e.get_error_code(i))
        print(e.get_location(i))
        print(e.get_message(i))

Output:

Container does not exist
Successfully connected to GridDB

You should see the above message if the connection is successful.

Create Container for User Stress Data in GridDB

GridDB stores data in containers, which are specialized data structures for efficient data storage and retrieval. The following script creates a GridDB container to store our stress detection dataset.

We first remove any existing container with the name user_stress_data, as we will use this name to create a new container. Next, we replace spaces in column names with underscores, since GridDB does not allow spaces in column names. We then map Pandas DataFrame data types to GridDB data types and create a column info list containing column names and their corresponding GridDB data types. Next, we create a container info object and pass it the container name, the column info list, and the container type, which is COLLECTION for tabular data. Finally, we call the grid store’s put_container() method and pass the container info object we created to it as a parameter.

# drop container if already exists
gridstore.drop_container("user_stress_data")

# Clean column names to remove spaces or forbidden characters in the GridDB container
dataset.columns = [col.strip().replace(" ", "_") for col in dataset.columns]

# Mapping from pandas data types to GridDB data types
type_mapping = {
    'float64': griddb.Type.DOUBLE,
    'int64': griddb.Type.INTEGER,
    'object': griddb.Type.STRING,
    'category': griddb.Type.STRING  # Map category to STRING for GridDB
}

# Generate column_info dynamically
column_info = [[col, type_mapping[str(dtype)]] for col, dtype in dataset.dtypes.items()]

# Define the container info
container_name = "user_stress_data"
container_info = griddb.ContainerInfo(
    container_name,
    column_info,
    griddb.ContainerType.COLLECTION,
    row_key=True
)

# Connecting to GridDB and creating the container
try:
    gridstore.put_container(container_info)
    container = gridstore.get_container(container_name)
    if container is None:
        print(f"Failed to create container: {container_name}")
    else:
        print(f"Successfully created container: {container_name}")
except griddb.GSException as e:
    print(f"Error creating container {container_name}:")
    for i in range(e.get_error_stack_size()):
        print(f"[{i}] Error code: {e.get_error_code(i)}, Message: {e.get_message(i)}")

Output:

Successfully created container: user_stress_data

The above message shows that the container creation is successful.

Insert User Stress Data into GridDB

We can now store data in the container we created.
To do so, we iterate through the rows in our dataset, convert the column data into GridDB data types, and store each row in the container using the put() method. The following script inserts our stress detection dataset into the user_stress_data container we created.

try:
    for _, row in dataset.iterrows():
        # Prepare each row's data in the exact order as defined in `column_info`
        row_data = [
            int(row[col]) if dtype == griddb.Type.INTEGER else
            float(row[col]) if dtype == griddb.Type.DOUBLE else
            str(row[col])
            for col, dtype in column_info
        ]
        # Insert the row data into the container
        container.put(row_data)

    print(f"Successfully inserted {len(dataset)} rows of data into {container_name}")

except griddb.GSException as e:
    print(f"Error inserting data into container {container_name}:")
    for i in range(e.get_error_stack_size()):
        print(f"[{i}] Error code: {e.get_error_code(i)}, Message: {e.get_message(i)}")

Output:

Successfully inserted 3000 rows of data into user_stress_data

The above output shows that data insertion is successful.

Stress Detection Using Machine and Deep Learning

In this section, we will retrieve the stress detection dataset from the user_stress_data GridDB container we created earlier. Subsequently, we will train machine learning and deep learning models for stress prediction.

Retrieving Data from GridDB

The following script defines the retrieve_data_from_griddb() function, which accepts the container name as a parameter and calls the get_container() function on the grid store to retrieve the data container. Next, we create a SELECT query object and call its fetch() method to retrieve all records from the user_stress_data container. Finally, we call the fetch_rows() function to convert the records into a Pandas DataFrame.

def retrieve_data_from_griddb(container_name):
    try:
        data_container = gridstore.get_container(container_name)

        # Query all data from the container
        query = data_container.query("select *")
        rs = query.fetch()

        data = rs.fetch_rows()
        return data

    except griddb.GSException as e:
        print(f"Error retrieving data from GridDB: {e.get_message()}")
        return None

stress_dataset = retrieve_data_from_griddb("user_stress_data")
stress_dataset.head()

Output:

The above output shows the stress detection dataset we retrieved from the GridDB container.

Predicting User Stress with Machine Learning

We will first try to predict the PSS_score using a traditional machine learning algorithm such as the Random Forest Regressor. The following script divides the dataset into features and labels, splits it into training and test sets, and normalizes it using the standard scaling approach.

X = stress_dataset.drop(columns=['PSS_score'])
y = stress_dataset['PSS_score']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

Next, we create an object of the RandomForestRegressor class from the Scikit-learn library and pass the training set to its fit() method.

rf_model = RandomForestRegressor(random_state=42, n_estimators=1000)
rf_model.fit(X_train, y_train)

Finally, we evaluate the model performance by predicting the PSS_score on the test set.

rf_predictions = rf_model.predict(X_test)

# Evaluate the regression model
mae = mean_absolute_error(y_test, rf_predictions)
print(f"Mean Absolute Error: {mae:.4f}")

Output:

Mean Absolute Error: 7.8973

The output shows that, on average, our model’s predicted PSS_score is off by roughly 7.9 points.
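Because mean_squared_error is already imported but unused, you can also report RMSE alongside MAE to see how strongly large errors weigh in. A small sketch, reusing y_test and rf_predictions from above:

# Sketch: complement MAE with RMSE (assumes y_test and rf_predictions from above)
mse = mean_squared_error(y_test, rf_predictions)
rmse = mse ** 0.5
print(f"Mean Squared Error: {mse:.4f}")
print(f"Root Mean Squared Error: {rmse:.4f}")

RMSE penalizes large misses more heavily than MAE, so a large gap between the two would indicate a few badly mispredicted users rather than uniformly noisy predictions.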
This is not so bad, but it is not very good either. Next, we will develop a deep neural network for stress detection prediction. Predicting User Stress with Deep Learning We will use the TensorFlow and Keras libraries to create a sequential deep learning model with three dense layers. We will also add batch normalization and dropout to reduce model overfitting. We will also use an adaptive learning rate so the gradient doesn’t overshoot while training. Finally, we compile the model using the mean squared error loss and mean absolute error metric. We use this loss and metric since we are dealing with a regression problem. model = Sequential([ Dense(128, input_dim=X_train.shape[1], activation=’relu’), BatchNormalization(), Dropout(0.2), Dense(64, activation=’relu’), BatchNormalization(), Dropout(0.2), Dense(32, activation=’relu’), BatchNormalization(), Dropout(0.1), Dense(1) ]) # Adaptive learning rate scheduler with exponential decay initial_learning_rate = 0.001 lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay( initial_learning_rate=initial_learning_rate, decay_steps=10000, decay_rate=0.9 ) # Compile the model with Adam optimizer and a regression loss model.compile(optimizer=Adam(learning_rate=lr_schedule), loss=’mean_squared_error’, metrics=[‘mean_absolute_error’]) # Summary of the model model.summary() Output: The above output shows the model summary. Next, we train the model using the fit() method. We use an early stopping approach that stops model training if the loss doesn’t decrease for 100 consecutive epochs. Finally, we save the best model at the end of model training. # Define callbacks for training early_stopping = EarlyStopping(monitor=’val_loss’, patience=100, restore_best_weights=True) model_checkpoint = ModelCheckpoint(‘best_model.keras’, monitor=’val_loss’, save_best_only=True) # Train the model with the callbacks history = model.fit( X_train, y_train, epochs=1000, batch_size=4, validation_data=(X_test, y_test), callbacks=[early_stopping, model_checkpoint], verbose=1 ) Output: We load the best model to evaluate the performance and use it to make predictions on the test set. # Load the best model best_model = load_model(‘best_model.keras’) # Make predictions on the test set y_pred = best_model.predict(X_test) # Calculate Mean Absolute Error mae = mean_absolute_error(y_test, y_pred) print(f”Mean Absolute Error: {mae:.4f}”) # Plot training history to show MAE over epochs plt.plot(history.history[‘mean_absolute_error’], label=’Train MAE’) plt.plot(history.history[‘val_mean_absolute_error’], label=’Validation MAE’) plt.title(‘Mean Absolute Error over Epochs’) plt.xlabel(‘Epochs’) plt.ylabel(‘MAE’) plt.legend() plt.show() Output: On the test set, we achieved a mean absolute error value of 7.89, similar to what we achieved using the Random Forest Regressor. The results also show that our model is slightly overfitting since the training loss is lower compared to validation loss across the epochs. Conclusion This article is a comprehensive guide to developing a stress detection system using machine learning, deep learning regression models, and the GridDB database. In this article, you explored the process of connecting to GridDB, inserting a stress detection dataset, and utilizing Random Forest and deep neural networks to predict perceived stress scores. The Random Forest and deep learning models performed decently with a manageable mean absolute error. 
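As a practical follow-up, here is a hedged sketch of how you might score a single new record with the saved network. It assumes the scaler, the feature DataFrame X, and best_model from the scripts above, and simply reuses an existing row as a stand-in for new data.

# Hypothetical example: score one new record with the saved model
# (assumes `scaler`, `X`, and `best_model` from the earlier scripts)
new_user = X.iloc[[0]].copy()            # stand-in for a newly collected row of features
new_user_scaled = scaler.transform(new_user)
predicted_pss = best_model.predict(new_user_scaled)
print(f"Predicted PSS_score: {predicted_pss[0][0]:.2f}")

In a real deployment, the new row would come from freshly collected user attributes rather than the training data.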
If you have any questions or need assistance with GridDB or machine learning techniques, please ask on Stack Overflow using the griddb tag. Our team is always happy to help. For the complete code, visit my GridDB Blogs GitHub

More
IoT Intrusion Detection

Intrusions refer to unauthorized activities that exploit vulnerabilities in IoT devices, which can compromise sensitive data or disrupt essential services. Detecting these intrusions is crucial to maintaining the security and integrity of IoT networks. This article demonstrates how to develop a robust intrusion detection system for IoT environments using a machine learning and deep learning approach with the GridDB database. We will begin by retrieving IoT intrusion detection data from Kaggle, storing it in a GridDB container, and using this data to train machine learning and deep learning models to identify different types of intrusions.

GridDB, a high-performance NoSQL database, is particularly suited for handling the large-scale, real-time data generated by IoT systems due to its efficient in-memory processing and time series capabilities. Using GridDB’s powerful IoT data management features along with advanced machine learning and deep learning, we will build a predictive model that identifies potential threats to IoT devices.

Note: You can find the code for the tutorial in my GridDB Blogs GitHub repository.

Prerequisites

You need to install the following libraries to run the code in this article:

GridDB C Client
GridDB Python client

To install these libraries, follow the installation instructions on the GridDB Python Package Index (PyPI). Since the code is executed in Google Colab, you do not need to install any other libraries. Run the following script to import the required libraries into your Python application.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.preprocessing import LabelEncoder
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import load_model
import griddb_python as griddb

Inserting IoT Data into GridDB

In this section, you will see how to download IoT data from Kaggle, import it into your Python application, and store it in the GridDB database. Along the way, you will learn how to connect your Python application to GridDB, create a GridDB container, and insert data into the container you created.

Downloading and Importing the IoT Dataset From Kaggle

We will insert the IoT Intrusion Detection dataset from Kaggle into GridDB. The dataset consists of different types of IoT intrusions. The following script imports the dataset into a Pandas DataFrame.

# Dataset download link
# https://www.kaggle.com/datasets/subhajournal/iotintrusion/data

dataset = pd.read_csv("IoT_Intrusion.csv")
print(f"The dataset consists of {dataset.shape[0]} rows and {dataset.shape[1]} columns")
dataset.head()

Output:

The above output shows that the dataset consists of 1048575 rows and 47 columns. For the sake of simplicity, we will train our machine learning models on 200k records. The following script randomly selects 200k records from the original dataset. The label column contains the intrusion types. We will also plot the count for each intrusion type.
dataset = dataset.sample(n=200000, random_state=42) print(f”The dataset consists of {dataset.shape[0]} rows and {dataset.shape[1]} columns”) print(f”The total number of output labels are {dataset[‘label’].nunique()}”) dataset[‘label’].value_counts() Output: The above output shows all the 34 intrusion types in our dataset with DDoS-ICMP_FLOOD being the most frequently occurring intrusion while Uploading_Attack is the least frequently occurring intrusion type. For simplification’s sake we will group the 34 categories into 9 major categories using the following script. category_map = { ‘DDoS’: [ ‘DDoS-ICMP_Flood’, ‘DDoS-UDP_Flood’, ‘DDoS-TCP_Flood’, ‘DDoS-PSHACK_Flood’, ‘DDoS-SYN_Flood’, ‘DDoS-RSTFINFlood’, ‘DDoS-SynonymousIP_Flood’, ‘DDoS-ICMP_Fragmentation’, ‘DDoS-ACK_Fragmentation’, ‘DDoS-UDP_Fragmentation’, ‘DDoS-HTTP_Flood’, ‘DDoS-SlowLoris’ ], ‘DoS’: [ ‘DoS-UDP_Flood’, ‘DoS-TCP_Flood’, ‘DoS-SYN_Flood’, ‘DoS-HTTP_Flood’ ], ‘Brute Force’: [ ‘DictionaryBruteForce’ ], ‘Spoofing’: [ ‘MITM-ArpSpoofing’, ‘DNS_Spoofing’ ], ‘Recon’: [ ‘Recon-HostDiscovery’, ‘Recon-OSScan’, ‘Recon-PortScan’, ‘Recon-PingSweep’ ], ‘Web-based’: [ ‘SqlInjection’, ‘CommandInjection’, ‘XSS’, ‘BrowserHijacking’, ‘Uploading_Attack’ ], ‘Mirai’: [ ‘Mirai-greeth_flood’, ‘Mirai-udpplain’, ‘Mirai-greip_flood’ ], ‘Other’: [ ‘VulnerabilityScan’, ‘Backdoor_Malware’ ], ‘Benign-trafic’: [ ‘BenignTraffic’ ] } # Reverse the mapping to allow lookup by subcategory subcategory_to_parent = {subcat: parent for parent, subcats in category_map.items() for subcat in subcats} # Add the ‘class’ column using the mapping dataset[‘class’] = dataset[‘label’].map(subcategory_to_parent) dataset[‘class’].value_counts() Output: You can now see that DDoS intrusion is the most frequently occuring intrusion followed by DoS and Mirai. Let’s plot a bar plot for the class distribution. class_counts = dataset[‘class’].value_counts() sns.set(style=”darkgrid”) plt.figure(figsize=(10, 6)) sns.barplot(x=class_counts.index, y=class_counts.values) plt.title(“Class Distribution”) plt.xlabel(“Class”) plt.ylabel(“Count”) plt.xticks(rotation=45) plt.show() Output: The above output shows that our dataset is highly imbalanced. In the next section, we will insert this data into GridDB. Connect to GridDB You need to perform the following steps to connect your Python application to a GridDB instance. Create an instance of the griddb.StoreFactory object using the get_instance() method. Create a GridDB store factory object by calling the get_store() method on the StoreFactory object. You need to pass the GridDB host and cluster name, and the user and password that you use to connect to the GridDB instance. This should establish a connection to your GridDB instance. To test the connection create a container by calling the get_container() method and pass to it a dummy container name. This step is optional and only tests the connection. The following script shows how to connect to GridDB. 
# GridDB connection details
DB_HOST = "127.0.0.1:10001"
DB_CLUSTER = "myCluster"
DB_USER = "admin"
DB_PASS = "admin"

# creating a connection
factory = griddb.StoreFactory.get_instance()

try:
    gridstore = factory.get_store(
        notification_member = DB_HOST,
        cluster_name = DB_CLUSTER,
        username = DB_USER,
        password = DB_PASS
    )

    container1 = gridstore.get_container("container1")
    if container1 == None:
        print("Container does not exist")
    print("Successfully connected to GridDB")

except griddb.GSException as e:
    for i in range(e.get_error_stack_size()):
        print("[", i, "]")
        print(e.get_error_code(i))
        print(e.get_location(i))
        print(e.get_message(i))

Output:

Container does not exist
Successfully connected to GridDB

If you see the above output, you have successfully connected to GridDB.

Create Container for IoT Data in GridDB

GridDB stores data in containers. Therefore, you need to create a container to store your IoT data. You will need to perform the following steps to create a GridDB container capable of storing your IoT data:

First, check if a container with the name you want to use already exists. If it does, either delete the existing container or choose a different name for your new container.
Convert your dataset into a format that GridDB expects. For example, GridDB expects the dataset to have a unique ID, so you need to add a unique ID column if one doesn't already exist. Similarly, GridDB doesn't accept spaces in column names, so you will need to preprocess the column names too.
GridDB data types are different from Pandas DataFrame types, so you need to define a mapping from Pandas DataFrame column types to GridDB data types.
Create a column info list that contains the column names and their corresponding mapped data types.
Finally, call the put_container() method on the grid store object and pass it the container name, the column_info list, and the container type (griddb.ContainerType.COLLECTION in this case). The row_key is set to true since each row has a unique ID.
To test if the container is successfully created, call the get_container() method to retrieve the container you created.

The following script creates the IoT_Data container in our GridDB instance.
# drop container if already exists
gridstore.drop_container("IoT_Data")

# Add a primary key column
dataset.insert(0, 'ID', range(1, len(dataset) + 1))

# Clean column names to remove spaces or forbidden characters in the GridDB container
dataset.columns = [col.strip().replace(" ", "_") for col in dataset.columns]

# Mapping from pandas data types to GridDB data types
type_mapping = {
    'float64': griddb.Type.DOUBLE,
    'int64': griddb.Type.INTEGER,
    'object': griddb.Type.STRING
}

# Generate column_info dynamically, adding ID as the first entry
column_info = [["ID", griddb.Type.INTEGER]] + [
    [col, type_mapping[str(dtype)]] for col, dtype in dataset.dtypes.items() if col != "ID"
]

# Define the container info with ID as the primary key and as a collection container
container_name = "IoT_Data"
container_info = griddb.ContainerInfo(
    container_name,
    column_info,
    griddb.ContainerType.COLLECTION,
    row_key=True
)

# Connecting to GridDB and creating the container
try:
    gridstore.put_container(container_info)
    container = gridstore.get_container(container_name)
    if container is None:
        print(f"Failed to create container: {container_name}")
    else:
        print(f"Successfully created container: {container_name}")
except griddb.GSException as e:
    print(f"Error creating container {container_name}:")
    for i in range(e.get_error_stack_size()):
        print(f"[{i}] Error code: {e.get_error_code(i)}, Message: {e.get_message(i)}")

Insert IoT Data into GridDB

We are now ready to insert the DataFrame into the GridDB container we just created. To do so, we iterate through all the rows in our dataset, fetch the column data and column type for each row, and insert the data using the container.put() method. The following script inserts our IoT data from the Pandas DataFrame into the GridDB IoT_Data container.

try:
    for _, row in dataset.iterrows():
        # Prepare each row's data in the exact order as defined in `column_info`
        row_data = [
            int(row[col]) if dtype == griddb.Type.INTEGER else
            float(row[col]) if dtype == griddb.Type.DOUBLE else
            str(row[col])
            for col, dtype in column_info
        ]
        # Insert the row data into the container
        container.put(row_data)

    print(f"Successfully inserted {len(dataset)} rows of data into {container_name}")

except griddb.GSException as e:
    print(f"Error inserting data into container {container_name}:")
    for i in range(e.get_error_stack_size()):
        print(f"[{i}] Error code: {e.get_error_code(i)}, Message: {e.get_message(i)}")

Output:

Successfully inserted 200000 rows of data into IoT_Data

In the next sections, we will retrieve the IoT data from GridDB and train machine learning and deep learning models to predict the intrusion type.

Forecasting IoT Intrusion Using Machine Learning and Deep Learning

We will try both machine learning and deep learning approaches for predicting the intrusion type. But first, we will see how to retrieve data from GridDB.

Retrieving Data From GridDB

To retrieve GridDB data, you have to retrieve the container, call the query() method, and pass the SELECT * query to the method. This will retrieve all the records from the container. Next, call the fetch() method to retrieve the data from the container. Finally, call the fetch_rows() method to store the data in a Pandas DataFrame. The following script defines the retrieve_data_from_griddb() method that retrieves data from a GridDB container into a Pandas DataFrame.
def retrieve_data_from_griddb(container_name):
    try:
        data_container = gridstore.get_container(container_name)

        # Query all data from the container
        query = data_container.query("select *")
        rs = query.fetch()
        data = rs.fetch_rows()

        data.set_index("ID", inplace=True)
        return data

    except griddb.GSException as e:
        print(f"Error retrieving data from GridDB: {e.get_message()}")
        return None

iot_data = retrieve_data_from_griddb("IoT_Data")
iot_data.head()

Output:

IoT Data Classification Using Machine Learning

We will first use the Random Forest classifier, a tree-based classification algorithm, to predict the intrusion type. To do so, we will divide our dataset into features and labels, and then into training and test sets. We will also standardize our dataset.

# Separate the features (X) and the output class (y)
X = iot_data.drop(columns=['label', 'class'])  # Dropping both `label` and `class` columns as `class` is the target
y = iot_data['class']  # Output target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

Next, we will use the RandomForestClassifier() class from the sklearn module and call its fit() method to train the algorithm on the training data.

rf_model = RandomForestClassifier(random_state=42)
rf_model.fit(X_train, y_train)

Finally, we can make predictions using the predict() method and compare the predictions with the actual labels in the test set to calculate the model accuracy.

rf_predictions = rf_model.predict(X_test)

rf_accuracy = accuracy_score(y_test, rf_predictions)
classification_rep = classification_report(y_test, rf_predictions, zero_division=1)
print("Classification Report:\n", classification_rep)

Output:

The above output shows that our model achieves an accuracy of 99% on the test set. It is important to note that since Brute Force had only 47 instances in the training set, the model is not able to learn much about this category.
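Overall accuracy hides how the rarer classes fare, so a per-class confusion matrix is a useful companion to the classification report. The sketch below assumes y_test and rf_predictions from above and imports confusion_matrix, which is not part of the original import list.

from sklearn.metrics import confusion_matrix

# Sketch: per-class confusion matrix for the Random Forest predictions
labels = sorted(y_test.unique())
cm = confusion_matrix(y_test, rf_predictions, labels=labels)

plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt="d", xticklabels=labels, yticklabels=labels, cmap="Blues")
plt.xlabel("Predicted class")
plt.ylabel("Actual class")
plt.title("Random Forest Confusion Matrix")
plt.show()

Minority classes such as Brute Force should show up as sparse rows here, which makes their misclassifications easier to spot than in a single accuracy figure.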
IoT Data Classification Using Deep Learning

Let’s now predict the intrusion type using a deep neural network implemented with the TensorFlow Keras library. We will convert the output labels to integers, since deep learning algorithms work with numbers only. Next, we will standardize the training and test sets as we did before.

label_encoder = LabelEncoder()
y = label_encoder.fit_transform(iot_data['class'])  # Integer encoding

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

The following script defines our model with four hidden layers followed by an output layer. Since we have a multiclass classification problem, we use the softmax function in the final activation layer. We also use an adaptive learning rate to avoid overfitting.

# Define the model
model = Sequential([
    Dense(256, input_dim=X_train.shape[1], activation='relu'),
    BatchNormalization(),
    Dropout(0.4),
    Dense(128, activation='relu'),
    BatchNormalization(),
    Dropout(0.4),
    Dense(64, activation='relu'),
    BatchNormalization(),
    Dropout(0.3),
    Dense(32, activation='relu'),
    BatchNormalization(),
    Dropout(0.3),
    Dense(len(pd.unique(y)), activation='softmax')  # Softmax for multiclass classification
])

# Adaptive learning rate scheduler with exponential decay
initial_learning_rate = 0.001
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=initial_learning_rate,
    decay_steps=10000,
    decay_rate=0.9
)

# Compile the model with Adam optimizer with decayed learning rate
model.compile(optimizer=Adam(learning_rate=lr_schedule),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

The following script trains our deep learning model on the training set. We save the best model after each epoch.

# Define callbacks without ReduceLROnPlateau
early_stopping = EarlyStopping(monitor='val_loss', patience=15, restore_best_weights=True)
model_checkpoint = ModelCheckpoint('best_model.keras', monitor='val_accuracy', save_best_only=True)

# Train the model with the callbacks
history = model.fit(
    X_train, y_train,
    epochs=100,
    batch_size=32,
    validation_data=(X_test, y_test),
    callbacks=[early_stopping, model_checkpoint],
    verbose=1
)

Output:

Once the training is complete, we load the best model and make predictions on the test set. We compare the predictions with the actual target labels to calculate model performance.

# Load the best model
best_model = load_model('best_model.keras')

y_pred = best_model.predict(X_test).argmax(axis=-1)

print("Classification Report:\n", classification_report(y_test, y_pred, zero_division=0))

Output:

The above output shows that we achieve an accuracy of 89% on the test set, which is considerably lower than what was achieved using the Random Forest algorithm. One reason may be that tree-based algorithms such as Random Forest are known to perform better on tabular datasets. Furthermore, the performance of deep learning models is significantly impacted by data imbalance.

Conclusion

This article demonstrated a complete workflow for building an IoT intrusion detection system using GridDB for data management and machine learning and deep learning models for classification. We covered how to connect to GridDB, store IoT intrusion data, and use Random Forest and deep neural network models to predict intrusion types. The Random Forest model achieved high accuracy, showing its suitability for tabular datasets, while deep learning models highlighted the challenges of data imbalance. If you have any questions or need help with GridDB, feel free to reach out on Stack Overflow with the griddb tag, and our engineers will be glad to

More
GridDB Cloud on Microsoft Azure Marketplace

This article aims to be seen as supplemental to the original GridDB Cloud Quickstart Guide. In that article, we covered how to sign up for the FREE Trial version of the GridDB Cloud offering and then dived into usage via the GridDB Web API. In this one, we want to cover signing up for GridDB Cloud via the Azure Marketplace. There are two flavors of this: the Pay-As-You-Go Plan and the Fixed Monthly Shared instance. We will cover signing up, the pricing, and the differences from the Free trial version. Usage with the Web API will not be covered as it is identical to our previous efforts; you can also use the GridDB CLI Tool to interface with these Cloud offerings as well.

The Plans (and how to pick)

Before we get into picking between Pay-As-You-Go and the fixed monthly cost plans, there is a third option: the GridDB Free Plan. I would recommend starting off with the free plan, and when your data starts to exceed the basic limit, you can then sign up for the Pay-As-You-Go plan and use Toshiba Support to migrate your Free plan data over to the Pay-As-You-Go plan! It’s a very simple process.

First and foremost, to use GridDB Cloud on Azure, you will need a Microsoft Azure account. You can sign up via their website: https://azure.microsoft.com/en-us/pricing/purchase-options/azure-account. Next, you will choose if you want to sign up for the plan which allows you to pay and scale as needed, or shoot for the fixed monthly commitment plan.

Pay-As-You-Go
Fixed Monthly Commitment

The main difference between these two instances is the pricing — with the monthly commitment you’re paying $520/month, but you won’t need to worry about possibly going over budget if you’re moving lots of data in and out or making tons of requests. The pricing is as follows for the other plan:

Service     Price
Storage     $0.002 per 1GB/hour
Data Out    $0.09 per 1GB
Data In     $0.0025 per 1MB
Request     $0.012 per 100 requests

So let’s try to do some quick napkin math and see what kind of scenario you’d need to be in to make the shared monthly $520 commitment the right choice.

Azure Budgeting

Before we dive into some simple math to figure out some thresholds of use for the pay-as-you-go plan, I think it’s also important to point out that Azure Marketplace does have guardrails in place to help control spending. Even if dealing with a massive budget, I implore all users to read and set an Azure Spending Limit; this can and will help ease any concerns over accidentally spending the entire month’s budget in a week — every cloud engineer’s nightmare!

Rough Cost Estimations

Let’s assume, for a baseline, that you’re storing exactly half of the maximum amount (the max is 100GB, so let’s play with 50GB). At the price of $0.002/1GB/hour, that puts us at roughly $72/month in storage costs. Working from here, let’s estimate how many data transfers we need to commit to reach our soft limit. To spend the rest of the $448 budget, let’s take a look at some scenarios. We’d need to commit ~5000GB of data through Data Out to reach our allotment. For Data In, it’s about 175GB (it costs a lot more to write to the Cloud than to read from it!). If we wanted to do an even split between Data In and Data Out ($224 each), it’d be 2488.88GB for Data Out and 87.50GB for Data In. Requests cost about $120 per 1,000,000, so accounting for that, we’d end up with this kind of scenario:

Service Component    Details / Volume                   Cost
Storage              50 GB (for 720 hours)              $72.00
Requests             1,000,000 requests                 $120.00
Data Out             Approx. 1822.22 GB                 $164.00
Data In              65,600 MB (or approx. 64.06 GB)    $164.00
Total                                                   $520.00
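The arithmetic behind that scenario is easy to reproduce, so plugging in your own expected volumes is a quick way to check which plan fits. Here is a rough sketch; the rates come from the pricing table above, and the volumes are just the example scenario, so treat them as placeholders.

# Rough monthly cost estimate for the Pay-As-You-Go plan (example volumes)
STORAGE_PER_GB_HOUR = 0.002   # $ per GB per hour
DATA_OUT_PER_GB = 0.09        # $ per GB transferred out
DATA_IN_PER_MB = 0.0025       # $ per MB transferred in
REQUESTS_PER_100 = 0.012      # $ per 100 requests

storage_gb, hours = 50, 720
data_out_gb = 1822.22
data_in_mb = 65_600
requests = 1_000_000

storage_cost = storage_gb * hours * STORAGE_PER_GB_HOUR   # 72.00
data_out_cost = data_out_gb * DATA_OUT_PER_GB             # ~164.00
data_in_cost = data_in_mb * DATA_IN_PER_MB                # 164.00
request_cost = requests / 100 * REQUESTS_PER_100          # 120.00

total = storage_cost + data_out_cost + data_in_cost + request_cost
print(f"Estimated monthly total: ${total:.2f}")            # ~520.00

If the estimate regularly lands at or above $520, the fixed monthly commitment is the simpler choice; below that, Pay-As-You-Go stays cheaper.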
If your project grows to this size, it is of course recommended to commit to the $520 plan, but before that, you can comfortably use the Pay-As-You-Go Plan.

Signing Up

Now that you know which plan best suits you, let’s walk through that process. As mentioned above, you will need an Azure account. Now simply click on the plan you want and click Get It Now. From there, sign in. You will be greeted with a permissions request. You must accept to continue. Once you accept, you are greeted with the subscribe page. Click Subscribe to fill out your normal Azure details like the resource group (you can create a new one here) and the name of your service (this is just for internal use, it can be whatever you want). Once you fill it out, just hit Review + subscribe.

Once Azure provisions your instance, you are not quite done, as you will also need to click configure account now, which will link you to GridDB Cloud’s subscription page. Once you sign up there, it will link back to your Azure account. Please note that in this portion, my browser was actively hiding all pop-ups from the page, and so I encountered an error like so:

To get around it, find where your browser is blocking the pop-up (it should be immediately easy to see), and then click allow. It will then ask for some Azure -> GridDB Cloud permissions to be accepted. Once accepted, your subscription will begin being fulfilled:

And then once finished:

And now, on that GridDB Cloud splash page, you will have access to your GridDB Cloud information, such as the management GUI URL, as well as the user credentials. Navigate to your Management GUI URL and enter your credentials. You should now be logged in and greeted with your new Cloud dashboard. Congrats!

Next Steps

From this point, you have some branching options on what you can do next. First and foremost, I recommend you whitelist your current machine’s IP address in the cloud dashboard as explained in the GridDB Cloud Quickstart Guide: Whitelisting your IP Address, and then I’d also recommend following the next step of creating a new db user and granting db access to that user. Next, you have some options:

GridDB Cloud Quickstart Guide: WebAPI checkConnection will give the most comprehensive overview of how to use the Web API and what commands are available. This guide has lots of working examples of creating containers/tables, adding data, querying containers, etc. It shows examples using curl, Python, and Node.js. It also briefly touches on the API endpoints and their general structure. A very good place to start!

GridDB CLI Tool is a tool we wrote which helps you interact with the cloud-based dashboard. Because all commands are made through HTTP requests and must include basic authentication and some other headers, this tool aims to simplify the process of making these calls with simpler syntax. It also includes interactive container creation and CSV ingestion. Very helpful once you are familiar with Cloud usage.

How to Utilize GridDB Cloud as the Backend to your No Code Bubble App is a blog which shows you how to make web request calls and utilize GridDB Cloud as your backend for a ‘no code’ frontend solution.

Pairing GridDB Cloud with Grafana Cloud will showcase pairing GridDB Cloud with another popular third-party, cloud-based implementation of Grafana. With these two paired together, you can visualize all sorts of intricate data trends — cool!
Monitoring Air Quality in California using Home Assistant, Raspberry Pi, and GridDB Cloud showcases using GridDB Cloud in a more local setting. It teaches you how to use GridDB Cloud in your smart home to gather your sensor data and make use of it to send alerts to the inhabitants.

Conclusion

And with that, we have shown how easy it is to sign up for the new Azure-based GridDB Cloud shared instance and how similar the process of using it will be! We also did some rough math to estimate which plan might be best for you and your stage of data; and though the instructions for this process were shown for the Pay-As-You-Go plan, the steps are the exact same for Monthly Commitment

More
GridDB SonarQube Integration

Introduction SonarQube is widely used for continuous code quality inspection, helping teams detect bugs and vulnerabilities early. However, as projects scale, the backend database can become a bottleneck, affecting performance. Integrating GridDB, a high-performance, distributed NoSQL database, with SonarQube enhances its data processing capabilities, providing faster insights and better scalability for large projects. This blog will explore how integrating SonarQube with GridDB can help teams monitor and improve code quality over time. We will provide a step-by-step guide to set up this integration and demonstrate its benefits in achieving a more efficient and high-quality development process. Why Use GridDB with SonarQube? While SonarQube is a powerful tool for static code analysis, it has limitations when it comes to long-term data storage and real-time analytics: Limited Data Retention: SonarQube‘s built-in database may not be optimized for long-term storage of historical code quality data. Performance Bottlenecks: As the volume of data grows, SonarQube‘s performance may degrade, especially when querying historical trends. Lack of Real-time Analytics: SonarQube primarily focuses on static analysis and may not provide real-time insights into code quality metrics. By integrating SonarQube with GridDB, we can: Enhance data retention: Store historical code quality data for long-term analysis. Improve performance: Accelerate query performance and enable real-time insights. Gain deeper insights: Analyze trends, identify recurring issues, and measure the impact of code improvements. Make data-driven decisions: Use historical data to optimize development processes and prioritize code quality initiatives. Setting Up GridDB Cluster and Spring Boot Integration: For Real-Time Monitoring The first step is to set up a GridDB cluster and integrate it with our Spring Boot application as follows. Setting up GridDB Cluster GridDB provides flexible options to meet different requirements. For development, a single-node cluster on our local machine may be sufficient. However, in production, distributed clusters across multiple machines are typically preferred for improved fault tolerance and scalability. For detailed guidance on setting up clusters based on our deployment strategy, refer to the GridDB documentation. To set up a GridDB cluster, follow the steps mentioned here. Setting up SonarCluster We can either use an existing SonarQube instance or, for evaluation purposes, set up a SonarQube instance using Docker as shown below. Pull Image docker run -d –name [SonarQube](https://www.sonarsource.com/) -p 9000:9000 [SonarQube](https://www.sonarsource.com/):latest This will pull the latest SonarQube image and run it on port 9000. You can access the SonarQube dashboard by navigating to http://localhost:9000 in your browser. Access SonarQube: Once the container is running, open a browser and go to http://localhost:9000. The default login credentials are: Username: admin Password: admin Create an API Access Token: Log in to the SonarQube dashboard. Navigate to My Account by clicking on your username at the top right. Go to the Security tab. Under the Tokens section, click on Generate Token. Enter a name for the token (e.g., “API Access”) and click Generate. Copy the generated token as it will not be shown again. Setting up Spring Boot Application Once our GridDB cluster is operational, the next step is connecting it to ourSpring Boot application. 
The GridDB Java Client API provides the necessary tools to establish this connection. To simplify the process, you can include the griddb-spring-boot-starter library as a dependency in our project, which offers pre-configured beans for a streamlined connection setup. Project Structure Here’s a suggested project structure for such an application: my-griddb-app │ pom.xml │ ├───src │ ├───main │ │ ├───java │ │ │ └───mycode │ │ │ │ MySpringBootApplication.java │ │ │ │ │ │ │ ├───config │ │ │ │ GridDBConfig.java │ │ │ │ │ │ │ ├───controller │ │ │ │ ChartController.java │ │ │ │ │ │ │ ├───dto │ │ │ │ SonarMetricDTO.java │ │ │ │ │ │ │ └───service │ │ │ ChartService.java │ │ │ MetricsCollectionService.java │ │ │ RestTemplateConfig.java │ │ │ │ │ └───resources │ │ │ application.properties │ │ │ │ │ └───templates │ │ charts.html This structure separates controllers, models, repositories, services, and the application entry point into distinct layers, enhancing modularity and maintainability. Add GridDB Dependency To enable interaction with GridDB in our Spring Boot project, we must include the GridDB Java Client API dependency. This can be accomplished by adding the appropriate configuration to the project build file, such as pom.xml for Maven or the equivalent file for Gradle. Here’s an example of how to configure the dependency in thepom.xml file: <project xmlns=”http://maven.apache.org/POM/4.0.0″ xmlns:xsi=”http://www.w3.org/2001/XMLSchema-instance” xsi:schemaLocation=”http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd”> <modelVersion>4.0.0</modelVersion> <groupId>com.example</groupId> <artifactId>my-griddb-app</artifactId> <version>1.0-SNAPSHOT</version> <name>my-griddb-app</name> <url>http://maven.apache.org</url> <parent> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-parent</artifactId> <version>3.2.4</version> <relativePath /> <!– lookup parent from repository –> </parent> <properties> <maven.compiler.source>17</maven.compiler.source> <maven.compiler.target>17</maven.compiler.target> </properties> <dependencies> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-actuator</artifactId> </dependency> <dependency> <groupId>junit</groupId> <artifactId>junit</artifactId> <version>3.8.1</version> <scope>test</scope> </dependency> <!– GridDB dependencies –> <dependency> <groupId>com.github.griddb</groupId> <artifactId>gridstore</artifactId> <version>5.6.0</version> </dependency> <!– Spring Boot dependencies –> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-web</artifactId> <exclusions> <exclusion> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-logging</artifactId> </exclusion> </exclusions> </dependency> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-test</artifactId> <scope>test</scope> </dependency> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-thymeleaf</artifactId> </dependency> <!– JSON processing –> <dependency> <groupId>com.fasterxml.jackson.core</groupId> <artifactId>jackson-databind</artifactId> <version>2.15.0</version> <!– or the latest version –> </dependency> <!– Lombok –> <dependency> <groupId>org.projectlombok</groupId> <artifactId>lombok</artifactId> <optional>true</optional> </dependency> </dependencies> </project> Configure GridDB Connection After adding the GridDB dependency, the next step is configuring the connection details for our GridDB 
cluster in our Spring Boot application. This is usually configured in the application.properties file, where you can specify various settings for the application. Here’s a quick example of how to set up those connection details: GRIDDB_NOTIFICATION_MEMBER=127.0.0.1:10001 GRIDDB_CLUSTER_NAME=myCluster GRIDDB_USER=admin GRIDDB_PASSWORD=admin management.endpoints.web.exposure.include=* server.port=9090 # [SonarQube](https://www.sonarsource.com/) Configuration sonar.url=http://localhost:9000 sonar.token=squ_06bfcd665f7ca4115d9d230b0fe19f294246c919 sonar.projectKey=Helloworld sonar.metricKeys=coverage,duplicated_lines_density,code_smells,bugs,vulnerabilities,complexity,ncloc,comment_lines,lines,functions,classes,files griddb.cluster.host: The hostname or IP address of ourGridDB cluster. griddb.cluster.port: The port number on which the GridDB cluster is listening. griddb.cluster.user: The username for accessing the GridDB cluster. griddb.cluster.password: The password for the specified GridDB user (replace with ouractual password). server.port=9090: Sets the port on which ourSpring Boot application will run. sonar.projectKey: This property uniquely identifies your project within SonarQube. Create GridDB Client Bean To interact effectively with GridDB in our Spring Boot application,we need to create a dedicated Spring Bean to manage the GridDB connection. This bean will establish the connection using the parameters defined in the application.properties file and will act as the central interface for interacting with the GridDB cluster across the application. Here’s an example of how to define this bean in a Java class named GridDbConfig.java: package mycode.config; import java.util.Properties; import org.springframework.beans.factory.annotation.Value; import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.Configuration; import org.springframework.context.annotation.PropertySource; import com.toshiba.mwcloud.gs.GSException; import com.toshiba.mwcloud.gs.GridStore; import com.toshiba.mwcloud.gs.GridStoreFactory; @Configuration @PropertySource(“classpath:application.properties”) public class GridDBConfig { @Value(“${GRIDDB_NOTIFICATION_MEMBER}”)   private String notificationMember; @Value(“${GRIDDB_CLUSTER_NAME}”)   private String clusterName; @Value(“${GRIDDB_USER}”)   private String user; @Value(“${GRIDDB_PASSWORD}”)   private String password; @Bean   public GridStore gridStore() throws GSException {     // Acquiring a GridStore instance     Properties properties = new Properties();     properties.setProperty(“notificationMember”, notificationMember);     properties.setProperty(“clusterName”, clusterName);     properties.setProperty(“user”, user);     properties.setProperty(“password”, password);     return GridStoreFactory.getInstance().getGridStore(properties); } } Metric Collection In this section, we’ll delve into the technical flow of how SonarQube collects code quality metrics and how these metrics can be transferred and stored in GridDB for analysis. 1. Extracting Metrics from SonarQube SonarQube provides a comprehensive set of code quality metrics that can be accessed via its API. These metrics include: Bugs: Issues in the code that could lead to system malfunctions. Code Smells: Pieces of code that may not be erroneous but could negatively affect readability and maintainability. Vulnerabilities: Security-related issues that could make the application susceptible to exploits. Test Coverage: The percentage of code covered by automated tests. 
Duplications: The extent of redundant code across the project.

The process to retrieve these metrics involves making HTTP requests to specific SonarQube API endpoints. For instance:

GET /api/measures/component?component=<project_key>&metricKeys=<metric_keys>

API Request Example

GET /api/measures/component?component=project_xyz&metricKeys=bugs,coverage

API Response

The API responds with the metrics in JSON format, typically including a list of measures. For example:

{
  "component": {
    "key": "Helloworld",
    "name": "Helloworld",
    "description": "A simple Spring Boot 2.x app to send hello world message to a user",
    "qualifier": "TRK",
    "measures": [
      { "metric": "duplicated_lines_density", "value": "0.0", "bestValue": true },
      { "metric": "classes", "value": "2" },
      { "metric": "functions", "value": "18" },
      { "metric": "lines", "value": "211" },
      { "metric": "coverage", "value": "0.0", "bestValue": false },
      { "metric": "ncloc", "value": "160" },
      { "metric": "files", "value": "3" },
      { "metric": "vulnerabilities", "value": "0", "bestValue": true },
      { "metric": "bugs", "value": "1", "bestValue": false },
      { "metric": "comment_lines", "value": "22" },
      { "metric": "complexity", "value": "29" },
      { "metric": "code_smells", "value": "1", "bestValue": false }
    ]
  }
}

The component refers to the project identifier (e.g., "Helloworld"), and the measures array contains the actual metrics (e.g., bugs and coverage) along with their corresponding values.

2. Storing Data in GridDB

Once the metrics are fetched from SonarQube, they need to be transformed to fit the schema required by GridDB. GridDB is a time-series database, and its data model focuses on storing time-stamped records efficiently. Therefore, the SonarQube metrics need to be transformed into time-series data that includes the following fields:

Timestamp: The time at which the metric was collected (e.g., 2024-11-17T12:00:00Z).
Metric Name: The type of metric (e.g., bugs, coverage).
Metric Value: The value of the metric (e.g., 5, 80.5).
Component: The identifier of the project (e.g., project_xyz).

A typical transformation flow would include:

Parsing the API response into individual metric records.
Creating SonarMetricDTO objects to hold the data in the appropriate format, aligning with the GridDB schema.

package mycode.dto;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import java.util.Date;
import com.toshiba.mwcloud.gs.RowKey;

@Data
@NoArgsConstructor
@AllArgsConstructor
public class SonarMetricDTO {
    @RowKey
    public Date timestamp;       // Time of the activity
    private String metricName;   // e.g., coverage, bugs
    private String metricValue;  // e.g., 80.5, 2
    private String component;    // Project key (optional, for identification)
}

Here, the timestamp is set to the current date and time, and the metric name and value are extracted from the SonarQube API response.

Data Injection

Finally, the transformed data will be loaded into GridDB, where it can be accessed for real-time monitoring and analysis. We will insert the data into our database as follows:

TimeSeries<SonarMetricDTO> ts = store.putTimeSeries("sonarMetrics", SonarMetricDTO.class);
for (SonarMetricDTO metric : metrics) {
    ts.append(metric);
}

Below is the complete `MetricsCollectionService.java`, which implements all the aforementioned steps.

Data Representation in GridDB

Once the data is stored, GridDB organizes it into time-series entries.
The structure is typically as follows: Timestamp            Metric Name Metric Value Component 2024-11-17T12:00:00Z bugs         5             project_xyz 2024-11-17T12:00:00Z coverage    80.5          project_xyz This enables efficient querying of time-series data, supporting real-time analysis and historical tracking of code quality metrics. Here is the complete code for MetricsCollectionService package mycode.service; import java.util.ArrayList; import java.util.Base64; import java.util.Date; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.beans.factory.annotation.Value; import org.springframework.http.HttpEntity; import org.springframework.http.HttpHeaders; import org.springframework.http.HttpMethod; import org.springframework.http.ResponseEntity; import org.springframework.scheduling.annotation.Scheduled; import org.springframework.stereotype.Service; import org.springframework.web.client.RestTemplate; import com.fasterxml.jackson.core.JsonProcessingException; import com.fasterxml.jackson.databind.JsonMappingException; import com.fasterxml.jackson.databind.JsonNode; import com.fasterxml.jackson.databind.ObjectMapper; import com.toshiba.mwcloud.gs.*; import mycode.dto.SonarMetricDTO; import java.text.ParseException; import java.util.List; @Service public class MetricsCollectionService { @Value(“${sonar.url}”) private String sonarUrl; // e.g., http://localhost:9000 @Value(“${sonar.token}”) private String sonarToken; // [SonarQube](https://www.sonarsource.com/) authentication token @Value(“${sonar.projectKey}”) private String defaultProjectKey; // Default project key from properties @Value(“#{‘${sonar.metricKeys}’.split(‘,’)}”) private List defaultMetricKeys; // Default metric keys as a list @Autowired GridStore store; @Scheduled(fixedRate = 5000) /** * Fetches metrics from [SonarQube](https://www.sonarsource.com/), converts them to DTOs, and saves them. 
*/ public void collectMetrics() throws GSException, JsonMappingException, JsonProcessingException, ParseException { // Step 1: Authenticate and Fetch Data from [SonarQube](https://www.sonarsource.com/) // Step 1: Authenticate and Fetch Data from [SonarQube](https://www.sonarsource.com/) List metrics = fetchMetricsFromSonar(defaultProjectKey, defaultMetricKeys); // Step 2: Save the metrics into GridDB TimeSeries ts = store.putTimeSeries(“sonarMetrics”, SonarMetricDTO.class); for (SonarMetricDTO metric : metrics) { ts.append(metric); } } private List fetchMetricsFromSonar(String projectKey, List metricKeys) { String metricsQuery = String.join(“,”, metricKeys); // Create CSV of metric keys String apiUrl = sonarUrl + “/api/measures/component?component=” + projectKey + “&metricKeys=” + metricsQuery; RestTemplate restTemplate = new RestTemplate(); HttpHeaders headers = new HttpHeaders(); headers.set(“Authorization”, “Basic ” + Base64.getEncoder().encodeToString((sonarToken + “:”).getBytes())); HttpEntity entity = new HttpEntity(headers); ResponseEntity response = restTemplate.exchange(apiUrl, HttpMethod.GET, entity, String.class); // Step 1.1: Parse the JSON Response return parseSonarResponse(response.getBody(), projectKey); } private List parseSonarResponse(String responseBody, String projectKey) { List metrics = new ArrayList(); try { // Parse the JSON response using Jackson ObjectMapper objectMapper = new ObjectMapper(); JsonNode rootNode = objectMapper.readTree(responseBody); // Access the “measures” array within the response JsonNode measuresNode = rootNode.path(“component”).path(“measures”); // Iterate through the “measures” array and extract the metrics for (JsonNode measure : measuresNode) { String metricName = measure.path(“metric”).asText(); String value = measure.path(“value”).asText(); // Create and add the metric to the list metrics.add(createMetric(metricName, value, projectKey)); } } catch (Exception e) { e.printStackTrace(); // Handle the exception (e.g., log it) } return metrics; } /** * Helper to create a DTO instance. */ private SonarMetricDTO createMetric(String metricName, String value, String projectKey) { SonarMetricDTO dto = new SonarMetricDTO(); dto.setMetricName(metricName); dto.setMetricValue(value + ((int) (Math.random() * 9) + 1)); dto.setComponent(projectKey); dto.setTimestamp(new Date()); // Use current timestamp return dto; } } By following above steps, we can effectively extract data from SonarQube, load it into GridDB. Data Querying in GridDB and Visualization with Thymeleaf Once the data is stored and available in GridDB, the next step is to visualize this data in a way that provides actionable insights. In this section, we’ll explore how to build a dashboard using Spring Boot, Thymeleaf, and Chart.js to render charts that display commits and pull request trends over time. Here are the steps to achieve this: Building the Chart Controller The ChartController acts as the intermediary between backend data in GridDB and the frontend visualizations displayed on the dashboard. Its responsibilities include handling HTTP requests, interacting with the service layer to fetch data, and passing that data to Thymeleaf templates for rendering. 
Here's how the ChartController is implemented: package mycode.controller; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.stereotype.Controller; import org.springframework.ui.Model; import org.springframework.web.bind.annotation.GetMapping; import com.fasterxml.jackson.databind.ObjectMapper; import mycode.service.ChartService; import mycode.dto.SonarMetricDTO; import java.text.SimpleDateFormat; import java.util.HashMap; import java.util.List; import java.util.Map; import java.util.stream.Collectors; @Controller public class ChartController { @Autowired ChartService chartService; @Autowired private ObjectMapper objectMapper; @GetMapping("/charts") public String showCharts(Model model) { try { List<SonarMetricDTO> events = chartService.getVcsEvents(); Map<String, Integer> data1 = prepareClassesMetrics(events); Map<String, Integer> data2 = prepareVulnerabilitiesMetrics(events); // Convert Maps to JSON Strings for use in JavaScript in the Thymeleaf template String classesDataJson = objectMapper.writeValueAsString(data1); String vulnerabilityDataJson = objectMapper.writeValueAsString(data2); model.addAttribute("classesDataJson", classesDataJson); model.addAttribute("vulnerabilityDataJson", vulnerabilityDataJson); } catch (Exception e) { e.printStackTrace(); } return "charts"; } private Map<String, Integer> prepareClassesMetrics(List<SonarMetricDTO> events) { Map<String, Integer> classesMap = new HashMap<>(); SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"); for (SonarMetricDTO event : events) { if ("classes".equals(event.getMetricName())) { String timestamp = dateFormat.format(event.getTimestamp()); classesMap.put(timestamp, Integer.valueOf(event.getMetricValue())); } } return classesMap; } private Map<String, Integer> prepareVulnerabilitiesMetrics(List<SonarMetricDTO> events) { Map<String, Integer> vulnerabilityMap = new HashMap<>(); SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"); for (SonarMetricDTO event : events) { if ("vulnerabilities".equals(event.getMetricName())) { String timestamp = dateFormat.format(event.getTimestamp()); vulnerabilityMap.put(timestamp, Integer.valueOf(event.getMetricValue())); } } return vulnerabilityMap; } } Implementing the Chart Service The ChartService acts as the business logic layer, encapsulating the operations needed to query GridDB and process the results. The ChartService class retrieves Sonar code quality metrics from a GridStore database container named "sonarMetrics". The service processes each row of the result, mapping fields like Timestamp, MetricName, MetricValue, and Component into a SonarMetricDTO object. Finally, it returns a list of these DTOs representing the Sonar code quality metrics. Here's how the ChartService is implemented: package mycode.service; import java.text.SimpleDateFormat; import java.util.ArrayList; import java.util.Date; import java.util.List; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.stereotype.Service; import com.toshiba.mwcloud.gs.Container; import com.toshiba.mwcloud.gs.GridStore; import com.toshiba.mwcloud.gs.Query; import com.toshiba.mwcloud.gs.Row; import com.toshiba.mwcloud.gs.RowSet; import mycode.dto.SonarMetricDTO; @Service public class ChartService { @Autowired GridStore store; public List<SonarMetricDTO> getVcsEvents() throws Exception {

More
Introducing the GridDB Cloud CLI Tool

We have already written a Quick Start Guide on how to use GridDB Cloud. And though we believe it's simple enough to get started using GridDB Cloud's WebAPI, we wanted to make some of the simple commands usable from the CLI without having to make cURL requests which include your authentication headers in every command. Enter the GridDB Cloud CLI tool: GitHub The GridDB Cloud CLI Tool aims to make managing your GridDB Cloud database a little easier from the comfort of your own terminal! Tasks like querying, pushing data, and creating containers can now all be accomplished in your terminal with the help of this tool. In this article, we will walk through how to install and use this tool and show some examples of what you can accomplish with it. Getting Started (Download & Configuration) The CLI Tool is distributed via GitHub as a single binary file. In the releases section, you can download the appropriate version for your machine. Once downloaded, you can place it in a directory on your PATH and use it from anywhere, or you can simply run the binary from the directory where it is located (i.e. ./griddb-cloud-cli help). The tool is written in Go, so you could also clone the repo and build your own binary: $ go get $ go build Configuration This tool expects a .yaml file to exist in $HOME/.griddb.yaml with the following fields: cloud_url: "url" cloud_username: "example" cloud_pass: "pass" Alternatively, you can save the file elsewhere and include the --config flag when running the tool (i.e. griddb-cloud-cli --config /opt/configs/griddb.yaml checkConnection). You will also still need to whitelist your IP Address in the GridDB Cloud Portal; unfortunately, this is not something that is achievable through the CLI Tool at this time. Features & Commands This tool was written with the help of the ever-popular Cobra Library. Because of this, we are able to use the --help flag for all the commands in case you forget the functionality of some of the commands and their flags. $ griddb-cloud-cli help A series of commands to help you manage your cloud-based DB. Standouts include creating a container and graphing one using 'read graph' and 'create' respectfully Usage: griddb-cloud-cli [command] Available Commands: checkConnection Test your Connection with GridDB Cloud completion Generate the autocompletion script for the specified shell create Interactive walkthrough to create a container delete Test your Connection with GridDB Cloud help Help about any command ingest Ingest a `csv` file to a new or existing container list Get a list of all of the containers put Interactive walkthrough to push a row read Query container with TQL show get container info sql Run a sql command Flags: --config string config file (default is $HOME/.griddb.yaml) -h, --help help for griddb-cloud-cli So with that out of the way, let's begin with the commands. All GridDB CLI Tool Commands On your first time around, you should run the checkConnection command as a sanity check to ensure that you can connect to your instance. The tool will tell you if you have improper auth or if you're blocked by the firewall: Check Connection $ griddb-cloud-cli checkConnection [10005:TXN_AUTH_FAILED] (address=172.25.23.68:10001, partitionId=0) 2025/04/30 08:32:33 Authentication Error. Please check your username and password in your config file $ griddb-cloud-cli checkConnection 2025/04/30 08:33:48 (403) IP Connection Error. Is this IP Address Whitelisted?
Please consider whitelisting Ip Address: X.X.X.116 $ griddb-cloud-cli checkConnection 2025/04/30 08:35:20 Please set a config file with the --config flag or set one in the default location $HOME/.griddb.yaml And if everything is set up correctly: $ griddb-cloud-cli checkConnection 200 OK List Containers You can list all containers inside of your Cloud DB instance: $ griddb-cloud-cli list 0: actual_reading_1 1: actual_reading_10 2: boiler_control_10 3: device1 4: device2 5: device3 6: device4 7: device6 Show Container You can display the schema and other info about an individual container: $ griddb-cloud-cli show device2 { "container_name": "device2", "container_type": "TIME_SERIES", "rowkey": true, "columns": [ { "name": "ts", "type": "TIMESTAMP", "timePrecision": "MILLISECOND", "index": [] }, { "name": "device", "type": "STRING", "index": [] }, { "name": "co", "type": "DOUBLE", "index": [] }, { "name": "humidity", "type": "FLOAT", "index": [] }, { "name": "light", "type": "BOOL", "index": [] }, { "name": "lpg", "type": "DOUBLE", "index": [] }, { "name": "motion", "type": "BOOL", "index": [] }, { "name": "smoke", "type": "DOUBLE", "index": [] }, { "name": "temperature", "type": "DOUBLE", "index": [] } ] } Querying/Reading a Container You can run TQL or SQL queries on your containers. TQL is the simpler option: $ griddb-cloud-cli read device2 --limit 1 --pretty [ { "name": "device2", "stmt": "select * limit 1", "columns": null, "hasPartialExecution": true }] [ [ { "Name": "ts", "Type": "TIMESTAMP", "Value": "2006-01-02T07:04:05.700Z" }, { "Name": "device", "Type": "STRING", "Value": "b8:27:eb:bf:9d:51" }, { "Name": "co", "Type": "DOUBLE", "Value": 0.004955938648391245 }, { "Name": "humidity", "Type": "FLOAT", "Value": 51 }, { "Name": "light", "Type": "BOOL", "Value": false }, { "Name": "lpg", "Type": "DOUBLE", "Value": 0.00765082227055719 }, { "Name": "motion", "Type": "BOOL", "Value": false }, { "Name": "smoke", "Type": "DOUBLE", "Value": 0.02041127012241292 }, { "Name": "temperature", "Type": "DOUBLE", "Value": 22.7 } ] ] The read command runs a simple TQL query against your container, for which you can specify the following: an offset (--offset), a limit (-l, --limit), pretty print (-p, --pretty), just rows (--rows), which columns you want to see (--columns), or just the raw object delivered from GridDB Cloud (--raw). Normally when you query a container with GridDB Cloud, it will send your results as two arrays, one with your column info and another with arrays of just row data. You can get this with --raw, but the default is to combine the two into a JSON structure and print that. If you use --pretty like above, it will indent and space it out for you.
Just printing rows is better if you querying lots of rows: $ griddb-cloud-cli read device1 –limit 25 –rows [ { “name”: “device1”, “stmt”: “select * limit 25”, “columns”: null, “hasPartialExecution”: true }] ts,co,humidity,light,lpg,motion,smoke,temp, [2020-07-12T01:00:25.984Z 0.0041795988 77.5999984741 true 0.006763671 false 0.0178934842 26.8999996185] [2020-07-12T01:00:53.485Z 0.0048128545 53.5 false 0.0074903843 false 0.0199543908 21.7] [2020-07-12T01:01:35.020Z 0.0030488793 74.9000015259 true 0.0053836916 false 0.014022829 19.5] [2020-07-12T01:01:52.751Z 0.0049817187 51.3 false 0.0076795919 false 0.020493267 22.4] [2020-07-12T01:01:59.191Z 0.003937408 72.9000015259 true 0.006477819 false 0.0170868731 24.7999992371] [2020-07-12T01:02:01.157Z 0.0050077601 51.1 false 0.0077086115 false 0.0205759974 22.4] [2020-07-12T01:02:01.445Z 0.0030841269 74.8000030518 true 0.0054286446 false 0.0141479363 19.6000003815] [2020-07-12T01:02:04.938Z 0.0048169262 53.5 false 0.0074949679 false 0.0199674343 21.7] [2020-07-12T01:02:05.182Z 0.0025840714 75.5999984741 false 0.0047765452 false 0.0123403139 19.6000003815] [2020-07-12T01:02:12.428Z 0.0030488793 74.9000015259 true 0.0053836916 false 0.014022829 19.6000003815] [2020-07-12T01:02:16.506Z 0.0048277855 53.5 false 0.0075071874 false 0.0200022097 21.7] [2020-07-12T01:02:19.376Z 0.0030401715 74.9000015259 true 0.005372564 false 0.0139918711 19.6000003815] [2020-07-12T01:02:21.754Z 0.0041428371 77.5999984741 true 0.0067205832 false 0.0177717486 26.8999996185] [2020-07-12T01:02:29.017Z 0.0048400659 53.5 false 0.0075209965 false 0.0200415141 21.7] [2020-07-12T01:02:33.443Z 0.0042300404 77.5999984741 true 0.0068226226 false 0.0180601254 26.7999992371] [2020-07-12T01:02:35.686Z 0.00255591 75.5999984741 false 0.0047388314 false 0.0122362642 19.6000003815] [2020-07-12T01:02:41.697Z 0.0030488793 75 true 0.0053836916 false 0.014022829 19.6000003815] [2020-07-12T01:03:03.206Z 0.0042019019 77.5999984741 true 0.006789761 false 0.0179672218 26.7999992371] [2020-07-12T01:03:04.701Z 0.0049946711 51.3 false 0.0076940309 false 0.0205344276 22.5] [2020-07-12T01:03:04.768Z 0.0040601528 72.6999969482 true 0.0066232815 false 0.0174970393 24.7999992371] [2020-07-12T01:03:05.999Z 0.0040886168 77.5 true 0.0066568388 false 0.0175917499 26.7999992371] [2020-07-12T01:03:08.403Z 0.0048101357 53.7 false 0.0074873232 false 0.0199456799 21.8] [2020-07-12T01:03:08.942Z 0.0049860142 51.1 false 0.0076843815 false 0.02050692 22.4] [2020-07-12T01:03:10.023Z 0.0048141805 53.5 false 0.0074918772 false 0.0199586389 21.7] [2020-07-12T01:03:12.863Z 0.0050019251 51.1 false 0.0077021129 false 0.020557469 22.3] Querying Number Data into an ASCII Line Chart Using a subcommand of read, you can also run a TQL query and read the results into a graph. 
For example: $ griddb-cloud-cli read graph device1 -l 10 --columns temp,humidity [ { "name": "device1", "stmt": "select * limit 10", "columns": ["temp","humidity"], "hasPartialExecution": true }] [ASCII line chart of the temp and humidity values, rendered in the terminal] Col names from container device1 ■ temp ■ humidity The results are color-coded so that you can accurately see which cols are mapped to which values. It also automatically omits non-number types if you just want to read the entire container as a line chart: $ griddb-cloud-cli read graph device1 -l 5 Column ts (of type TIMESTAMP ) is not a `number` type. Omitting Column light (of type BOOL ) is not a `number` type. Omitting Column motion (of type BOOL ) is not a `number` type. Omitting [ASCII line chart of the remaining numeric columns] Col names from container device1 ■ co ■ humidity ■ lpg ■ smoke ■ temp Creating Containers You can create containers using an interactive question prompt in the CLI. It will ask for container name, container type, rowkey, and column names and types. For example, let's create a new time series container with two columns: $ griddb-cloud-cli create ✔ Container Name: … sample1 ✔ Choose: … TIME_SERIES ✔ How Many Columns for this Container? … 2 ✔ Col name For col #1 … ts ✔ Col #1(TIMESTAMP CONTAINERS ARE LOCKED TO TIMESTAMP FOR THEIR ROWKEY) … TIMESTAMP ✔ Col name For col #2 … temp ✔ Column Type for col #2 … DOUBLE ✔ Make Container? { "container_name": "sample1", "container_type": "TIME_SERIES", "rowkey": true, "columns": [ { "name": "ts", "type": "TIMESTAMP", "index": null }, { "name": "temp", "type": "DOUBLE", "index": null } ] } … YES {"container_name":"sample1","container_type":"TIME_SERIES","rowkey":true,"columns":[{"name":"ts","type":"TIMESTAMP","index":null},{"name":"temp","type":"DOUBLE","index":null}]} 201 Created If you can't easily follow along with the prompt here, please just download the tool and try it for yourself! And note, as explained in the prompts, if you choose to create a TIME_SERIES container, the rowkey is auto-set to true and the first column must have a type of TIMESTAMP. Collection containers have different rules.
Putting Rows to Containers Similarly, you can follow along with the prompt to push data into your container, one row at a time. Here we will push to our new container sample1 and use NOW() as our current timestamp: $ griddb-cloud-cli put sample1 Container Name: sample1 ✔ Column 1 of 2 Column Name: ts Column Type: TIMESTAMP … NOW() ✔ Column 2 of 2 Column Name: temp Column Type: DOUBLE … 20.2 [["2025-04-30T07:43:03.700Z", 20.2]] ✔ Add the Following to container sample1? … YES 200 OK Ingesting CSV Data You can also ingest full CSV files with this tool. It too uses an interactive prompt, as there is information that needs to be set for each column, such as its index position in the CSV and its data type. Once you set those, it will ingest the data in chunks of 1,000 rows. $ griddb-cloud-cli ingest iot_telemetry_data.csv ✔ Does this container already exist? … NO Use CSV Header names as your GridDB Container Col names? ts,device,co,humidity,light,lpg,motion,smoke,temp ✔ Y/n … YES ✔ Container Name: … device6 ✔ Choose: … TIME_SERIES ✔ Col ts(TIMESTAMP CONTAINERS ARE LOCKED TO TIMESTAMP FOR THEIR ROWKEY) … TIMESTAMP ✔ (device) Column Type … STRING ✔ (co) Column Type … DOUBLE ✔ (humidity) Column Type … DOUBLE ✔ (light) Column Type … BOOL ✔ (lpg) Column Type … DOUBLE ✔ (motion) Column Type … BOOL ✔ (smoke) Column Type … DOUBLE ✔ (temp) Column Type … DOUBLE ✔ Make Container? { "container_name": "device6", "container_type": "TIME_SERIES", "rowkey": true, "columns": [ { "name": "ts", "type": "TIMESTAMP", "index": null }, { "name": "device", "type": "STRING", "index": null }, { "name": "co", "type": "DOUBLE", "index": null }, { "name": "humidity", "type": "DOUBLE", "index": null }, { "name": "light", "type": "BOOL", "index": null }, { "name": "lpg", "type": "DOUBLE", "index": null }, { "name": "motion", "type": "BOOL", "index": null }, { "name": "smoke", "type": "DOUBLE", "index": null }, { "name": "temp", "type": "DOUBLE", "index": null } ] } … YES {"container_name":"device6","container_type":"TIME_SERIES","rowkey":true,"columns":[{"name":"ts","type":"TIMESTAMP","index":null},{"name":"device","type":"STRING","index":null},{"name":"co","type":"DOUBLE","index":null},{"name":"humidity","type":"DOUBLE","index":null},{"name":"light","type":"BOOL","index":null},{"name":"lpg","type":"DOUBLE","index":null},{"name":"motion","type":"BOOL","index":null},{"name":"smoke","type":"DOUBLE","index":null},{"name":"temp","type":"DOUBLE","index":null}]} 201 Created Container Created. Starting Ingest 0 ts ts 1 device device 2 co co 3 humidity humidity 4 light light 5 lpg lpg 6 motion motion 7 smoke smoke 8 temp temp ✔ Is the above mapping correct? … YES Ingesting. Please wait… Inserting 1000 rows 200 OK Inserting 1000 rows 200 OK Inserting 1000 rows Notice here, in this example, it asks if the container exists in your DB yet. If you select NO, it will create the container for you as shown above. But if you select YES, it will allow you to pick the container from your list and then map the proper indices, and then ingest that way — handy! SQL Commands Sometimes you will need to use SQL because of its flexibility and its ability to use and manipulate partitioned tables. There are three subcommands you can use which follow the sql command: create, update, query. Let's walk through each one (and yes, they are exactly what they sound like). As a note, you will need to include the -s flag with every command (it stands for string and represents the raw SQL string).
First, let's create a new partitioned table: griddb-cloud-cli sql create -s $ griddb-cloud-cli sql create -s "CREATE TABLE IF NOT EXISTS pyIntPart1 (date TIMESTAMP NOT NULL PRIMARY KEY, value STRING) WITH (expiration_type='PARTITION',expiration_time=10,expiration_time_unit='DAY') PARTITION BY RANGE (date) EVERY (5, DAY);" [{"stmt": "CREATE TABLE IF NOT EXISTS pyIntPart1 (date TIMESTAMP NOT NULL PRIMARY KEY, value STRING) WITH (expiration_type='PARTITION',expiration_time=10,expiration_time_unit='DAY') PARTITION BY RANGE (date) EVERY (5, DAY);" }] Now that we have our table, let's push some data into it: griddb-cloud-cli sql update -s $ griddb-cloud-cli sql update -s "INSERT INTO pyIntPart2(date, value) VALUES (NOW(), 'fourth')" [{"stmt": "INSERT INTO pyIntPart2(date, value) VALUES (NOW(), 'fourth')" }] [{"updatedRows":1,"status":1,"message":null,"stmt":"INSERT INTO pyIntPart2(date, value) VALUES (NOW(), 'fourth')"}] And then read from it: griddb-cloud-cli sql query -s $ griddb-cloud-cli sql query -s "select * from pyIntPart2 limit 1" --pretty [{"stmt": "select * from pyIntPart2 limit 1" }] [ [ { "Name": "date", "Type": "TIMESTAMP", "Value": "2025-04-30T14:58:00.255Z" }, { "Name": "value", "Type": "STRING", "Value": "fourth" } ] ] And as explained above, the read command uses TQL under the hood, which does not have access to partitioned tables, so your use of read will fail on this particular table: $ griddb-cloud-cli read pyIntPart2 2025/04/30 09:09:41 400 Error: [151001:TQ_SYNTAX_ERROR_EXECUTION] Partial/Distribute TQL does not support order by and selection expression except for '*' (address=172.25.23.69:10001, partitionId=27) (containerName=pyIntPart2) Conclusion We hope that the GridDB Cloud CLI tool will be helpful and we hope this article showcased its strengths adequately! And of course, because this tool is completely open source, we encourage users to tinker and expand the current suite of available features. Some of the stuff that may be coming from us: JSON-based table creation, pushing rows without interactive mode, & many

More
Building a One-Time Token Login System with Spring Security

Imagine trying to keep track of all your passwords. It's a daunting task, isn't it? With passwords required for social media, online shopping, e-wallet apps, and various computer tools, it's easy to lose count. Forgetting a password can be frustrating, and having to create a new one every time can be a hassle. Fortunately, there's a better and safer way to log in to websites: One-Time Tokens. Unlike traditional passwords, which are used repeatedly, One-Time Tokens provide a unique code that can only be used once. This token is designed to be used within a short timeframe, and once it's used or expires, it becomes invalid. It's like a secure, self-destructing message. So, why are One-Time Tokens a better option than traditional passwords? Here are a few key benefits: Enhanced Security: By reducing our reliance on passwords, we minimize the risk of password-related vulnerabilities. This means no more weak passwords, no more password reuse across different sites, and even if someone intercepts a token, it's likely to be expired and useless. Improved User Experience: Let's face it, remembering passwords can be a pain. One-Time Tokens simplify the process, allowing users to click a link or enter a short code, making login a smoother experience. Fewer Password Resets: With One-Time Tokens, the need for password resets decreases significantly. Since users don't have to constantly remember and re-enter passwords, there's less to forget in the first place. Let's dive into the world of One-Time Token login with Spring Security! We will explore what they are and how we can actually implement them. One-Time Token Login: The Flow You navigate to the login page and enter your email address to start the process. The application generates a unique, one-time token. This token is a long, random string that's impossible to guess. The system sends this token to you via email, SMS, or WhatsApp. You receive the token and enter it on the login page. Often, you'll just click a magic link in the message that contains the token. The web application receives the token and checks: Is the token valid? (Does it match one we generated?) Has it expired? (One-time tokens have a short lifespan for security) Has it already been used? (Remember, it's one-time use only!) If everything checks out, you're logged in. The web application establishes a session for you. 💻 Let's Build This Thing! Maven To enable the One-Time Token feature, we need to include the spring-boot-starter-security and spring-boot-starter-web dependencies.
<dependencies> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-web</artifactId> </dependency> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-security</artifactId> </dependency> </dependencies> Spring Security Configuration Next, we configure the Spring Security and enable the form-login and One-Time Token Login: @Configuration @EnableWebSecurity public class SecurityConfig { @Bean public SecurityFilterChain securityFilterChain( HttpSecurity http, SendLinkOneTimeTokenGenerationSuccessHandler successHandler, CustomOneTimeTokenService customOneTimeTokenService) throws Exception { AuthenticationSuccessHandler ottLoginsuccessHandler = (request, response, authentication) -> response.sendRedirect(“/”); http.csrf(Customizer.withDefaults()) .authorizeHttpRequests( authorize -> authorize .requestMatchers(“/error”, “/”, “/images/**”, “/js/*.js”, “/css/*.css”) .permitAll() .requestMatchers(new AntPathRequestMatcher(“/authentication/login”)) .permitAll() .requestMatchers(new AntPathRequestMatcher(“/logout”)) .permitAll() .requestMatchers(new AntPathRequestMatcher(“/webjars/**”)) .permitAll() .requestMatchers(new AntPathRequestMatcher(“/ott/sent”)) .permitAll() .requestMatchers(new AntPathRequestMatcher(“/ott/submit”)) .permitAll() .anyRequest() .authenticated()) .formLogin( form -> form.loginPage(“/authentication/login”) .loginProcessingUrl(“/login”) .failureUrl(“/authentication/login?failed”) .defaultSuccessUrl(“/”) .permitAll()) .headers( httpSecurityHeaders -> httpSecurityHeaders.frameOptions(FrameOptionsConfig::disable)) .logout(Customizer.withDefaults()) .oneTimeTokenLogin( configurer -> configurer .tokenGenerationSuccessHandler(successHandler) .tokenService(customOneTimeTokenService) .showDefaultSubmitPage(false) .authenticationSuccessHandler(ottLoginsuccessHandler)); return http.build(); } } @EnableWebSecurity annotation: enable Spring Security’s web security support and provide the Spring MVC integration. SecurityFilterChain bean to add custom filter in Spring Security Context. Configures authorizeHttpRequests defines which URL path should be secured and which should not. Configures formLogin() to customize the form based authentication. Configures loginPage() for redirecting to /authentication/login if authentication is required. Configures loginProcessingUrl to validate the submitted credentials. failureUrl specify the URL to send users if authentication fails. Configures headers(). We enable all the default headers except the X-Frame-Options headers. .logout(Customizer.withDefaults()) provides logout support using default settings. The default is that accessing the URL “/logout” will log the user out by invalidating the HTTP Session, cleaning up any rememberMe authentication that was configured, clearing the SecurityContextHolder, and then redirect to /login?success. Enable One-Time Token Login support and customize it with oneTimeTokenLogin() method : – tokenGenerationSuccessHandler : Specifies strategy to be used to handle generated one-time tokens. We will create a custom handler that implements OneTimeTokenGenerationSuccessHandler. – tokenService : Configures the OneTimeTokenService used to generate and consume the OneTimeToken. – showDefaultSubmitPage(false) : disable the default One-Time Token submit page. – authenticationSuccessHandler : Specifies the AuthenticationSuccessHandler strategy used to handle a successful user authentication. For demo, we redirect user to the home page. 
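The custom handler shown in the next section delegates the actual token delivery to an OttEmailService, which is not included in this excerpt. A minimal sketch of such a service, assuming spring-boot-starter-mail is on the classpath and that the one-time token's username is the user's email address, could look like this:

package mycode.service;

import org.springframework.mail.SimpleMailMessage;
import org.springframework.mail.javamail.JavaMailSender;
import org.springframework.stereotype.Service;

// Hypothetical sketch: emails the magic link to the user.
// The real implementation is not part of this excerpt.
@Service
public class OttEmailService {

    private final JavaMailSender mailSender;

    public OttEmailService(JavaMailSender mailSender) {
        this.mailSender = mailSender;
    }

    public void sendEmail(String to, String magicLink) {
        SimpleMailMessage message = new SimpleMailMessage();
        message.setTo(to);
        message.setSubject("Your one-time login link");
        message.setText("Click the link below to sign in. It expires shortly and can only be used once.\n" + magicLink);
        mailSender.send(message);
    }
}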
Custom OneTimeTokenGenerationSuccessHandler Next, we need to implement a custom OneTimeTokenGenerationSuccessHandler to deliver the token to the end user. @Component public class SendLinkOneTimeTokenGenerationSuccessHandler implements OneTimeTokenGenerationSuccessHandler { private final OttEmailService emailService; private final FlashMapManager flashMapManager = new SessionFlashMapManager(); public SendLinkOneTimeTokenGenerationSuccessHandler(OttEmailService emailService) { this.emailService = emailService; } @Override @SneakyThrows public void handle( HttpServletRequest request, HttpServletResponse response, OneTimeToken oneTimeToken) throws IOException, ServletException { UriComponentsBuilder builder = UriComponentsBuilder.fromUriString(UrlUtils.buildFullRequestUrl(request)) .replacePath(request.getContextPath()) .replaceQuery(null) .fragment(null) .path(“/ott/submit”) .queryParam(“token”, oneTimeToken.getTokenValue()); String link = builder.toUriString(); CompletableFuture.runAsync(() -> emailService.sendEmail(oneTimeToken.getUsername(), link)); RedirectView redirectView = new RedirectView(“/ott/sent”); redirectView.setExposeModelAttributes(false); FlashMap flashMap = new FlashMap(); flashMap.put(“token”, oneTimeToken.getTokenValue()); flashMap.put(“ottSubmitUrl”, link); flashMapManager.saveOutputFlashMap(flashMap, request, response); redirectView.render(flashMap, request, response); } } This component will do a number of things: Generate the magic link containing the one-time token. Call emailService.sendEmail() to send the email to the user with magic link. For demo, we redirect user to the /ott/sent page. Custom Success Page Create a controller and an HTML template to handle this page. For demo purpose, we forward the token to the custom submit page. @Controller @RequestMapping(“/ott”) public class OttController { @GetMapping(“/sent”) public String sent(Model model) { return “ott/sent”; } @GetMapping(“/submit”) public String submit(Model model, @RequestParam(“token”) String token) { model.addAttribute(“token”, token); return “ott/submit”; } } sent.html <html xmlns:th=”http://www.thymeleaf.org” xmlns:layout=”http://www.ultraq.net.nz/thymeleaf/layout” layout:decorate=”~{layout}”> <head> <title>OTT Sent</title> </head> <body> <div layout:fragment=”content”> <p>We just sent you an email. Please follow the provided link to log in.</p> <p>For testing here is the <a th:href=”${ottSubmitUrl}”>submit link</a></p> </div> </body> </html> submit.html <body> <div layout:fragment=”content”> <div class=”d-flex flex-wrap mb-4″> <h1 class=”flex-grow-1″>Login OTT</h1> </div> <form th:action=”@{/login/ott}” method=”post”> <div class=”row mb-3″> <label for=”token” class=”form-check-label”>Token</label> <input type=”text” id=”token” name=”token” th:value=”${token}” placeholder=”Token” required=”true” autofocus=”autofocus” class=”form-control”/> </div> <button class=”btn btn-primary” type=”submit”>Sign in</button> </form> </div> </body> Custom OneTimeTokenService Next, create a custom implementation of OneTimeTokenService interface. By customizing it, we can have a custom expire time, adding more info into token, and implement custom token value. 
@Service public class CustomOneTimeTokenService implements OneTimeTokenService { private final Map<String, OneTimeToken> oneTimeTokens = new ConcurrentHashMap<>(); private Clock clock = Clock.systemUTC(); @Override @NonNull public OneTimeToken generate(GenerateOneTimeTokenRequest request) { String token = UUID.randomUUID().toString(); Instant expiresAt = this.clock.instant().plus(5, ChronoUnit.MINUTES); OneTimeToken oneTimeToken = new DefaultOneTimeToken(token, request.getUsername(), expiresAt); oneTimeTokens.put(token, oneTimeToken); return oneTimeToken; } @Override @Nullable public OneTimeToken consume(OneTimeTokenAuthenticationToken authenticationToken) { log.info(“Consume token: {}”, authenticationToken.getTokenValue()); OneTimeToken oneTimeToken = oneTimeTokens.remove(authenticationToken.getTokenValue()); if (oneTimeToken == null || isExpired(oneTimeToken)) { return null; } return oneTimeToken; } private boolean isExpired(OneTimeToken oneTimeToken) { return this.clock.instant().isAfter(oneTimeToken.getExpiresAt()); } } CustomOneTimeTokenService class responsible for: Generating the token and storing it. For demo, we use in memory store. Consuming the token, we validate the token expiration and delete the token from the store. Retrieving Users Create custom UserDetailsService implementation to load user-specific data. @Service @AllArgsConstructor public class CustomUserDetailService implements UserDetailsService { private final UsersContainerClient usersContainerClient; @Override public UserDetails loadUserByUsername(String email) { UserRecord user = usersContainerClient.getUserByEmail(email); if (user == null) { throw new UsernameNotFoundException(“User not found”); } List<SimpleGrantedAuthority> authorities = new java.util.ArrayList<>(); authorities.add(new SimpleGrantedAuthority(“ROLE_USER”)); if (“admin@example.com”.equals(user.email())) { authorities.add(new SimpleGrantedAuthority(“ROLE_ADMIN”)); } return new org.springframework.security.core.userdetails.User( user.email(), user.password(), authorities); } } The user object is stored and fetched using the UsersContainerClient class, which will handle the interaction with GridDB Cloud. 
@Service public class UsersContainerClient { private static final String CONTAINER_NAME = “Users”; private final RestClient restClient; private final String baseUrl; public UsersContainerClient( @Value(“${griddb.base-url}”) String baseUrl, @Value(“${griddb.auth-token}”) String authToken) { this.baseUrl = baseUrl; this.restClient = RestClient.builder() .baseUrl(this.baseUrl) .defaultHeader(“Authorization”, “Basic ” + authToken) .defaultHeader(“Content-Type”, MediaType.APPLICATION_JSON_VALUE) .build(); } private <T> T post(String uri, Object body, Class<T> responseType) { try { return restClient.post().uri(uri).body(body).retrieve().body(responseType); } catch (GridDbException e) { throw e; } catch (Exception e) { throw new GridDbException( “Failed to execute POST request”, HttpStatusCode.valueOf(500), e.getMessage(), e); } } public UserRecord getUserByEmail(String email) { String statement = String.format(“SELECT id, email, name, \”password\” FROM Users where email == ‘%s'”, email); return getOneUser(statement); } private UserRecord getOneUser(String statement) { String type = “sql-select”; GridDbCloudSQLSelectInput input = new GridDbCloudSQLSelectInput(type, statement); var response = post(“/sql”, List.of(input), GridDbCloudSQLOutPut[].class); log.info(“Output: {}”, response[0]); if (response[0].results().size() == 0) { return null; } UserRecord foundUser = null; for (List<String> row : response[0].results()) { if (row.size() < 4) { break; } foundUser = new UserRecord(row.get(0), row.get(1), row.get(2), row.get(3)); } log.info(“Found user: {}”, foundUser); return foundUser; } } record GridDbCloudSQLSelectInput(String type, @JsonProperty(“stmt”) String statement) {} record GridDbCloudSQLOutPut( @JsonProperty(“columns”) List<GDCColumnInfo> columns, @JsonProperty(“results”) List<List<String>> results, @JsonProperty(“responseSizeByte”) long responseSizeByte) {} To communicate with GridDB Cloud via HTTP requests, we create a Spring RestClient instance with HTTP basic authentication. We POST the sql-select query and convert the response into UserRecord Demo For demo, we have added demo users (admin@example.com, user@example.com) on application startup. Full code can be access on Github. Conclusion One-time tokens are a great leap forward in enhancing online security while keeping things user-friendly. Using frameworks like Spring Security can make it easier to implement these advanced security measures. When using one-time tokens in production, keep these key factors in mind: – Token Validity: Decide how long each token should stay active. – Delivery Reliability: Ensure your token delivery method is dependable. – Security: Make sure the token generation process is cryptographically secure. – Storage Safety: Store tokens securely to prevent unauthorized access. By addressing these aspects, you can create a robust and user-friendly security

More
Building a Scheduling Assistant with Spring AI

In this blog post, we’ll walk through how to build a personal AI assistant that simplifies managing your calendar. By the end, you’ll know how to create an assistant capable of handling event submissions and retrieving schedules through simple conversations. We’ll use Spring Boot, Spring AI, and OpenAI to build a system that’s both practical and enjoyable to interact with. Why Build a Personal AI Calendar Assistant? Managing tasks through natural language might seem like something straight out of a sci-fi movie, but it’s more useful than you might expect. This AI assistant can save you time, eliminate the hassle of manual input, and make managing your schedule a breeze. Additionally, building this project is a fantastic way to sharpen your skills as a developer. If you’re a computer science student or an aspiring developer, you’ll gain valuable hands-on experience with AI integration, backend development, and database management, all while creating a tool you can use in your daily life. System Overview Before diving into the details of coding, let’s take a moment to understand how the entire system is structured. Here’s a high-level overview of how everything works: This application features a single page to display the event list and includes a chatbot interface for user interaction. User Interaction via Chat The chatbot interface allows users to interact with the AI assistant using natural language commands. For example: Submit Events: Add events by chatting with the assistant. For example, you could say, “I want to go to the Shopping Center Tomorrow at 2 PM.“ List Events: Check your schedule by asking, “Show my events for tomorrow“ The AI assistant processes these commands by understanding the user’s queries, extracting critical details such as intent, time, and location, and then performing the appropriate action—like saving the event or retrieving a list of upcoming events. Backend System (Spring Boot) The backend serves as the engine of the system, handling several key tasks: API Handling: Receives user input from the chatbot interface. Event Management: Manages the storage and retrieval of events from the database. Spring AI: Manages the AI logic and communicates with the OpenAI API. AI Module (Spring AI + OpenAI API) This module functions as the brain of the assistant. Here’s how it operates: Input Parsing: The AI module processes user queries and leverages the OpenAI API to extract key details such as the event title, time, and location. Intent Recognition: Determines the user’s intention, whether it’s adding an event or listing upcoming events. Response Generation: Produces a user-friendly response based on the action performed. Spring AI acts as a wrapper around the OpenAI API, streamlining the integration process and allowing you to focus on core application logic instead of implementation complexities. Data Storage (Database Layer) The database layer ensures that all events are securely stored and can be retrieved when needed. Here’s what happens at this level: Event Storage: Stores each event submitted through the chatbot. Query: Fetches relevant events from the database when the user requests their schedule. For this project, we’ll use GridDB as our database solution. Now that we’ve covered the system architecture, let’s get started with building the application! 
Step-by-Step Guide to Building the Project The following items should be installed in your system: Java 17 or later: OpenJDK Maven Your preferred IDE: VS Code, Initellij IDEA Docker Compose OpenAI API We need to create an API Key with OpenAI to access ChatGPT models. Create an account and generate the token on the API Keys page. Initialize a Spring Boot Project You can use this pre-initialized project and click Generate to download a Zip file. You can also fork the project from Github and open it in your IDE or other editor. Spring AI Dependency Add Milestone and Snapshot Repositories <repositories> <repository> <id>spring-milestones</id> <name>Spring Milestones</name> <url>https://repo.spring.io/milestone</url> <snapshots> <enabled>false</enabled> </snapshots> </repository> <repository> <id>spring-snapshots</id> <name>Spring Snapshots</name> <url>https://repo.spring.io/snapshot</url> <releases> <enabled>false</enabled> </releases> </repository> </repositories> Add Spring AI Bill of Materials (BOM) <dependencyManagement> <dependencies> <dependency> <groupId>org.springframework.ai</groupId> <artifactId>spring-ai-bom</artifactId> <version>1.0.0-SNAPSHOT</version> <type>pom</type> <scope>import</scope> </dependency> </dependencies> </dependencyManagement> Add SpringAI OpenAI Spring Boot starter <dependency> <groupId>org.springframework.ai</groupId> <artifactId>spring-ai-openai-spring-boot-starter</artifactId> </dependency> Add GridDB dependency <dependency> <groupId>com.github.griddb</groupId> <artifactId>gridstore</artifactId> <version>5.6.0</version> </dependency> Storing and Managing Events In this project we have a simple calendar system with two main entities: User and Event. Each event is associated with a specific user. Based on the schema above, we will create the entity classes as follows: @Data public class User { @RowKey String id; String email; String fullName; } @Data public class Event { @RowKey private String id; private String title; private String location; private Date startTime; private Date endTime; private String userId; } Next, we create the GridDBConfig class as a central configuration for database operation. The class will do the following: Read environment variables for connecting to the GridDB database Create a GridStore class for managing database connection to the GridDB instance Create GridDB Collection’s container (Table) to manage a set of rows. The container is a rough equivalent of the table in a relational database. On creating/updating the Collection we specify the name and object corresponding to the column layout of the collection. Also for each collection, we add an index for a column that is frequently searched and used in the condition of the WHERE section of TQL. 
Make the container available in the Spring container @Configuration public class GridDBConfig { @Value(“${GRIDDB_NOTIFICATION_MEMBER}”) private String notificationMember; @Value(“${GRIDDB_CLUSTER_NAME}”) private String clusterName; @Value(“${GRIDDB_USER}”) private String user; @Value(“${GRIDDB_PASSWORD}”) private String password; @Bean public GridStore gridStore() throws GSException { Properties properties = new Properties(); properties.setProperty(“notificationMember”, notificationMember); properties.setProperty(“clusterName”, clusterName); properties.setProperty(“user”, user); properties.setProperty(“password”, password); GridStore store = GridStoreFactory.getInstance().getGridStore(properties); return store; } @Bean public Collection<String, User> userCollection(GridStore gridStore) throws GSException { Collection<String, User> collection = gridStore.putCollection(AppConstant.USERS_CONTAINER, User.class); collection.createIndex(“email”); return collection; } @Bean public Collection<String, Event> eventCollection(GridStore gridStore) throws GSException { Collection<String, Event> movieCollection = gridStore.putCollection(AppConstant.EVENT_CONTAINER, Event.class); movieCollection.createIndex(“userId”); return movieCollection; } } Business Logic EventService class This service class handles event creation and listing. @Slf4j @Service public class EventService { private final Collection<String, Event> eventCollection; private final Collection<String, User> userCollection; public EventService(Collection<String, Event> eventCollection, Collection<String, User> userCollection) { this.eventCollection = eventCollection; this.userCollection = userCollection; } public List<EventDTO> findAll(String userId) { if (userId != null && !userId.isBlank()) { return fetchAll(userId).stream().map(event -> mapToDTO(event, new EventDTO())).toList(); } final List<Event> events = fetchAll(); return events.stream().map(event -> mapToDTO(event, new EventDTO())).toList(); } public String create(final EventDTO eventDTO, String userId) { final Event event = new Event(); mapToEntity(eventDTO, event); event.setUserId(userId); event.setId(IdGenerator.next(“ev_”)); try { eventCollection.put(event); return event.getId(); } catch (GSException e) { throw new AppErrorException(“Failed to create event”); } } } UserService class This class handles user creation. @Slf4j @Service public class UserService { private final Collection<String, User> userCollection; public UserService(Collection<String, User> userCollection) { this.userCollection = userCollection; } public Optional<User> findByEmail(final String emailString) { try (Query<User> query = userCollection.query(“SELECT * WHERE email='” + emailString + “‘”, User.class)) { RowSet<User> rowSet = query.fetch(); if (rowSet.hasNext()) { User user = rowSet.next(); return Optional.of(user); } else { throw new NotFoundException(“User not found”); } } catch (GSException e) { throw new AppErrorException(“Failed to find user”); } } Connecting OpenAI To connect to OpenAI’s API, we need to configure the API key and specify the name of the OpenAI model for accessing the LLM. This configuration is done in the application.yml file: spring: ai: openai: api-key: ${OPENAI_API_KEY} chat: options: model: gpt-4o-mini Here, ${OPENAI_API_KEY} retrieves the API key from an environment variable. For this project, we are using the gpt-4o-mini model. 
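One piece the excerpt leaves out is the EventDTO that EventService returns and that the function-calling layer (shown later) maps into its view records. A minimal sketch consistent with how it is used elsewhere in the project (Lombok @Data and the nested user projection are our assumptions, not the original code) might be:

package mycode.dto;

import java.time.LocalDateTime;
import lombok.Data;

// Hypothetical sketch of the DTO exchanged between EventService and the AI tooling layer.
@Data
public class EventDTO {
    private String id;
    private String title;
    private String location;
    private LocalDateTime startTime;
    private LocalDateTime endTime;
    private UserInfo user; // minimal projection of the owning user

    public record UserInfo(String name) {}
}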
Initialize the Spring AI ChatClient Below is the implementation of the PersonalAssistant class, which initializes the ChatClient, processes user queries, and sends them to the OpenAI API. @Service public class PersonalAssistant { private final ChatClient chatClient; public PersonalAssistant(ChatClient.Builder modelBuilder, ChatMemory chatMemory) { // @formatter:off this.chatClient = modelBuilder.defaultSystem(“”” You are a personal assistant and travel planner. Your job is to answer questions about and to perform actions on the user’s behalf, mainly around calendar events, and time-management. You are required to answer an a professional manner. If you don’t know the answer, politely tell the user you don’t know the answer, then ask the user a followup question to try and clarify the question they are asking. If you do know the answer, provide the answer but do not provide any additional followup questions. Use the provided functions to fetch user’s events by email, and create new event. Before creating new event, you MUST always get the following information from the user: 1. Email 2. Location 3. Start time 4. End time: If not provided, assume it ended in one hour. 5. Title: Get title from user’s intent and interest. Today is {current_date}. “””) .defaultAdvisors( new MessageChatMemoryAdvisor(chatMemory, DEFAULT_CHAT_MEMORY_CONVERSATION_ID, 10), new SimpleLoggerAdvisor() ) .defaultFunctions(“getUserEvents”, “createEvent”) .build(); // @formatter:on } public String chat(String chatId, String userMessageContent) { return this.chatClient.prompt() .system(s -> s.param(“current_date”, LocalDate.now().toString())) .user(userMessageContent) .call().content(); } } We obtain an auto-configured ChatClient.Builder and use it to create the ChatClient. The ChatClient is a Spring Bean provided by Spring AI that manages sending user input to the LLM. To make our chatbot focus on functioning as a personal assistant and avoid providing irrelevant information, we utilize a system message to guide the model’s behavior and specify the desired output. This system message is defined within the defaultSystem() method. We add chat memory to maintain context for up to 10 previous messages when using the chatbot, ensuring more cohesive interactions. We include a SimpleLoggerAdvisor to log request and response data from the ChatClient, which is helpful for debugging and monitoring AI interactions. We register the getUserEvents() and createEvent() functions to enable the LLM to interact with existing business logic. The chat() method accepts a user message, passes it to the Spring AI ChatClient bean as input, and returns the result from the content(). Function Calling Here’s how function calling works in this project: The user types something like, Give me my schedule for tomorrow. Spring AI connects to the OpenAI API, processes the text, and extracts the required information. Using function calling, the AI model dynamically determines which function to trigger. Spring AI executes the relevant function with the extracted parameters (e.g., getUserEvents()). Spring AI calls the OpenAI API again, including the function’s response, to generate the final reply. Now, let’s map our functions so we can use them with Spring AI. 
@Configuration public class EventTools { private static final Logger logger = LoggerFactory.getLogger(EventTools.class); @Autowired private EventService eventService; @Autowired private UserService userService; public record EventListRequest(String email) {} public record EventViewDTO(String id, String title, String location, LocalDateTime startTime, LocalDateTime endTime, UserViewDTO user) {} public record UserViewDTO(String name) {} @Bean @Description(“Get event list for given users email”) public Function<EventListRequest, List<EventViewDTO>> getUserEvents() { return request -> { Optional<User> user = userService.findByEmail(request.email()); return eventService.findAll(user.get().getEmail()).stream().map(this::mapToViewDTO).toList(); }; } private EventViewDTO mapToViewDTO(EventDTO eventDTO) { return new EventViewDTO(eventDTO.getId(), eventDTO.getTitle(), eventDTO.getLocation(), eventDTO.getStartTime(), eventDTO.getEndTime(), new UserViewDTO(eventDTO.getUser().name())); } public record CreateEventRequest(String email, String title, String location, LocalDateTime startTime, LocalDateTime endTime) { } @Bean @Description(“Create new event with specified email, title, location, start-time, and end-time.”) public Function<CreateEventRequest, String> createEvent() { return request -> { logger.debug(“call function create event {}”, request); Optional<User> user = userService.findByEmail(request.email()); EventDTO eventDTO = new EventDTO(); eventDTO.setTitle(request.title()); eventDTO.setLocation(request.location()); eventDTO.setStartTime(request.startTime()); eventDTO.setEndTime(request.endTime()); return eventService.create(eventDTO, user.get().getId()); }; } } Define a @Bean method that returns a java.util.function.Function. Add the @Description annotation to provide a clear explanation of what this function does. Spring AI can leverage the service classes we’ve already developed without requiring a complete rewrite. Chat Interface The chatbox UI is developed using Thymeleaf, Javascript, and CSS. The chatbox is designed to resemble message bubbles, similar to iMessage, and supports using the Enter key to send messages. We use AJAX to handle HTTP requests and responses seamlessly. Running the Project with Docker Compose To spin up the project we will utilize Docker Compose. The entire code for the web application is available on Github. Before starting the application, make sure you have the API Key from OpenAI. Create .env file with the following content: OPENAI_API_KEY=’YOUR_OPENAI_API_KEY’ Build the services: docker compose build Start the services: docker compose up After starting the application it is accessible under localhost:8080. Conclusion Spring AI makes it easier to add AI features to Spring-based applications. It allows AI code to work alongside existing business logic in the same codebase. What can be improved? Add logs for chatbox messages (input and output). Make it easy for users to give feedback on chatbox responses. Implement safety measures like

More
Pushing Time Series Data to GridDB Cloud’s Time Series Containers with Kafka HTTP Sink

In a previous article, we showcased how one could pair GridDB Cloud's free infrastructure with Kafka using a custom Single Message Transform and some SSL certs/rules; you can read that article here: Pushing Data to GridDB Cloud with Kafka HTTP Sink Connector. In this article, we will expand on those efforts and add timestamp data types into the mix. By the time you finish this article, you should understand how you can stream data from some source over to GridDB Cloud, with the added benefit of being able to push to time series containers, which take timestamps as their rowkey (a must!). As stated above, the big addition for this article is the handling of time series data and pushing it out into the GridDB Cloud. There were two things that had to be learned in order to get this project to work: chaining together Single Message Transforms, and learning the exact time format the GridDB WebAPI accepts for time series data; there was also a minuscule change made to the SMT we used in the previous article. Prereqs This article is part II, and therefore a continuation of a previous effort; in part I, we go over the fundamentals of what this project is and how it works. This means that understanding part I of this series is a pseudo-prerequisite for this article, but it is not strictly required. In any case, the prereqs for both of these articles are the same: A Kafka system/cluster running (Docker is the easiest way) The source code for the custom Single Message Transform (or the pre-compiled .jar) The GridDB Web API Sink connector (just connection details to make it yourself) The source code (and all of the required configs/yamls) can be found on the GridDB.net GitHub page: $ git clone https://github.com/griddbnet/Blogs.git --branch kafka_http_timeseries Implementation Most of the code implementation for this project was done in the previous effort, but there are still some changes we need to make to the existing code base. Mostly though, we will be using an existing Single Message Transform to be able to send time series data to GridDB Cloud. The way it works is this: an SMT lets you transform Kafka records before they get sent over to your Kafka sink. You can also chain multiple SMTs (executed in order) before the data gets sent out. For our purposes, we are just using the right side of the diagram. The topic flows through to the sink, gets transformed (twice in this case!), and then goes out to our GridDB Cloud installation. The photo is credited to Confluent. Chaining Single Message Transforms In part I of this series, we used our custom SMT to decouple the values from the field names in our Kafka record and form them into a nested array, which is the only data structure that a PUT to GridDB Cloud accepts. Using just this alone, we were able to successfully push data to a GridDB Collection container. However, when dealing with time series containers, an issue arises because the WebAPI expects a very specific data format for the time series data column. If your data is in milliseconds since epoch, for example, the GridDB WebAPI will not accept that as a valid time column value and will reject the HTTP request. According to the docs, the format expected by GridDB WebAPI is this: YYYY-MM-DDThh:mm:ss.SSSZ (i.e. "2016-01-16T10:25:00.253Z").
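To make that target format concrete, here is a small standalone Java sketch (not part of the connector code) showing how an epoch-millisecond value maps onto the pattern the Web API expects; the connector itself achieves the same conversion declaratively with Kafka's TimestampConverter SMT, configured in the next section.

import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class GridDBTimestampFormatExample {
    public static void main(String[] args) {
        // Same pattern the TimestampConverter SMT is configured with below.
        DateTimeFormatter griddbFormat =
                DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'").withZone(ZoneOffset.UTC);

        long epochMillis = 1452939900253L; // milliseconds since epoch
        String formatted = griddbFormat.format(Instant.ofEpochMilli(epochMillis));
        System.out.println(formatted); // prints 2016-01-16T10:25:00.253Z
    }
}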
So, before we transform our data to extract the values and create our nested array, we can run a Single Message Transform on just the ts column, transform whatever the value is into the format it likes, and then run the process of building our nested array. Using this flow allows for us to push data successfully but to also transform the timestamp column into the exact format expected. And please remember, the order of your transforms matter! “transforms.timestamp.type”: “org.apache.kafka.connect.transforms.TimestampConverter$Value”, “transforms.timestamp.target.type”: “string”, “transforms.timestamp.field”: “ts”, “transforms.timestamp.format”: “yyyy-MM-dd’\”T’\”HH:mm:ss.SSS’\”Z’\” “, “transforms.nestedList.type”: “net.griddb.GridDBWebAPITransform$Value”, “transforms.nestedList.fields”: “ts”, Here you see we target the ts column and we explicitly state the format we expect. One small gotcha is that you must wrap the T and Z characters in single quotes otherwise Kafka will reject the format as illegal. And of course, if you deviate from this format at all, you will be rejected by the GridDB Cloud — ouch! Handling Strings Sent to GridDB WebAPI Now that we’ve got our SMTs in place, there’s one more ‘gotcha’ to investigate. The GridDB Web API expects the timestamp to be wrapped in double quotes, and so we need to make a small change to our SMT from part I of this article: Object val = fPath.valueFrom(value); String valType = val.getClass().getName(); if (valType.contains(“String”)) { val = “\”” + val + “\””; row.add(val); } else { row.add(val); } Luckily for us, the WebAPI expects all strings to be wrapped in double quotes, so we don’t need to do any explicit checking if the value is a timestamp or not, we just need to check if the value is a string. Once we have this settled, fire up the connector (you can run the script inside of the scripts/ dir just please make the necessary changes before you do so) and then create some topics. Creating Topics with Schemas Using the Control Center Now that we’ve got our infrastructure in place, let’s run it! First, please make sure the GridDB Cloud URL you’re using points to a real container already in place in your db. In my case, I made a time series container called kafka_ts and gave it a schema of: ts (timestamp), data (float), temp (float). This container is already being pointed to in the URL of my sink connector. With that out of the way, let’s make our topic and schema. If you used the script to create the connector, your topic may be named topic_griddb_cloud, so head into your Kafka control-center (located in http://localhost:9021) and create a new topic. 
From the Schema tab, you can copy and paste the following schema: { “connect.name”: “net.griddb.webapi.griddb”, “connect.parameters”: { “io.confluent.connect.avro.field.doc.data”: “The string is a unicode character sequence.”, “io.confluent.connect.avro.field.doc.temp”: “The double type is a double precision (64-bit) IEEE 754 floating-point number.”, “io.confluent.connect.avro.field.doc.ts”: “The int type is a 32-bit signed integer.”, “io.confluent.connect.avro.record.doc”: “Sample schema to help you get started.” }, “doc”: “Sample schema to help you get started.”, “fields”: [ { “doc”: “The int type is a 32-bit signed integer.”, “name”: “ts”, “type”: “string” }, { “doc”: “The double type is a double precision (64-bit) IEEE 754 floating-point number.”, “name”: “temp”, “type”: “double” }, { “doc”: “The string is a unicode character sequence.”, “name”: “data”, “type”: “double” } ], “name”: “griddb”, “namespace”: “net.griddb.webapi”, “type”: “record” } Once created, from the messages tab, produce a new message like so: { “ts”: “2025-03-13T18:00:00.032Z”, “data”: 23.2, “temp”: 43.23 } If all goes well, your sink should still be running and you should have a new row of data inside of your container — cool!! Troubleshooting While preparing for this article, I had lots of issues getting everything to run properly, despite the results showing how relatively simple it is. There are two reasons for that: 1, debugging was pretty obtuse as the logs are extremely difficult to follow, and 2, the schema rules are extremely finicky and must be precise for Kafka to follow through on streaming data (which is a good thing!). So, if you are encountering issues, I recommend first changing the log level of your Kafka cluster’s connect container from just “WARN” (default) to either DEBUG or TRACE. I’d start with DEBUG and move up if necessary as the TRACE logs move extremely quickly and are difficult to read. You can change the log level with Docker by adding some environment variables and doing a hard reset of your containers. Add this to the bottom of your Kafka connect docker-compose section #docker-compose.yml connect: image: cnfldemos/cp-server-connect-datagen:0.6.4-7.6.0 hostname: connect container_name: connect depends_on: – broker – schema-registry ports: – “8083:8083” – “2929:2929” – “443:443” environment: CONNECT_LOG4J_ROOT_LOGLEVEL: DEBUG # Or TRACE for even more detail CONNECT_LOG4J_LOGGERS: org.apache.kafka.connect=DEBUG,io.confluent.connect=DEBUG $ docker compose up -d –force-recreate And once it’s printing out the enormous amounts of logs, you can narrow down what you’re searching for using grep $ dockers logs -f connect | grep Caused Caused by: org.apache.kafka.connect.errors.DataException: Failed to deserialize data for topic topic_griddb_cloud to Avro: Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte! I’ve found Caused to be the best way to debug the issues with the connectors, but you can try searching for the topic name, the connector name, or maybe your URL endpoint. Another thing you can do is to modify the SMT code and print messages from there to observe how the SMT is handling your records. Conclusion And now we can successfully push our kafka data directly into GridDB Time Series Containers on the

More