{"id":46753,"date":"2023-03-15T00:00:00","date_gmt":"2023-03-15T07:00:00","guid":{"rendered":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/"},"modified":"2025-11-13T12:56:30","modified_gmt":"2025-11-13T20:56:30","slug":"ingest-and-query-a-gene-expression-dataset-in-r-with-griddb","status":"publish","type":"post","link":"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/","title":{"rendered":"Ingest and Query a Gene Expression Dataset in R with GridDB"},"content":{"rendered":"<h2>Introduction<\/h2>\n<p>In this document, we&#8217;ll first go over some notes about setting up GridDB and connecting to it from R. Then, we&#8217;ll ingest some gene expression data from The Cancer Genome Atlas (TCGA), and query the GridDB backend using <code>dplyr<\/code> to generate simple summary statistics.<\/p>\n<h2>Prerequisites<\/h2>\n<p>You can follow along by cloning the source code from here: https:\/\/github.com\/griddbnet\/Blogs\/tree\/gene_analysis<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-sh\">git clone https:\/\/github.com\/griddbnet\/Blogs.git --branch gene_analysis<\/code><\/pre>\n<\/div>\n<h3>Installing and starting GridDB<\/h3>\n<p>To install GridDB, we simply follow the instructions provided in the <a href=\"https:\/\/docs.griddb.net\/latest\/gettingstarted\/using-apt\/\">documentation<\/a>. On an Ubuntu machine, the easiest way is to download the <code>deb<\/code> package and install it with <code>dpkg -i<\/code>. Thereafter, we need to start the GridDB service with <code>sudo systemctl start gridstore<\/code>. After this, we are ready to connect to the service from R in order to load and query data.<\/p>\n<h3>Installing R dependencies<\/h3>\n<p>Running the code below requires several R packages, which must be installed if they are not already available. 
Below, I am using the <code>attachment<\/code> package to detect the dependencies used by this Quarto markdown document.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-r\">attachment::att_from_rmd(\"01.Rmd\")<\/code><\/pre>\n<\/div>\n<p>One can install these dependencies via the usual <code>install.packages<\/code> method, except for <code>Bioconductor<\/code> packages that would require <code>BiocManager::install<\/code>:<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-r\">deps <- attachment::att_from_rmd(\"01.Rmd\")\ninstall.packages(deps)<\/code><\/pre>\n<\/div>\n<p>or, in an <code>Ubuntu<\/code> environment (as might be the case in a cloud setup), a faster approach might be to get pre-compiled versions using <code>Ubuntu<\/code>'s package manager:<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-sh\"># replace `PKG1\/2\/3` with the required package names\nsudo apt install r-cran-PKG1 r-cran-PKG2 r-cran-PKG3 ...<\/code><\/pre>\n<\/div>\n<h2>Connecting R to a local GridDB cluster<\/h2>\n<p>As documented <a href=\"https:\/\/griddb.net\/en\/blog\/analysis-of-the-swisslos-lottery-using-r-and-griddb\/\">previously<\/a>, we'll use the <code>RJDBC<\/code> package (Java database connectivity) to connect to the GridDB cluster. As the cluster is running locally (as would be the case if you followed the installation and getting started <a href=\"https:\/\/docs.griddb.net\/latest\/gettingstarted\/using-apt\/#install-with-deb\">guide<\/a>), be sure to use the localhost IP address (<code>127.0.0.1<\/code>) and the port <code>20001<\/code>.<\/p>\n<p>For ease of use down the line, we can wrap these two steps, creating a <code>jdbc<\/code> driver and a GridDB connection, in a function called <code>connect_to_griddb<\/code>. 
Note that we are loading the entire namespaces of <code>RJDBC<\/code> and <code>DBI<\/code>, which might be OK for an exploratory interactive session, but for more serious projects, we'd of course create an R package and import only the needed functions explicitly with <code>importFrom<\/code> or use an approach such as <a href=\"https:\/\/github.com\/klmr\/box\"><code>box<\/code><\/a> to create modules with specific imports.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-r\">suppressPackageStartupMessages({\n  library(RJDBC)\n  library(DBI)\n  library(dplyr)\n  library(tidyr)\n  library(purrr)\n  library(qs)\n})\nconnect_to_griddb <- function() {\n  drv <- JDBC(driverClass = \"com.toshiba.mwcloud.gs.sql.Driver\",\n              # Point this to your gridstore jar\n              classPath = \"~\/src\/jdbc\/bin\/gridstore-jdbc.jar\")\n  \n  conn <-\n    dbConnect(drv,\n              \"jdbc:gs:\/\/127.0.0.1:20001\/myCluster\/public\",\n              \"admin\",\n              \"admin\")\n  \n  return(conn)\n}<\/code><\/pre>\n<\/div>\n<h2>Data<\/h2>\n<p>We'll work with some gene expression data from human cancer studies as published by The Cancer Genome Atlas (TCGA) <a href=\"https:\/\/www.cancer.gov\/about-nci\/organization\/ccg\/research\/structural-genomics\/tcga\">program<\/a>. In particular, we'll focus on data packaged by the Bioconductor package <a href=\"https:\/\/bioconductor.org\/packages\/devel\/bioc\/vignettes\/GSEABenchmarkeR\/inst\/doc\/GSEABenchmarkeR.html#setup\"><code>GSEABenchmarkeR<\/code><\/a> and intended for use in benchmarking gene-expression studies. 
The data were downloaded, pre-formatted into rectangular sets and saved as a <a href=\"https:\/\/github.com\/traversc\/qs\"><code>qs<\/code><\/a> archive for faster saving and loading.<\/p>\n<p>A quick peek at the data sets:<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-r\">TCGA <- qs::qread(\"data\/expressions.qs\")\nsapply(TCGA, nrow) |> summary()<\/code><\/pre>\n<\/div>\n<p>We have expression data for 10 cancers, with a mean of 1.2M observations (rows). Each data set has 5 columns,<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-r\">TCGA[[1]] |> head() |> knitr::kable()<\/code><\/pre>\n<\/div>\n<p>and the represented cancers are<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-r\">sapply(TCGA, \"[[\", 1, 5) |> unlist() |> knitr::kable()<\/code><\/pre>\n<\/div>\n<h2>Loading our data into GridDB<\/h2>\n<p>To ingest the data, we first create a connection to the GridDB cluster using our function defined above. Then, we define two functions, one to create a table and another to insert data into the table. These functions will allow us to cycle over a list of data frames and load each data frame into the database seamlessly with base R's <code>*apply<\/code> family of functions or <code>purrr<\/code>'s <code>map_*<\/code> family of functions.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-r\">griddb <- connect_to_griddb()\ncreate_table <- function(table_name) {\n  dbSendUpdate(\n    griddb,\n    sprintf(\n      \"CREATE TABLE IF NOT EXISTS %s (Gene STRING, Sample STRING, Expression INTEGER, Code STRING, Name STRING);\",\n      table_name\n    )\n  )\n}\ninsert_table <- function(conn, name, df, append = TRUE) {\n  for (i in seq_len(nrow(df))) {\n    dbWriteTable(conn, name, df[i, ], append = append)\n  }\n}\n# using `invisible` here to hide non-informative output\ninvisible(lapply(names(TCGA), create_table))\nDBI::dbListTables(griddb)\n# 6 minutes for 10K rows per table. 
So 6 min for 100K rows\n# 1h for 100K rows per table. So 1h for 1M rows.\ninvisible(pbapply::pblapply(seq_along(TCGA), function(i)\n  insert_table(\n    conn = griddb,\n    name = names(TCGA)[[i]],\n    df = TCGA[[i]][1:100000, ],\n    append = TRUE\n  ))\n)\nDBI::dbDisconnect(griddb)<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-r\">griddb <- connect_to_griddb()\niter <- setNames(names(TCGA), names(TCGA))\nmap_dfr(iter, function(x) {\n  griddb |> tbl(x) |> collect() |> nrow()\n}) |> knitr::kable()<\/code><\/pre>\n<\/div>\n<h2>Data analysis<\/h2>\n<p>Here are some simple summary statistics; nothing biologically meaningful at this point. The main idea is to use familiar R packages like <code>dplyr<\/code>, <code>tidyr<\/code> and <code>purrr<\/code> to query the fast GridDB backend database. The R code below could well have been developed for a <code>MariaDB<\/code> or <code>PostgreSQL<\/code> database, but it works equally well with a GridDB setup.<\/p>\n<h3>Genes per cancer dataset that have average expression &gt;= 2000<\/h3>\n<div class=\"clipboard\">\n<pre><code class=\"language-r\">high_expression_genes <- purrr::map_dfr(iter, .id = \"Cancer\", \n  function(x) {\n    griddb |>\n      tbl(x) |>\n      group_by(Gene) |>\n      summarise(Mean_expression = mean(Expression, na.rm = TRUE)) |>\n      filter(Mean_expression >= 2000) |> \n      collect()\n  }\n)\ndim(high_expression_genes)\nhead(high_expression_genes) |> knitr::kable()<\/code><\/pre>\n<\/div>\n<h3>Genes per cancer dataset that have average expression &lt;= 300<\/h3>\n<div class=\"clipboard\">\n<pre><code class=\"language-r\">low_expression_genes <- purrr::map_dfr(iter, .id = \"Cancer\",\n  function(x) {\n    griddb |>\n      tbl(x) |>\n      group_by(Gene) |>\n      summarise(Mean_expression = mean(Expression, na.rm = TRUE)) |>\n      filter(Mean_expression <= 300) |> \n      collect() \n  }\n)\ndim(low_expression_genes)\ntail(low_expression_genes) |> 
knitr::kable()<\/code><\/pre>\n<\/div>\n<h3>Genes shared among all cancer data sets<\/h3>\n<div class=\"clipboard\">\n<pre><code class=\"language-r\">unique_genes_per_dataset <- purrr::map(iter, \n  function(x) {\n    griddb |>\n      tbl(x) |>\n      collect() |>\n      pull(Gene) |> \n      unique()\n  }\n)\nshared_genes <- Reduce(intersect, unique_genes_per_dataset)\nshared_genes<\/code><\/pre>\n<\/div>\n<h3>Mean expression per cancer data set of the first 10 genes shared by all data sets<\/h3>\n<div class=\"clipboard\">\n<pre><code class=\"language-r\">first10 <- shared_genes[1:10]\niter2 <- expand.grid(iter, first10, stringsAsFactors = FALSE)\nfirst10_expression <-\n  purrr::map2_dfr(iter2$Var1, iter2$Var2, function(.x, .y) {\n    griddb |>\n      tbl(.x) |>\n      filter(Gene == .y) |> \n      collect() \n  })\nfirst10_expression |>\n  group_by(Code, Name, Gene) |>\n  summarise(Mean_expression = mean(Expression, na.rm = TRUE)) |>\n  ungroup() |> \n  select(-Name) |>\n  pivot_wider(names_from = Code, values_from = Mean_expression) |> \n  knitr::kable()<\/code><\/pre>\n<\/div>\n<h2>Clean up<\/h2>\n<div class=\"clipboard\">\n<pre><code class=\"language-r\">DBI::dbDisconnect(griddb)<\/code><\/pre>\n<\/div>\n<h2>Conclusion<\/h2>\n<p>In this vignette, we saw how to use a GridDB database from R. 
We ingested one million rows of gene expression data in ten tables in about 1 hour and then queried the data seamlessly using familiar tools like <code>DBI<\/code> and <code>dplyr<\/code>.<\/p>\n<h2>Appendix<\/h2>\n<p>R \/ <code>Bioconductor<\/code> code used to download and preprocess the gene expression data.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-r\"># This has a lot of dependencies and can take a few minutes to install\n# BiocManager::install(\"GSEABenchmarkeR\")\nlibrary(GSEABenchmarkeR)\ntcga <- loadEData(\"tcga\", nr.datasets = 10)\ncancer_abbreviations <- names(tcga)\ncancer_names <- c(\"BLCA\" = \"Bladder Urothelial Carcinoma\",\n                  \"BRCA\" = \"Breast Invasive Carcinoma\",\n                  \"COAD\" = \"Colon Adenocarcinoma\",\n                  \"HNSC\" = \"Head and Neck Squamous Cell Carcinoma\",\n                  \"KICH\" = \"Kidney Chromophobe\",\n                  \"KIRC\" = \"Kidney Renal Clear Cell Carcinoma\",\n                  \"KIRP\" = \"Kidney Renal Papillary Cell Carcinoma\",\n                  \"LIHC\" = \"Liver Hepatocellular Carcinoma\",\n                  \"LUAD\" = \"Lung Adenocarcinoma\",\n                  \"LUSC\" = \"Lung Squamous Cell Carcinoma\")\n# Reshape each data set into long format: one row per gene and sample\nexpressions <- lapply(seq_along(tcga), function(i) {\n  tcga[[i]]@assays@data@listData |> \n    as.data.frame() |> \n    tibble::rownames_to_column(var = \"Gene\") |> \n    tidyr::pivot_longer(cols = -Gene, names_to = \"Sample\", values_to = \"Expression\") |> \n    dplyr::mutate(cancer = cancer_abbreviations[[i]], cancer_name = cancer_names[[i]])\n})\n# Name each data set by its cancer abbreviation; the main text uses names(TCGA) as table names\nnames(expressions) <- cancer_abbreviations\nqs::qsave(expressions, \"expressions.qs\")<\/code><\/pre>\n<\/div>\n<h2>Session info<\/h2>\n<div class=\"clipboard\">\n<pre><code class=\"language-r\">sessionInfo()<\/code><\/pre>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Introduction In this document, we&#8217;ll first go over some notes about setting up GridDB and connecting to it from R. 
Then, we&#8217;ll ingest some gene expression data from The Cancer Genome Atlas (TCGA), and query the GridDB backend using dplyr to generate simple summary statistics. Prerequisites You can follow along by cloning the source code [&hellip;]<\/p>\n","protected":false},"author":41,"featured_media":29458,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[121],"tags":[],"class_list":["post-46753","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.1.1 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Ingest and Query a Gene Expression Dataset in R with GridDB | GridDB: Open Source Time Series Database for IoT<\/title>\n<meta name=\"description\" content=\"Introduction In this document, we&#039;ll first go over some notes about setting up GridDB and connecting to it from R. Then, we&#039;ll ingest some gene expression\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Ingest and Query a Gene Expression Dataset in R with GridDB | GridDB: Open Source Time Series Database for IoT\" \/>\n<meta property=\"og:description\" content=\"Introduction In this document, we&#039;ll first go over some notes about setting up GridDB and connecting to it from R. 
Then, we&#039;ll ingest some gene expression\" \/>\n<meta property=\"og:url\" content=\"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/\" \/>\n<meta property=\"og:site_name\" content=\"GridDB: Open Source Time Series Database for IoT\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/griddbcommunity\/\" \/>\n<meta property=\"article:published_time\" content=\"2023-03-15T07:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-11-13T20:56:30+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/griddb.net\/wp-content\/uploads\/2023\/03\/Gene_Expression.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1160\" \/>\n\t<meta property=\"og:image:height\" content=\"653\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"griddb-admin\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@GridDBCommunity\" \/>\n<meta name=\"twitter:site\" content=\"@GridDBCommunity\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"griddb-admin\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/\"},\"author\":{\"name\":\"griddb-admin\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233\"},\"headline\":\"Ingest and Query a Gene Expression Dataset in R with GridDB\",\"datePublished\":\"2023-03-15T07:00:00+00:00\",\"dateModified\":\"2025-11-13T20:56:30+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/\"},\"wordCount\":653,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/griddb.net\/en\/#organization\"},\"image\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/#primaryimage\"},\"thumbnailUrl\":\"\/wp-content\/uploads\/2023\/03\/Gene_Expression.png\",\"articleSection\":[\"Blog\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/\",\"url\":\"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/\",\"name\":\"Ingest and Query a Gene Expression Dataset in R with GridDB | GridDB: Open Source Time Series Database for 
IoT\",\"isPartOf\":{\"@id\":\"https:\/\/griddb.net\/en\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/#primaryimage\"},\"thumbnailUrl\":\"\/wp-content\/uploads\/2023\/03\/Gene_Expression.png\",\"datePublished\":\"2023-03-15T07:00:00+00:00\",\"dateModified\":\"2025-11-13T20:56:30+00:00\",\"description\":\"Introduction In this document, we'll first go over some notes about setting up GridDB and connecting to it from R. Then, we'll ingest some gene expression\",\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/#primaryimage\",\"url\":\"\/wp-content\/uploads\/2023\/03\/Gene_Expression.png\",\"contentUrl\":\"\/wp-content\/uploads\/2023\/03\/Gene_Expression.png\",\"width\":1160,\"height\":653},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/griddb.net\/en\/#website\",\"url\":\"https:\/\/griddb.net\/en\/\",\"name\":\"GridDB: Open Source Time Series Database for IoT\",\"description\":\"GridDB is an open source time-series database with the performance of NoSQL and convenience of 
SQL\",\"publisher\":{\"@id\":\"https:\/\/griddb.net\/en\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/griddb.net\/en\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/griddb.net\/en\/#organization\",\"name\":\"Fixstars\",\"url\":\"https:\/\/griddb.net\/en\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png\",\"contentUrl\":\"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png\",\"width\":200,\"height\":83,\"caption\":\"Fixstars\"},\"image\":{\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/griddbcommunity\/\",\"https:\/\/x.com\/GridDBCommunity\",\"https:\/\/www.linkedin.com\/company\/griddb-by-toshiba\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233\",\"name\":\"griddb-admin\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g\",\"caption\":\"griddb-admin\"},\"url\":\"https:\/\/griddb.net\/en\/author\/griddb-admin\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Ingest and Query a Gene Expression Dataset in R with GridDB | GridDB: Open Source Time Series Database for IoT","description":"Introduction In this document, we'll first go over some notes about setting up GridDB and connecting to it from R. Then, we'll ingest some gene expression","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/","og_locale":"en_US","og_type":"article","og_title":"Ingest and Query a Gene Expression Dataset in R with GridDB | GridDB: Open Source Time Series Database for IoT","og_description":"Introduction In this document, we'll first go over some notes about setting up GridDB and connecting to it from R. Then, we'll ingest some gene expression","og_url":"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/","og_site_name":"GridDB: Open Source Time Series Database for IoT","article_publisher":"https:\/\/www.facebook.com\/griddbcommunity\/","article_published_time":"2023-03-15T07:00:00+00:00","article_modified_time":"2025-11-13T20:56:30+00:00","og_image":[{"width":1160,"height":653,"url":"https:\/\/griddb.net\/wp-content\/uploads\/2023\/03\/Gene_Expression.png","type":"image\/png"}],"author":"griddb-admin","twitter_card":"summary_large_image","twitter_creator":"@GridDBCommunity","twitter_site":"@GridDBCommunity","twitter_misc":{"Written by":"griddb-admin","Est. 
reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/#article","isPartOf":{"@id":"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/"},"author":{"name":"griddb-admin","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233"},"headline":"Ingest and Query a Gene Expression Dataset in R with GridDB","datePublished":"2023-03-15T07:00:00+00:00","dateModified":"2025-11-13T20:56:30+00:00","mainEntityOfPage":{"@id":"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/"},"wordCount":653,"commentCount":0,"publisher":{"@id":"https:\/\/griddb.net\/en\/#organization"},"image":{"@id":"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2023\/03\/Gene_Expression.png","articleSection":["Blog"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/","url":"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/","name":"Ingest and Query a Gene Expression Dataset in R with GridDB | GridDB: Open Source Time Series Database for 
IoT","isPartOf":{"@id":"https:\/\/griddb.net\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/#primaryimage"},"image":{"@id":"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2023\/03\/Gene_Expression.png","datePublished":"2023-03-15T07:00:00+00:00","dateModified":"2025-11-13T20:56:30+00:00","description":"Introduction In this document, we'll first go over some notes about setting up GridDB and connecting to it from R. Then, we'll ingest some gene expression","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/blog\/ingest-and-query-a-gene-expression-dataset-in-r-with-griddb\/#primaryimage","url":"\/wp-content\/uploads\/2023\/03\/Gene_Expression.png","contentUrl":"\/wp-content\/uploads\/2023\/03\/Gene_Expression.png","width":1160,"height":653},{"@type":"WebSite","@id":"https:\/\/griddb.net\/en\/#website","url":"https:\/\/griddb.net\/en\/","name":"GridDB: Open Source Time Series Database for IoT","description":"GridDB is an open source time-series database with the performance of NoSQL and convenience of 
SQL","publisher":{"@id":"https:\/\/griddb.net\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/griddb.net\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/griddb.net\/en\/#organization","name":"Fixstars","url":"https:\/\/griddb.net\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/","url":"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png","contentUrl":"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png","width":200,"height":83,"caption":"Fixstars"},"image":{"@id":"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/griddbcommunity\/","https:\/\/x.com\/GridDBCommunity","https:\/\/www.linkedin.com\/company\/griddb-by-toshiba"]},{"@type":"Person","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233","name":"griddb-admin","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g","caption":"griddb-admin"},"url":"https:\/\/griddb.net\/en\/author\/griddb-admin\/"}]}},"_links":{"self":[{"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/posts\/46753","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/users\/41"}],"replies":[{"embeddable
":true,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/comments?post=46753"}],"version-history":[{"count":1,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/posts\/46753\/revisions"}],"predecessor-version":[{"id":51420,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/posts\/46753\/revisions\/51420"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/media\/29458"}],"wp:attachment":[{"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/media?parent=46753"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/categories?post=46753"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/griddb.net\/en\/wp-json\/wp\/v2\/tags?post=46753"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}