Compile gene expression data and metadata — get_dataset_object

Return an annotated Bioconductor-compatible data structure or a long form tibble of the queried dataset, including expression data and the experimental design.

get_dataset_object(
  datasets,
  genes = NULL,
  keepNonSpecific = FALSE,
  consolidate = NA_character_,
  resultSets = NULL,
  contrasts = NULL,
  metaType = "text",
  type = "se",
  memoised = getOption("gemma.memoised", FALSE)
)

Arguments

datasets: A vector of dataset IDs or short names
genes: A vector of NCBI IDs, Ensembl IDs or gene symbols.
keepNonSpecific: logical. FALSE by default. If TRUE, results from probesets that are not specific to the gene will also be returned.
consolidate: An option for gene expression level consolidation. If left as NA (the default), every probe for the genes is returned. "pickmax" picks the probe with the highest expression, "pickvar" picks the probe with the highest variance and "average" returns the average expression.
resultSets: Result set IDs of a differential expression analysis. Optional. If provided, the output will only include the samples from the subset used in the result set. Must be the same length as datasets.
contrasts: Contrast IDs of a differential expression contrast. Optional. Requires resultSets to be defined. If provided, the output will only include samples relevant to the specified contrasts.
metaType: How the metadata information should be included. Can be "text", "uri" or "both". "text" and "uri" options
type: "se" for a SummarizedExperiment or "eset" for an ExpressionSet. We recommend SummarizedExperiment, which is the more recent of the two. See the SummarizedExperiment vignette or the ExpressionSet vignette for more details. "tidy" for a long-form data frame compatible with tidyverse functions. "list" for a list of individual data frames containing the expression values, the design and the experiment.
memoised: Whether to cache the result for future calls with the same inputs, and to return the cached result when one already exists. Setting options(gemma.memoised = TRUE) ensures that the cache is always used. Use forget_gemma_memoised to clear the cache.
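
To make the three consolidate options concrete, here is a toy sketch in base R on a made-up probe matrix. It is not the package's implementation: the matrix, the probe names and the reading of "highest expression" as the highest mean across samples are all assumptions made for illustration.

```r
# Toy probe-level matrix: three hypothetical probes for one gene, four samples.
expr <- matrix(
  c(5, 6, 5, 6,   # probe_a: moderate expression, low variance
    2, 9, 1, 8,   # probe_b: highest variance
    7, 7, 8, 8),  # probe_c: highest mean expression
  nrow = 3, byrow = TRUE,
  dimnames = list(c("probe_a", "probe_b", "probe_c"), paste0("sample", 1:4))
)

# "pickmax": keep the probe with the highest expression
# (assumed here to mean the highest mean across samples).
pickmax <- expr[which.max(rowMeans(expr)), , drop = FALSE]

# "pickvar": keep the probe with the highest variance across samples.
pickvar <- expr[which.max(apply(expr, 1, var)), , drop = FALSE]

# "average": average all probes of the gene, sample by sample.
average <- colMeans(expr)

rownames(pickmax)
rownames(pickvar)
average
```

With this matrix, the "pickmax" and "pickvar" rules select different probes, which is why the choice of consolidation method can matter for downstream analysis.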

Value

A list of SummarizedExperiments or ExpressionSets, or a tibble, containing metadata and expression data for the queried datasets and genes. The metadata is expanded to include a variable number of factors that annotate the samples of a dataset, but it will always include a single "factorValues" column that holds data.tables with all annotations for a given sample.
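
As a minimal sketch of how the returned object might be inspected, assuming the default type = "se", a working network connection and the SummarizedExperiment package being installed: the accessors colData() and assay() come from SummarizedExperiment, and the assay name "counts" and the "factorValues" column are taken from the example output on this page.

```r
library(gemma.R)
library(SummarizedExperiment)

# The result is a list; take its first (here, only) element.
result <- get_dataset_object("GSE2018")
se <- result[[1]]

# Sample-level metadata, including the "factorValues" column.
sample_info <- colData(se)
colnames(sample_info)

# Expression values: rows are probes, columns are samples.
expr <- assay(se, "counts")
dim(expr)
```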

Examples

get_dataset_object("GSE2018")
#> $`1`
#> class: SummarizedExperiment 
#> dim: 18006 34 
#> metadata(8): title abstract ... gemmaSuitabilityScore taxon
#> assays(1): counts
#> rownames(18006): 1007_s_at 1053_at ... AFFX-HUMISGF3A/M97935_MA_at
#>   AFFX-HUMISGF3A/M97935_MB_at
#> rowData names(4): Probe GeneSymbol GeneName NCBIid
#> colnames(34): BAL_47c_A0B1 BAL_47b_A0B1 ... BAL_15a_A0B1 BAL_37_A1B1
#> colData names(3): factorValues block disease
#>
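
A further hedged example, using only arguments documented above: it requests the long-form output and turns on caching for the call. It needs network access, and the columns of the result are not shown because they depend on the dataset.

```r
library(gemma.R)

# Long-form tibble with both text and URI metadata, cached for repeat calls.
tidy <- get_dataset_object(
  "GSE2018",
  metaType = "both",
  type = "tidy",
  memoised = TRUE
)
head(tidy)

# Clear the cache once the cached results are no longer needed.
forget_gemma_memoised()
```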