R/convenience.R
get_differential_expression_values.Rd
Retrieves the differential expression result set(s) associated with the dataset.
To get more information about the contrasts in individual result sets and
the annotation terms associated with them, use get_dataset_differential_expression_analyses()
get_differential_expression_values(
dataset = NA_character_,
resultSets = NA_integer_,
readableContrasts = FALSE,
memoised = getOption("gemma.memoised", FALSE)
)
A dataset identifier.
resultSet identifiers. If a dataset is not provided, all result sets will be downloaded. If it is provided it will only be used to ensure all result sets belong to the dataset.
If FALSE (default), the returned columns will use internal contrast IDs as names. Details about the contrasts can be accessed using get_dataset_differential_expression_analyses. If TRUE, IDs will be replaced with human-readable contrast information.
Whether to cache the result for future calls with the same inputs and to reuse a cached result when one exists. Setting options(gemma.memoised = TRUE) ensures that the cache is always used. Use forget_gemma_memoised to clear the cache.
A list of data tables with differential expression values per result set.
In Gemma, each result set corresponds to the estimated effects associated with a single factor in the design, and each can have multiple contrasts (one for each level compared to baseline). Thus a dataset with a 2x3 factorial design will have two result sets: one with a single contrast and one with two contrasts.
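The contrast counts for the 2x3 example can be checked with a small base R sketch. It assumes ordinary treatment (baseline) coding and uses made-up factor names (`treatment`, `timepoint`); it only illustrates the counting argument and is not how Gemma builds its result sets.

```r
# Hypothetical 2x3 factorial design: one factor with 2 levels, one with 3.
design <- expand.grid(
  treatment = factor(c("control", "drug"), levels = c("control", "drug")),
  timepoint = factor(c("0h", "6h", "24h"), levels = c("0h", "6h", "24h"))
)

# With a baseline level per factor, each factor contributes
# (number of levels - 1) contrasts.
n_contrasts <- vapply(design, function(f) nlevels(f) - 1L, integer(1))
print(n_contrasts)

# The same count is visible in the design matrix: apart from the intercept,
# there is one column per non-baseline level of each factor.
mm <- model.matrix(~ treatment + timepoint, data = design)
print(colnames(mm))
```

Under these assumptions, the two-level factor yields one contrast and the three-level factor yields two, matching the two result sets described above.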
The methodology for differential expression is explained in Curation of over 10000 transcriptomic studies to enable data reuse. Briefly, differential expression analysis is performed on the dataset based on the annotated experimental design with up to three potentially nested factors. Gemma attempts to automatically assign baseline conditions for each factor. In the absence of a clear control condition, a baseline is arbitrarily selected. A generalized linear model with empirical Bayes shrinkage of t-statistics is fit to the data for each platform element (probe/gene) using an implementation of the limma algorithm. For RNA-seq data, we use weighted regression, applying the voom algorithm to compute weights from the mean–variance relationship of the data. Contrasts of each condition are then computed compared to the selected baseline. In some situations, Gemma will split the data into subsets for analysis. A typical such situation is when a ‘batch’ factor is present and confounded with another factor, the subsets being determined by the levels of the confounding factor.
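To make the idea of contrasts against a selected baseline concrete, here is a minimal sketch for a single simulated probe. It uses base R `lm()` rather than limma, so there is no empirical Bayes shrinkage and no voom weighting; the condition names and the simulated expression values are invented for illustration.

```r
set.seed(1)

# Hypothetical single-factor design with three conditions, four samples each.
condition <- factor(rep(c("control", "low_dose", "high_dose"), each = 4))

# Select the baseline explicitly; every other level is compared against it.
condition <- relevel(condition, ref = "control")

# Simulated log2 expression for one probe, with a shift in one condition.
expr <- rnorm(12, mean = 8) + ifelse(condition == "high_dose", 2, 0)

# Ordinary least squares fit (a plain stand-in for the limma fit).
fit <- lm(expr ~ condition)
coefs <- summary(fit)$coefficients

# Drop the intercept: each remaining row is one contrast versus baseline,
# with its estimated effect, t-statistic and p-value.
contrast_rows <- coefs[-1, c("Estimate", "t value", "Pr(>|t|)"), drop = FALSE]
print(contrast_rows)
```

Each non-baseline level produces one row here, which loosely mirrors the per-contrast log2fc, tstat and pvalue columns in the returned tables.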
get_differential_expression_values("GSE2018")
#> $`573187`
#> Probe NCBIid GeneSymbol
#> <char> <char> <char>
#> 1: 219118_at 51303 FKBP11
#> 2: 203308_x_at 3257 HPS1
#> 3: 210005_at 2618 GART
#> 4: 221791_s_at 51372|112268293 TMA7|TMA7B
#> 5: 202803_s_at 3689 ITGB2
#> ---
#> 22279: 212384_at
#> 22280: 203578_s_at 9057 SLC7A6
#> 22281: 203871_at 26168|100533955 SENP3|SENP3-EIF4A1
#> 22282: 218582_at 54708 MARCHF5
#> 22283: 205896_at 6583 SLC22A4
#> GeneName
#> <char>
#> 1: FKBP prolyl isomerase 11
#> 2: HPS1 biogenesis of lysosomal organelles complex 3 subunit 1
#> 3: phosphoribosylglycinamide formyltransferase, phosphoribosylglycinamide synthetase, phosphoribosylaminoimidazole synthetase
#> 4: translation machinery associated 7 homolog|translation machinery associated 7 homolog B
#> 5: integrin subunit beta 2
#> ---
#> 22279:
#> 22280: solute carrier family 7 member 6
#> 22281: SUMO specific peptidase 3|SENP3-EIF4A1 readthrough (NMD candidate)
#> 22282: membrane associated ring-CH-type finger 5
#> 22283: solute carrier family 22 member 4
#> pvalue corrected_pvalue rank contrast_1_log2fc contrast_1_tstat
#> <num> <num> <num> <num> <num>
#> 1: 6.184e-06 0.0004 0.0154 0.2811 5.3405
#> 2: 2.900e-02 0.1150 0.2521 -0.1410 -2.2800
#> 3: 1.496e-01 0.3291 0.4546 0.1121 1.4742
#> 4: 3.670e-02 0.1336 0.2748 0.1944 2.1744
#> 5: 4.798e-01 0.6676 0.7187 -0.0649 -0.7145
#> ---
#> 22279: 2.310e-02 0.0993 0.2330 0.1177 2.3784
#> 22280: 2.737e-01 0.4766 0.5743 0.0462 1.1125
#> 22281: 4.720e-02 0.1572 0.3004 -0.1823 -2.0590
#> 22282: 7.510e-01 0.8626 0.8706 0.0198 0.3199
#> 22283: 5.120e-02 0.1659 0.3086 0.2398 2.0211
#> contrast_1_pvalue
#> <num>
#> 1: 6.184e-06
#> 2: 2.900e-02
#> 3: 1.496e-01
#> 4: 3.670e-02
#> 5: 4.798e-01
#> ---
#> 22279: 2.310e-02
#> 22280: 2.737e-01
#> 22281: 4.720e-02
#> 22282: 7.510e-01
#> 22283: 5.120e-02
#>
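A common next step is to filter each result set by corrected p-value. The sketch below runs offline on a hand-built stand-in for the returned list, using a few columns from the first five rows printed above. The 0.05 threshold is an arbitrary assumption, and a plain data.frame is used in place of the data.table the function returns.

```r
# Stand-in for one element of the returned list (subset of rows and columns
# copied from the printed output above).
result_set <- data.frame(
  Probe = c("219118_at", "203308_x_at", "210005_at", "221791_s_at", "202803_s_at"),
  GeneSymbol = c("FKBP11", "HPS1", "GART", "TMA7|TMA7B", "ITGB2"),
  corrected_pvalue = c(0.0004, 0.1150, 0.3291, 0.1336, 0.6676),
  contrast_1_log2fc = c(0.2811, -0.1410, 0.1121, 0.1944, -0.0649),
  stringsAsFactors = FALSE
)
results <- list(`573187` = result_set)

alpha <- 0.05  # assumed significance threshold, not a package default

# For every result set, keep rows below the threshold, most significant first.
significant <- lapply(results, function(rs) {
  hits <- rs[rs$corrected_pvalue < alpha, , drop = FALSE]
  hits[order(hits$corrected_pvalue), , drop = FALSE]
})

print(significant[["573187"]])
```

Because the function returns a list with one table per result set, iterating with `lapply()` keeps the same code working when a dataset has more than one result set.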