

      Cancer type: TCGA-LGG


Type                 Data Source  Platform        Cancer  # Samples  Level
Expression-Genes     TCGA         Illumina HiSeq  LGG     529        3
Expression-miRNA     TCGA         Illumina HiSeq  LGG     530        3
Expression-Proteins  TCPA         RPPA            LGG     427        4

  1. Overview
  2. Genes
  3. miRNAs
  4. Frequently Asked Questions


  1. Overview
    Both the RNA-seq and miRNA-seq data for Low Grade Glioma (LGG) were downloaded from TCGA via the GDC data portal, and the PANCAN32 L4 RPPA (reverse phase protein array) protein expression data were downloaded from the TCPA (The Cancer Proteome Atlas) portal. Data access and functionality are organized by data type: Genes and miRNAs.
  2. Genes
    The Genes module gives access to gene-centric explorations of experimental data including protein expression data. This module is the default entry to the portal.



    It has three main panels: the gene search panel, the participant panel, and the experiment panel. The gene search panel presents options to search genes by identifier, either gene symbol or Entrez Gene ID. The experiment panel displays the kinds of experimental data presently available for visualization. For now these consist of gene expression and protein expression data. Gene expression data came from TCGA on the Illumina platform and protein expression data from TCPA on the RPPA platform. Once options have been selected for each of the panels, pressing the "Search" button initiates a search and displays the search results.
    • Searching genes by identifier
      The identifiers that can be used to search genes are "Symbol", which is the official HUGO symbol of the gene or any known synonym, and "ID", which is the Entrez Gene ID. For example:
      Gene ID: 1956, 4163
      Symbol: EGFR, MCC
      Lists of terms can be entered either manually in the text area, or uploaded from a text file. The terms in a list can be separated by any space character, commas, or semicolons.
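      The separator rule above can be sketched as follows. This is a minimal illustration, not the portal's own code: the function names parse_search_terms and classify_terms are hypothetical, and treating all-digit terms as Gene IDs is an assumption made only for the example (on the portal the identifier type is chosen in the search panel).

```python
import re


def parse_search_terms(text):
    """Split a raw term list on whitespace, commas, or semicolons."""
    # Adjacent separators (e.g. ", ") would yield empty strings, so drop them.
    return [term for term in re.split(r"[\s,;]+", text) if term]


def classify_terms(terms):
    """Hypothetical helper: group terms into numeric IDs and symbols."""
    gene_ids = [t for t in terms if t.isdigit()]
    symbols = [t for t in terms if not t.isdigit()]
    return {"ID": gene_ids, "Symbol": symbols}


# Mixed separators, as might come from a typed list or an uploaded text file.
terms = parse_search_terms("EGFR, MCC;1956\n4163")
groups = classify_terms(terms)
print(terms)
print(groups)
```

      A single regular expression keeps the three separator kinds interchangeable, so a list pasted from a spreadsheet column and one typed with commas parse the same way.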
    • Displaying a summary of experimental data associated with selected genes
      Once the gene search term list has been entered and the desired expression data options have been selected, pressing the "Search" button displays a summary of the gene and protein expression data associated with the genes of interest. The following screen shows the results obtained from filling in the search form as shown in Genes above.



      First, the table shows a summary of the gene expression data, followed by the protein data in a similar format. Each summary includes the gene ID, the official symbol, the chosen platform, the number of samples arrayed, the antibody (for protein results), and a quick overview of the expression distribution characteristics. For example, for gene EGFR, the mean of the distribution is 54.97 and the standard deviation is 123.00. To give an idea of how close to normal the distribution is, the means of the two tails (the areas outside the range [-0.675 sd, 0.675 sd], which corresponds to the interquartile range of a normal distribution) and their sizes (the numbers of samples in the Down and Up ranges) are also displayed.
      Second, the table contains links to external gene information from the NCBI Gene database, which can be useful for obtaining more information about the gene, as well as disambiguating in cases when you search by gene synonym and you do not recognize the search term in the displayed results.
      Third, the table displays the visualization options available to you at this point. These are Displaying gene expression distribution plots and Performing single-gene survival analysis, both of which are accessed by pressing the image under the header "Plots" for the gene of interest. When the results indicate availability of data for two or more genes (as is the case in the example above), visualization options also include Displaying heatmap clustering of gene expression data for selected genes, which is accessed by pressing the heatmap image for the platform of choice under the table.
      Fourth, the last column of the table gives the option to show the results of correlation analysis, accessed by clicking the image under the header "Correlation Analysis" for the gene of interest.
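      The distribution overview in the summary table (mean, standard deviation, and the Down and Up tails) might be computed along these lines. This is a sketch under assumptions: the function name tail_summary is hypothetical, the range is taken to be centered on the mean, and the sample standard deviation is used; the source does not state either choice.

```python
from statistics import mean, stdev


def tail_summary(values, k=0.675):
    """Summarize a distribution and its two tails outside mean +/- k*sd."""
    m = mean(values)
    sd = stdev(values)  # sample standard deviation (an assumption)
    low, high = m - k * sd, m + k * sd
    down = [v for v in values if v < low]   # samples in the Down range
    up = [v for v in values if v > high]    # samples in the Up range
    return {
        "mean": m,
        "sd": sd,
        "down_n": len(down),
        "down_mean": mean(down) if down else None,
        "up_n": len(up),
        "up_mean": mean(up) if up else None,
    }


# Toy expression values for one gene across eleven samples.
summary = tail_summary([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(summary)
```

      For a normal distribution about a quarter of the samples fall in each tail, so tail sizes far from that proportion, or tail means at very different distances from the overall mean, suggest skew.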
    • Displaying gene expression distribution plots
      The table for Displaying a summary of experimental data associated with selected genes contains a column called "Plots", which lists for each gene links for visualization of data associated with the gene and platform in question. Pressing the link associated with "EGFR" brings up a new window whose upper half displays the following graphs:




      These graphs capture the distribution characteristics of EGFR gene expression over the available samples. The gene expression values have been z-scored for each sample over the entire gene panel. The table on the right shows the p-values for differences in expression of the gene in question between the subtypes of the selected stratification, calculated using a non-parametric Mann-Whitney test. The difference in means and the 95% CI are included for each subtype. Each numbered subtype mean can be matched to its subtype by the order in the stratification column.
      The panel at the bottom displays options for stratification of samples. The default is stratification by IDH mutation status (IDHwt or IDHmut). Other stratifications shown in the panel can be selected instead. The patient stratification is based on the annotation reported by Verhaak et al., 2016. Furthermore, one may choose to stratify the samples according to increasing expression levels. There are four options available: "1Half vs. 2Half", "1Qt vs. 4Qt", "1Qt vs. 2-4Qt" and "1-3Qt vs. 4Qt". To rerun the analysis with new parameters, make the desired choices and press the "Remodel" button. See below the profile of EGFR after remodeling with a different stratification, where samples are stratified by histologic type (Astrocytoma/Oligoastrocytoma/Oligodendroglioma) and by "1Qt vs. 4Qt" according to increasing expression levels:
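      As an illustration of the four expression-based options, the sketch below splits samples by rank of expression. The reading of the labels (for example, "1Qt" as the lowest quarter of samples) and the function name stratify_by_expression are assumptions; the portal may place its cut points differently, for instance at interpolated quantile values.

```python
def stratify_by_expression(expr_by_sample, option):
    """Return (low_group, high_group) lists of sample IDs for one option."""
    # Order sample IDs by increasing expression of the gene of interest.
    ordered = sorted(expr_by_sample, key=expr_by_sample.get)
    n = len(ordered)
    half, q1, q3 = n // 2, n // 4, n - n // 4
    splits = {
        "1Half vs. 2Half": (ordered[:half], ordered[half:]),
        "1Qt vs. 4Qt": (ordered[:q1], ordered[q3:]),
        "1Qt vs. 2-4Qt": (ordered[:q1], ordered[q1:]),
        "1-3Qt vs. 4Qt": (ordered[:q3], ordered[q3:]),
    }
    return splits[option]


# Eight toy samples with expression values 1..8.
expression = {f"s{i}": float(i) for i in range(1, 9)}
low, high = stratify_by_expression(expression, "1Qt vs. 4Qt")
print(low, high)
```

      Note that "1Qt vs. 4Qt" discards the middle half of the samples, which sharpens the contrast at the cost of group size.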




    • Performing single-gene survival analysis
      The table for Displaying a summary of experimental data associated with selected genes contains a column called "Plots", which lists for each gene links for visualization of data associated with the gene in question. Pressing the link associated with "EGFR" brings up a new window whose lower half displays survival analysis graphs:



      The row of graphs contains 3 plots: the first is a Kaplan-Meier survival curve comparison for samples classified according to their subtype. Note that this graph is not gene-specific. The next two plots show survival analysis for each subtype group (IDHwt or IDHmut). In each case the samples are further stratified according to gene expression levels. The default options display analysis for stratification of samples into two groups: those with (EGFR) expression levels smaller than the median over the subgroup, and those with higher than median expression levels. A p-value for the significance of difference between the two resulting curves is also displayed.
      The survival analysis can be customized according to the options shown in the form. First, one can choose to perform the Kaplan-Meier analysis using any of the stratifications available on the page. Furthermore, one may choose to stratify the samples not just at the median expression, but to compare samples with extreme expression of the gene of interest (for example, the lowest quartile versus the highest quartile). To rerun the analysis with new parameters, make the desired choices and press the "Remodel" button.
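      For readers unfamiliar with the Kaplan-Meier estimator behind these curves, here is a minimal sketch. The function name kaplan_meier and the toy data are hypothetical. The p-value shown on the page compares two such curves; the source does not name the test used, so that step is left out.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times:  follow-up time per sample
    events: 1 if the event (death) was observed, 0 if the sample was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Gather all samples sharing this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored samples leave the risk set.
        at_risk -= removed
    return curve


# Four toy samples; the second one is censored at time 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
```

      Running the function once per group (for example, below-median and above-median expression within one subtype) gives the pair of curves that the plots compare.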
    • Displaying heatmap clustering of gene expression data for selected genes
      For gene searches that result in multiple hits, an option to display a heatmap -- a clustering of the gene expressions according to their similarity -- is made available. This is presented in the form of a heatmap icon following the table Displaying a summary of experimental data associated with selected genes. Clicking the icon in our example displays the following screen.



      The samples (columns of the heatmap) are annotated in three ways: first, according to cluster membership (the optimal number of clusters was determined using NbClust); second, according to subtype; and third, according to the status of a prognostic index (computed as a weighted average of the gene expressions, using the regression coefficients of a multi-gene Cox proportional hazards model as weights). The gene names are annotated with their respective hazard ratios in the multi-gene Cox proportional hazards model. The genes that were statistically significant (p <= 0.05) in that model are annotated with **, whereas the genes that are borderline significant (p <= 0.1) are annotated with *.
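      The prognostic index and the gene annotations can be illustrated as follows. This sketch assumes the Cox coefficients and p-values have already been fitted elsewhere; the function names, the label format, and the toy numbers are hypothetical. The index is written as the plain linear combination of expression values and coefficients, and the source does not say whether the portal normalizes it further.

```python
import math


def prognostic_index(expression, coefficients):
    """Combine one sample's gene expressions, weighted by Cox coefficients."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def annotate_gene(symbol, coefficient, p_value):
    """Label a gene with its hazard ratio and significance stars."""
    hazard_ratio = math.exp(coefficient)  # HR = exp(beta) in a Cox model
    if p_value <= 0.05:
        stars = "**"
    elif p_value <= 0.1:
        stars = "*"
    else:
        stars = ""
    return f"{symbol} (HR={hazard_ratio:.2f}){stars}"


# Toy values for a two-gene model (not real results).
coefficients = {"EGFR": 0.5, "MCC": 0.25}
sample = {"EGFR": 2.0, "MCC": -1.0}
print(prognostic_index(sample, coefficients))
print(annotate_gene("EGFR", 0.5, 0.03))
```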
      When search results involve more than 50 genes, we filter the genes by keeping the 50 genes whose expressions are most varied among the samples.
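      A sketch of this filter is shown below. It assumes "most varied" means highest variance across samples, which the source does not specify; the function name most_variable_genes is hypothetical.

```python
from statistics import pvariance


def most_variable_genes(expr_matrix, limit=50):
    """Keep at most `limit` genes, preferring those with the highest variance.

    expr_matrix: dict mapping gene symbol -> list of expression values,
                 one value per sample.
    """
    if len(expr_matrix) <= limit:
        return list(expr_matrix)
    ranked = sorted(
        expr_matrix,
        key=lambda gene: pvariance(expr_matrix[gene]),
        reverse=True,
    )
    return ranked[:limit]


# Toy matrix with three genes and a limit of two for illustration.
matrix = {"A": [1, 1, 1], "B": [0, 10, 20], "C": [1, 2, 3]}
print(most_variable_genes(matrix, limit=2))
```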
    • Performing multi-gene survival analysis
      For gene searches that result in multiple hits, in addition to Displaying heatmap clustering we analyze how the expression profiles impact survival. We perform two types of survival analyses: first, we stratify the samples according to the heatmap cluster membership (see the first annotation bar), where the optimal number of clusters is picked out algorithmically. We use a Kaplan-Meier model to analyze the differences in survival.


      We also stratify the samples according to subtypes as well as length of survival. The row of graphs contains 3 plots: the first is a Kaplan-Meier survival curve comparison for samples classified according to their subtype. Note that this graph is not gene-specific. The next two plots show survival analysis for each subtype group (IDHwt or IDHmut). In each case the samples are further stratified according to gene expression levels. The default options display analysis for stratification of samples into two groups: those with (EGFR) expression levels smaller than the median over the subgroup, and those with higher than median expression levels. A p-value for the significance of difference between the two resulting curves is also displayed.
      The survival analysis can be customized according to the options shown in the form. First, one can choose to perform the Kaplan-Meier analysis using any of the stratifications available on the page. Furthermore, one may choose to stratify the samples not just at the median expression, but to compare samples with extreme expression of the gene of interest (for example, the lowest quartile versus the highest quartile). To rerun the analysis with new parameters, make the desired choices and press the "Remodel" button.


    • Displaying a summary of correlation analysis with selected genes
      The table for Displaying a summary of experimental data associated with selected genes contains a column called "Correlation Analysis", which lists for each gene a link to a summary of genes correlated with the gene of interest. Pressing the link associated with "EGFR" brings up the following screen, which shows the table of results obtained from the correlation analysis:



      First, the table shows a summary of the gene correlation data, including the input gene symbol, the correlated gene symbol, the correlation coefficient, and the correlation p-value.
      Second, the table contains links to external gene information from the NCBI Gene database, which can be useful for obtaining more information about the gene, as well as disambiguating in cases when you search by gene synonym and you do not recognize the search term in the displayed results.
      Third, an option to display a heatmap of the correlation among correlated genes is made available. This is presented in the form of a heatmap icon. Clicking the icon in our example displays a screen like the one shown in Displaying heatmap clustering of gene expression data for selected genes.
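      To make the correlation table concrete, the sketch below ranks genes by their correlation with an input gene. It assumes a Pearson coefficient, which the source does not confirm (a rank-based coefficient is equally plausible); the function names and toy data are hypothetical, and the p-value column is omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlated_genes(target, expr_matrix):
    """Return (input gene, correlated gene, coefficient) rows, strongest first."""
    rows = [
        (target, gene, pearson(expr_matrix[target], values))
        for gene, values in expr_matrix.items()
        if gene != target
    ]
    return sorted(rows, key=lambda row: abs(row[2]), reverse=True)


# Toy expression values across four samples (not real data).
matrix = {
    "EGFR": [1.0, 2.0, 3.0, 4.0],
    "GENE_A": [2.0, 4.0, 6.0, 8.0],
    "GENE_B": [4.0, 1.0, 3.0, 2.0],
}
for row in correlated_genes("EGFR", matrix):
    print(row)
```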
  3. miRNAs
    The miRNAs module gives access to miRNA-centric explorations of experimental and computational data. This module is accessed by selecting the "MIRNAS" tab of the interface.


    It is analogous to the Genes module, and similarly has three main panels: the miRNA search panel, the participant panel, and the experiments panel. The miRNA search panel presents options to search miRNAs by identifier, either Symbol or Accession ID. The experiments panel displays the kinds of experimental data presently available for visualization. For now these consist of miRNA expression data from TCGA on the Illumina HiSeq platform.
    Once options have been selected for each of the panels -- by default there are options selected in all -- pressing the "Search" button initiates a search and displays the search results.
  4. Frequently Asked Questions