This project integrates the expression profiles from 73 differentiation-affecting knock-out mutants and WT (RC9) in mouse embryonic stem cells. Measurements were taken in two conditions after 24h: 2i medium preserving the naive state of cells, and differentiating medium (N24).
Our goals are to:
The shiny web app integrates multiple visualizations and summary statistics to make all information, from the gene level to network modules / pathways / GO terms, directly available for exploratory analysis.
The main part of the app consists of two different parts.
The content and functions in the different parts of the app are described in the following sections.
Display information on differential expression of single genes.
option | description |
---|---|
gene | Name of the gene |
avgRPKM | Average RPKM across all conditions (RC9 and KOs in 2i and N24) |
p.2i | Adjusted p-value of the changes in 2i (each KO vs. wild type) |
p.N24 | Adjusted p-value of the changes in N24 (each KO vs. wild type) |
logFC.2i | Log-fold-change in 2i |
logFC.N24 | Log-fold-change in N24 |
naive-association | Fraction of variance in the gene’s expression that is explained by naive marker expression (marker converse R2). It indicates how tightly the expression of this gene is associated with the core pluripotency markers (Nanog, Esrrb, Tbx3, Tfcp2l1, Klf4, Prdm14, Zfp42). Note: this measure is static for a given gene, independent of the selected samples. |
zX.N24* | Z-value (or mean of z-values) of this gene’s expression across all knockouts in N24. A high magnitude z-value indicates that the respective gene shows extreme behavior in the selected sample(s) compared to all other samples. |
zX.2i* | Z-value (or mean of z-values) of this gene’s expression across all knockouts in 2i. A high magnitude z-value indicates that the respective gene shows extreme behavior in the selected sample(s) compared to all other samples. |
*only visible if KO samples have been selected and committed.
Important note:
If multiple knockout samples are selected and committed, all p-values, log-fold-changes and z-scores are calculated as the mean of selected samples.
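As a rough illustration of the z-value columns and the averaging rule above, the following Python sketch standardizes one gene's expression across all knockouts in one condition and then averages the z-values of the selected samples. The function names, the sample names, the example values and the use of the population standard deviation are assumptions made for illustration; the app's actual computation may differ.

```python
from statistics import mean, pstdev


def z_scores(expression_by_sample):
    """Standardize one gene's expression values across all samples."""
    values = list(expression_by_sample.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD; the app may use another estimator
    if sigma == 0:
        # A gene with identical expression everywhere has no extreme samples.
        return {sample: 0.0 for sample in expression_by_sample}
    return {sample: (v - mu) / sigma for sample, v in expression_by_sample.items()}


def mean_z_of_selection(expression_by_sample, selected):
    """Mean z-value over the selected (committed) knockout samples."""
    z = z_scores(expression_by_sample)
    return mean(z[sample] for sample in selected)


# Hypothetical log-expression values of one gene in N24.
expr_n24 = {"KO_A": 2.0, "KO_B": 4.0, "KO_C": 6.0, "KO_D": 8.0}
z = z_scores(expr_n24)
z_selected = mean_z_of_selection(expr_n24, ["KO_C", "KO_D"])
```

A value of `z_selected` with a high magnitude would suggest that the gene behaves unusually in the selected knockouts compared to all other samples.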
Filters may be set on all columns of the table. For numeric fields, a range of values can be set in the header field by slider, or by manual input, e.g.: “0…0.1”
Presets can be set via check-boxes above the table. These are:

- Show only genes significant (adj. p-value <= 0.1) for N24 changes & 2i changes
- Show genes tightly associated with marker genes, i.e. require naive-association >= 0.65
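The effect of the presets can be sketched as a simple row filter. This is a minimal illustration in Python, assuming each table row is a dictionary keyed by the column names listed above; the function `apply_presets`, its parameters and the example rows are hypothetical and not taken from the app.

```python
def apply_presets(rows, significant_n24=False, significant_2i=False,
                  marker_associated=False, p_cutoff=0.1, association_cutoff=0.65):
    """Return the rows that pass every activated preset filter."""
    kept = []
    for row in rows:
        if significant_n24 and row["p.N24"] > p_cutoff:
            continue
        if significant_2i and row["p.2i"] > p_cutoff:
            continue
        if marker_associated and row["naive-association"] < association_cutoff:
            continue
        kept.append(row)
    return kept


# Hypothetical example rows.
rows = [
    {"gene": "GeneA", "p.N24": 0.01, "p.2i": 0.50, "naive-association": 0.80},
    {"gene": "GeneB", "p.N24": 0.05, "p.2i": 0.02, "naive-association": 0.70},
    {"gene": "GeneC", "p.N24": 0.30, "p.2i": 0.01, "naive-association": 0.90},
    {"gene": "GeneD", "p.N24": 0.04, "p.2i": 0.03, "naive-association": 0.20},
]

significant = apply_presets(rows, significant_n24=True, significant_2i=True)
significant_and_associated = apply_presets(
    rows, significant_n24=True, significant_2i=True, marker_associated=True
)
```

With these example rows, `significant` keeps GeneB and GeneD, and adding the marker-association preset leaves only GeneB.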
If no gene is selected, the avg. change of naive marker genes in WT is mapped to the sample t-sne map. If a gene is selected in the table, additional attributes may be mapped:
option | description |
---|---|
avg. naive marker changes (N24) | Average log-fold-change of naive marker genes vs. wild type in N24 for every KO |
N24 vs 2i (logFC) | Log-fold-change between the 2i and N24 conditions in wild type |
N24 vs RC9/N24 (logFC) | Log-fold-change KO vs wild type in N24 |
2i vs RC9/2i (logFC) | Log-fold-change KO vs wild type in 2i |
If a gene has been selected in the table, a plot of the log-expression values of this gene in all committed samples and the wild type is shown below. Additionally, the full annotation associated with this gene as used in the analysis (GO, Reactome) is available in two collapsible panes.
We carried out cluster analysis on specific subsets of the N24 knockout response. The results are accessible in the “Pre-computed clusters” tab. The clustering, encompassing the constitutive knockout response clusters and the induced N24 clusters, can be selected from the drop-down menu (“Select a cluster”). This initializes a heatmap showing the mean of the KO/N24 vs. RC9/N24 log-fold-changes for each cluster and knockout, along with the corresponding naive marker log-fold-changes (“Heatmap, mean logFC of clusters”). A cluster can then be selected for further inspection.
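The values behind such a heatmap can be sketched as follows: for every cluster and knockout, average the log-fold-changes of the genes assigned to that cluster. This is only an illustration in Python; the function `cluster_mean_logfc`, the nested-dictionary layout and all gene, cluster and knockout names are assumptions.

```python
from statistics import mean


def cluster_mean_logfc(logfc, clusters):
    """Average KO-vs-WT log-fold-changes per cluster and knockout.

    logfc:    {gene: {knockout: logFC}}
    clusters: {cluster name: [genes]}
    returns:  {cluster name: {knockout: mean logFC}}
    """
    result = {}
    for cluster, genes in clusters.items():
        present = [g for g in genes if g in logfc]  # ignore genes without data
        knockouts = sorted({ko for g in present for ko in logfc[g]})
        result[cluster] = {
            ko: mean(logfc[g][ko] for g in present if ko in logfc[g])
            for ko in knockouts
        }
    return result


# Hypothetical log-fold-changes (KO in N24 vs. RC9 in N24).
logfc = {
    "Gene1": {"KO_A": 1.0, "KO_B": -1.0},
    "Gene2": {"KO_A": 3.0, "KO_B": 1.0},
    "Gene3": {"KO_A": -2.0, "KO_B": 0.5},
}
clusters = {"cluster_1": ["Gene1", "Gene2"], "cluster_2": ["Gene3"]}
heatmap_values = cluster_mean_logfc(logfc, clusters)
```

Each inner dictionary corresponds to one heatmap row (a cluster), with one entry per knockout column.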
In this tab you can analyze genes of interest by defining a custom gene set. The left column contains the input field where gene names (MGI symbols) can be added. Genes must be separated by new lines; otherwise they are not recognized as multiple genes. The right column contains the ‘Map custom genes’ button, which checks for the occurrence of the custom genes in the KO data and the time course data. After clicking the button, direct feedback on which genes were mapped and which were not found is shown in the right column.
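The mapping step can be pictured roughly like this. The Python sketch below splits the input on new lines only and looks each entry up in two sets of known gene symbols; the function `map_custom_genes`, the case-sensitive matching and the example gene sets are assumptions for illustration, not the app's actual code.

```python
def map_custom_genes(raw_input, ko_genes, timecourse_genes):
    """Split newline-separated symbols and look them up in both data sets."""
    symbols = []
    for line in raw_input.splitlines():
        symbol = line.strip()
        if symbol and symbol not in symbols:  # skip blank lines and duplicates
            symbols.append(symbol)
    return {
        "ko": [s for s in symbols if s in ko_genes],
        "timecourse": [s for s in symbols if s in timecourse_genes],
        "not_found": [
            s for s in symbols if s not in ko_genes and s not in timecourse_genes
        ],
    }


# Hypothetical gene sets and user input. The comma-separated line is treated
# as a single, unrecognized entry because only new lines separate genes.
ko_genes = {"Nanog", "Esrrb", "Klf4", "Zfp42"}
timecourse_genes = {"Nanog", "Klf4"}
raw = "Nanog\nEsrrb\nKlf4, Zfp42\n\nNotAGene"
mapping = map_custom_genes(raw, ko_genes, timecourse_genes)
```

In this example, `mapping["not_found"]` contains both `"Klf4, Zfp42"` and `"NotAGene"`, which illustrates why each gene needs its own line.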
The mapped genes are visualized in three different panes that open upon clicking on their headers:
After opening this pane a heatmap consisting of three sections will be shown.
The heatmap contains log2FCs between each KO (columns) and RC9. Depending on which of the fields in the top left corner is selected, the log2FCs show the comparison of KO vs. RC9 at either N24 or 2i. A tick box at the top orders the KOs either by naive marker expression (if selected) or by clustering of the mapped genes over the KOs. A slider at the bottom adjusts the scaling of the color space.
The heatmap in this pane shows the changes of expression over time in relation to 2i (naive state). The order of the columns (time points) cannot be changed, as they are ordered by time. The selection fields at the top change how the expression over time is visualized: log2FCs, scaled log2FCs, TPMs, scaled TPMs or log10 TPMs. A slider at the bottom again adjusts the scaling of the color space.
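To make the difference between these display options more concrete, here is a small Python sketch of how such transformations are commonly computed for one gene. It assumes that the first time point is the 2i reference, that a pseudocount of 1 is added before taking logarithms, and that "scaled" means per-gene standardization across time points; none of these details are confirmed by the app, and all names and values are hypothetical.

```python
from math import log2, log10
from statistics import mean, pstdev

PSEUDOCOUNT = 1.0  # assumption: avoids taking the logarithm of zero


def log2fc_vs_2i(tpms):
    """log2 fold change of each time point relative to the first one (2i)."""
    reference = tpms[0] + PSEUDOCOUNT
    return [log2((v + PSEUDOCOUNT) / reference) for v in tpms]


def log10_tpm(tpms):
    """log10-transformed TPMs."""
    return [log10(v + PSEUDOCOUNT) for v in tpms]


def scale(values):
    """Center one gene's values to mean 0 and scale them to unit variance."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Hypothetical TPMs of one gene, ordered by time, starting with 2i.
tpms = [7.0, 3.0, 1.0, 0.0]
log2fc = log2fc_vs_2i(tpms)
scaled_log2fc = scale(log2fc)
scaled_tpm = scale(tpms)
```

Scaling puts genes with very different expression levels on a comparable color range, whereas the unscaled options preserve the magnitude of the change.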
The last pane contains a visualization of the original data points and the results after applying Gaussian process regression for the time course analysis. Each gene is shown in one plot and the plots are positioned on a grid. The number of columns in the grid can be adjusted by the slider on the top (1 to 5 columns). Each plot shows TPMs of original measurements (black dots) and the results from the Gaussian process regression (red line).
Note: If more than 100 custom genes were mapped, this plot is not shown. To plot more than 100 genes, split the gene list into batches of at most 100 and repeat the plot for each batch.
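Splitting a long gene list into batches can be done outside the app, for example with a few lines of Python. The helper `batches` and the gene names are hypothetical; each batch is joined with new lines so that it can be pasted into the input field.

```python
def batches(genes, size=100):
    """Split a list of gene symbols into consecutive batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [genes[i:i + size] for i in range(0, len(genes), size)]


# Hypothetical list of 250 gene symbols.
genes = [f"Gene{i}" for i in range(250)]
parts = batches(genes)

# One newline-separated text block per batch.
text_blocks = ["\n".join(part) for part in parts]
```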
The right column of the app contains additional panes for selection and visualization of data.
The header area field will show the current selection of knockout samples, if any.
The Knock-out sample t-sne pane serves as the central area to select single knockout samples or clusters of knockouts on which summary statistics (both gene-level and geneset-level) are calculated.
T-sne is a state-of-the-art visualization technique for high-dimensional data. It allows the placement of complex gene expression profiles in 2D, similar to PCA. However, in contrast to PCA, relative distances between points do not have a trivial interpretation and should be ignored. Instead, t-sne creates a non-linear embedding of the neighborhood of each point (expression profile), i.e. it preserves neighborhoods of similar knockout gene expression profiles.
At present, there are four subsets of genes that the t-sne projection may be calculated from:
There are certain actions that trigger interaction between different panes or different fields in the app. Some of those actions are already described in previous sections but will be mentioned here as well. The connection between different tabs and panels helps to analyze specific genes or KOs.
values to map to t-sne:
This option is found on the “Genes” tab and allows the user to change the colors mapped to the samples in the t-sne plot. The default option is the average change of naive marker genes. Three alternative options become available when a gene is selected in the main table of the “Genes” tab: the user can choose to visualize either the WT change of this gene in the different samples (N24 vs. 2i) or the change between the corresponding KOs and the WT (in either 2i or N24).
selecting a gene:
Selecting a gene from the main table in the “Genes” tab makes additional color options available for the t-sne plot. Additionally, a plot at the bottom left of the “Genes” tab shows the change of this gene in the WT and the committed KOs. The panes “GO annotations” and “Reactome pathways” show the GO terms and Reactome pathways that include the selected gene.
commit selection:
This button is used to commit a selection of KOs from the t-sne plot. This will have an effect on different panels:
go to cluster genes:
This button in the “Pre-computed clusters” tab will take the user back to the “Genes” tab and show a table that only contains the genes from the previously chosen cluster.
go to custom genes:
Takes the user to the “Genes” tab and shows a table that only contains mapped custom genes.