add Quantify metabolic activities using PAS assay and protocol

Commit 2382e65e authored 7 months ago by Viktoria Petrova
--- a/assays/QuantifyMetabolicActivitiesUsingPAS/README.md
+++ b/assays/QuantifyMetabolicActivitiesUsingPAS/README.md
--- a/assays/QuantifyMetabolicActivitiesUsingPAS/dataset/.gitkeep
+++ b/assays/QuantifyMetabolicActivitiesUsingPAS/dataset/.gitkeep
--- a/assays/QuantifyMetabolicActivitiesUsingPAS/isa.assay.xlsx
+++ b/assays/QuantifyMetabolicActivitiesUsingPAS/isa.assay.xlsx
--- a/assays/QuantifyMetabolicActivitiesUsingPAS/protocols/.gitkeep
+++ b/assays/QuantifyMetabolicActivitiesUsingPAS/protocols/.gitkeep
--- a/assays/QuantifyMetabolicActivitiesUsingPAS/protocols/QuantifyMetabolicActivitiesUsingPASProtocol.md
+++ b/assays/QuantifyMetabolicActivitiesUsingPAS/protocols/QuantifyMetabolicActivitiesUsingPASProtocol.md
+## Quantify metabolic activities using PAS
+
+Pathway activity score (PAS) was introduced to quantify the activity of different metabolic pathways in single-cell transcriptomes (Xiao, Z. et al., 2019; Kim, J.-Y. et al., 2021). The original method includes a permutation test that yields a P-value for whether the gene expression of a pathway in a particular cell cluster is significantly higher or lower than the sample average. Because the current study uses bulk RNA-seq rather than scRNA-seq, we modified the algorithm and discarded the permutation test, which is no longer suitable.
+
+There are three sources for the pathways used in our analysis. (A) We obtained the pathway data of Arabidopsis from PlantCyc (Hawkins, C. et al., 2021) (tables at https://pmn.plantcyc.org/organism-summary?object=ARA, downloaded on 18-NOV-2022). Pathways whose names contain the pattern ‘glucos’, ‘galactos’, ‘fructos’, ‘xylos’, ‘sucros’, or ‘maltos’ are considered relevant to sugar metabolism. (B) We are also interested in the potential to convert sucrose coming from the phloem into different sugars for secretion at different root segments, and these pathways are not clearly defined in PlantCyc. We therefore applied flux balance analysis to define the pathways of relevant metabolic reactions; see section ‘Flux balance analysis (FBA) to define pathways’ for the details of the algorithm. Table S6 shows the genes involved in each pathway. (C) We also assign each SWEET and SUC gene to its own single-gene pathway to facilitate our analysis. For example, the gene SWEET1 is assigned to a new pathway ‘SWEET1’, and so on.
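The name-based selection in (A) and the single-gene pathways in (C) can be sketched in Python as follows. This is a minimal illustration, not the code used in the study: the function names, the toy pathway table, and the gene identifiers `geneA`, `geneB`, `geneC` are hypothetical, and case-insensitive matching is an assumption.

```python
# Substrings from the protocol that mark a pathway as sugar-related.
SUGAR_PATTERNS = ("glucos", "galactos", "fructos", "xylos", "sucros", "maltos")


def is_sugar_pathway(name: str) -> bool:
    """Return True if the pathway name contains any of the sugar patterns."""
    lowered = name.lower()  # assumption: matching is case-insensitive
    return any(pattern in lowered for pattern in SUGAR_PATTERNS)


def add_single_gene_pathways(pathways, genes):
    """Return a copy of `pathways` with one extra single-gene pathway per gene."""
    out = {name: set(members) for name, members in pathways.items()}
    for gene in genes:
        out[gene] = {gene}  # e.g. gene SWEET1 -> pathway 'SWEET1'
    return out


# Hypothetical toy input: pathway name -> set of gene identifiers.
toy_pathways = {
    "sucrose biosynthesis I": {"geneA", "geneB"},
    "UDP-glucose biosynthesis": {"geneB"},
    "chlorophyll a biosynthesis": {"geneC"},
}

# Keep only the sugar-related pathways, then add the transporter genes.
sugar = {n: g for n, g in toy_pathways.items() if is_sugar_pathway(n)}
sugar = add_single_gene_pathways(sugar, ["SWEET1", "SUC2"])
print(sorted(sugar))
```

In this toy table the chlorophyll pathway is dropped because its name matches none of the patterns, while the two transporter genes each end up in a pathway that carries the gene's own name.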
+
+Brady et al. performed bulk transcriptomics on different segments of Arabidopsis root (Brady, S.M. et al., 2007). The experiment involved two replicate roots. In the published dataset, the 13 segments of root 1 are labeled *LCOLUMELLASB, L1SB, L2SB, …, L11SB, L12SB*, and the 12 segments of root 2 are labeled *Slice1JW, Slice2JW, …, Slice11JW, Slice12JW*. We found that the two roots are in different developmental stages; we therefore dropped root 2 and used only root 1 in our analysis.
+
+Given the matrix of gene expression across different samples, we normalized the data using trimmed mean of M values (TMM) normalization (Robinson, M.D. and Oshlack, A., 2010). In practice, this is implemented by the function *calcNormFactors* within the R package *edgeR* (Robinson, M.D., 2010). We used the function argument *method="TMM"* to call TMM and set *logratioTrim=0.3*.
+
+Let $g_{i,j}$ denote the normalized read count of gene $i$ in sample $j$. The read count of each gene is rescaled to give a relative transcript level whose average over all samples is 1. The relative transcript level of gene $i$ in sample $j$, denoted $r_{i,j}$, is defined as
+
+$$r_{i,j} = \frac{g_{i,j}}{\frac{1}{N}\sum_{k} g_{i,k}},$$
+
+where $N$ is the total number of samples and the index $k$ runs over all samples. The PAS of pathway $t$ in sample $j$, denoted $p_{t,j}$, is a weighted average of the relative transcript levels of the genes in the pathway:
+
+$$p_{t,j} = \frac{\sum_{i=1}^{m_t} w_i\, r_{i,j}}{\sum_{i=1}^{m_t} w_i}.$$
+
+Here $m_t$ is the number of genes in pathway $t$, and $w_i$ is the weight of gene $i$, defined as the reciprocal of the number of pathways that gene $i$ is involved in. Because $r_{i,j}$ is centered around 1, so is $p_{t,j}$. Thus, if $p_{t,j} > 1$, the expression of genes associated with pathway $t$ in sample $j$ is higher than the average over all samples; if $p_{t,j} < 1$, it is lower.
\ No newline at end of file
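The PAS definition in the protocol can be illustrated with a short pure-Python sketch. It assumes the counts are already TMM-normalized and uses hypothetical names (`relative_levels`, `pathway_activity_scores`) and toy data. Skipping genes whose counts are zero in every sample is an assumption of this sketch, since the protocol does not say how such genes are handled.

```python
from collections import Counter


def relative_levels(counts):
    """Divide each gene's normalized counts by that gene's mean over samples."""
    rel = {}
    for gene, row in counts.items():
        mean = sum(row) / len(row)
        # Assumption: a gene with zero counts everywhere is skipped,
        # because its relative transcript level is undefined.
        if mean > 0:
            rel[gene] = [g / mean for g in row]
    return rel


def pathway_activity_scores(counts, pathways):
    """Return, per pathway, the weighted mean relative level in each sample."""
    rel = relative_levels(counts)
    # Weight of a gene = 1 / (number of pathways that contain the gene).
    membership = Counter(g for genes in pathways.values() for g in set(genes))
    n_samples = len(next(iter(counts.values())))
    scores = {}
    for name, genes in pathways.items():
        used = sorted(g for g in set(genes) if g in rel)
        total_weight = sum(1.0 / membership[g] for g in used)
        if total_weight == 0:
            continue  # no usable gene in this pathway
        scores[name] = [
            sum(rel[g][j] / membership[g] for g in used) / total_weight
            for j in range(n_samples)
        ]
    return scores


# Hypothetical toy data: gene -> normalized counts in three samples.
counts = {"geneA": [10.0, 20.0, 30.0], "geneB": [5.0, 5.0, 5.0]}
pathways = {"P1": ["geneA", "geneB"], "P2": ["geneB"]}

pas = pathway_activity_scores(counts, pathways)
for name, row in pas.items():
    print(name, [round(p, 3) for p in row])
```

In the toy data `geneB` belongs to two pathways, so it contributes to `P1` with weight 0.5 while `geneA` contributes with weight 1. Each pathway's scores average to 1 across the three samples, which matches the statement that the PAS is centered around 1.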