diff --git a/assays/GeneExpressionComparison/README.md b/assays/GeneExpressionComparison/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/assays/GeneExpressionComparison/dataset/.gitkeep b/assays/GeneExpressionComparison/dataset/.gitkeep
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/assays/GeneExpressionComparison/isa.assay.xlsx b/assays/GeneExpressionComparison/isa.assay.xlsx
new file mode 100644
index 0000000000000000000000000000000000000000..b66528c2d53360cc64e0d680fe2abbefccaad75c
Binary files /dev/null and b/assays/GeneExpressionComparison/isa.assay.xlsx differ
diff --git a/assays/GeneExpressionComparison/protocols/.gitkeep b/assays/GeneExpressionComparison/protocols/.gitkeep
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/assays/GeneExpressionComparison/protocols/ComparisonOfGeneExpressionAcrossGeneSets.md b/assays/GeneExpressionComparison/protocols/ComparisonOfGeneExpressionAcrossGeneSets.md
new file mode 100644
index 0000000000000000000000000000000000000000..6fb75a50b7ead3e77fc075a91926ae54d03b490a
--- /dev/null
+++ b/assays/GeneExpressionComparison/protocols/ComparisonOfGeneExpressionAcrossGeneSets.md
@@ -0,0 +1,7 @@
+## Comparison of gene expression across gene sets
+
+To compare the mean relative expression of *R*-genes (359 in tomato and 581 in potato) with that of non-*R*-genes (the rest of the genome), we generated 100 replicate datasets for each transcriptome by sampling the TPM values of 359 random genes from tomato and 581 random genes from potato. The average TPM of all expressed genes was calculated for each replicate dataset. To compare expression values, four reference genes were used: ubiquitin (*Solyc09g018730.4.1*) and actin4 (*Solyc04g011500.3.1*) for tomato (Müller et al., 2015), and an importin subunit (*PGSC0003DMG400007289*) and elongation factor-1 (*PGSC0003DMG400023270*) for potato (Mariot et al., 2015; Tang et al., 2017). TPM values were tested for normality using the Anderson-Darling test (> 5000 data points; Thode, 2002) or the Shapiro-Wilk test (< 5000 data points; Shapiro and Wilk, 1965) and for equal variances using the test from Kendall (1938). Significant differences in expression were identified using a Mann-Whitney U test (Mann and Whitney, 1947) for non-normally distributed data or a two-sample t-test for normally distributed data.
+
+We visualized *R*-gene expression using heatmaps created in R (v. 3.6.1). Genes were classified as off (TPM < 1) or on (TPM ≥ 1). In the heatmaps, libraries were clustered by the similarity of their expression patterns, and *R*-genes were sorted by the number of libraries in which they were expressed. Correlations between 1) the total number of expressed *R*-genes and the total number of expressed genes, 2) the total number of expressed genes and the number of pseudo-aligned reads, and 3) the number of libraries in which an *R*-gene was expressed and the average expression level of that *R*-gene were assessed using Spearman’s rank correlation test (Hollander et al., 2013).
+
+To investigate the extent to which expression patterns of *R*-genes were similar to those in close wild relatives of tomato, we evaluated additional transcriptomes of four wild tomato species: *S. peruvianum*, *S. chilense*, *S. ochranthum*, and *S. lycopersicoides* (Beddows et al., 2017). A subset of *R*-genes was further analyzed for patterns of sequence variation within and between these wild species. Standard population genetic parameters, including intraspecific variation (π) and interspecific divergence (K), were estimated using DnaSP v. 5.10 (Librado and Rozas, 2009).
\ No newline at end of file
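
The resampling step in the protocol above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the study: the function name `replicate_means`, the fixed seed, and the toy TPM values are all hypothetical. The protocol does not say whether genes were drawn with or without replacement, or which cutoff defines an "expressed" gene, so the sketch assumes sampling without replacement and reuses the TPM ≥ 1 "on" threshold from the heatmap step.

```python
import random
import statistics


def replicate_means(tpm_by_gene, set_size, n_replicates=100, min_tpm=1.0, seed=0):
    """Mean TPM of the expressed genes in each random gene set.

    Assumptions (not stated in the protocol): genes are drawn without
    replacement, and "expressed" means TPM >= min_tpm.
    """
    rng = random.Random(seed)    # fixed seed only so the sketch is reproducible
    genes = sorted(tpm_by_gene)  # sorted so the sampling order is deterministic
    means = []
    for _ in range(n_replicates):
        sampled = rng.sample(genes, set_size)
        expressed = [tpm_by_gene[g] for g in sampled if tpm_by_gene[g] >= min_tpm]
        # A replicate without any expressed gene has no defined mean: record NaN.
        means.append(statistics.fmean(expressed) if expressed else float("nan"))
    return means


# Toy transcriptome with made-up TPM values (not data from the study).
toy_rng = random.Random(1)
toy_tpm = {f"gene{i}": toy_rng.expovariate(0.2) for i in range(2000)}

# 100 replicate datasets of 359 random genes, matching the tomato R-gene set size.
null_means = replicate_means(toy_tpm, set_size=359)
```

The resulting list of 100 replicate means serves as the background distribution against which the mean TPM of the *R*-gene set is compared. The normality check and the choice between a Mann-Whitney U test and a t-test would then be applied to these values.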