diff --git a/README.md b/README.md
index 60cfd426318fd6e010c55305471ed2321eb55cc5..28de16e3b13ced0767e7ce3df0badd373bb50a93 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# From Habitat to Genotype: The Complex Interplay of Climate, Phenotypes, and Taxonomy in Teosinte
+# TEOSINTE Project: Comparative Genomics of Maize's Wild Relatives (Teosinte)
 
 ## Description
 
@@ -24,12 +24,16 @@ This project is funded by the German Ministry for Science and Education (BMBF 03
 
 ## Data and Code Availability
 
-- Genotyping-by-sequencing (GBS) data was provided by Mexican patterns. The data is publiclly available under the following publised papers:
+- **Genotyping-by-sequencing (GBS) Data**: The GBS data used in this study was obtained from publicly available sources and is detailed in the following published papers:
+  - [Ecogeography of Teosinte](https://doi.org/10.1371/journal.pone.0192676)
+  - [Genomic Diversity and Population Structure of Teosinte (Zea spp.) and Its Conservation Implications](https://doi.org/10.1371/journal.pone.0291944)
 
-  - [Ecogeography of Teosinte](https://doi.org/10.1371/journal.pone.0192676)
-  - [Genomic Diversity and Population Structure of Teosinte (Zea spp.) and Its Conservation Implications](https://doi.org/10.1371/journal.pone.0291944)
+- **Repository**:
+  All code, scripts, and associated files used for the Genome-Wide Association Study (GWAS) are hosted in this repository.
 
-- All code is contained in this git repository. Because the material (data) is contained in large files these cannot be included. Please write to the authors with affiliation to the Forschungszentrum Jülich (FZJ) to obtain the data files. For FZJ researches All data and code is stored on the compute cluster of the Forschungszentrum Jülich, Institute of Bio- and Geosciences (IBG), Bioinformatics (IBG-4).
There all code regarding the population genetics analysis is stored in this directory: `/mnt/data/joseph/TEOSINTE`
+- **Data and File Tracking**:
+  - Large datasets and other outputs generated during the study are stored and version-controlled using **Git LFS** (Large File Storage) for efficient tracking of changes and sharing of substantial files within the repository.
+  - The specific data and outputs related to this GWAS study are included and organized in the repository for direct access.
 
 ## Directory Structure
 
@@ -63,6 +67,12 @@ This project is funded by the German Ministry for Science and Education (BMBF 03
 
 ---
 
+## Steps in Analysis
+
+The sections that follow outline the key steps taken to explore and analyze the relationships described in the manuscript *From Habitat to Genotype: The Complex Interplay of Climate, Phenotypes, and Taxonomy in Teosinte*. Each step is detailed to ensure reproducibility and transparency in the analysis.
+
+---
+
 ## Phenotype and Climate Data Analysis
 
 ### Hierarchical Clustering and PCA Analysis
diff --git a/runs/README.md b/runs/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..d67d5e0573ceacc7a7bbb63afec573b7d0883f68
--- /dev/null
+++ b/runs/README.md
@@ -0,0 +1,24 @@
+# Directory Overview
+
+This directory contains results from GWAS on 237 climate variables and 18 morphological traits, tracked using Git LFS due to large file sizes. Analyses employed various mixed linear models, with Bayesian Information Criterion (BIC) used to identify the optimal population structure for each variable or trait.
+
+---
+
+## Structure
+
+Subdirectories
+
+Climate/Morphological Variables:
+
+- Each variable has a dedicated subdirectory containing model outputs and results.
+- Morphological Traits:
+- Similar structure, with outputs for each trait.
+
+Contents of Each Subdirectory
+
+- Summary Files: Key SNPs, p-values, and effect sizes.
+- Model Outputs: Results for different mixed linear models.
+- Plots: Manhattan and Q-Q plots for visual insights.
+- Diagnostics: Residual analyses and BIC comparisons.
+
+---
diff --git a/studies/initial_data/README.md b/studies/initial_data/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..c5b4f9fb9e1cac73a50cfc8406929112890b69f6
--- /dev/null
+++ b/studies/initial_data/README.md
@@ -0,0 +1,3 @@
+# Contents
+
+Data and scripts provided by Alicia Mastretta-Yanes to the German team in December 2021. The data includes metadata, SNP data, and admixture output for teosinte. See the [resources](./resources) directory for further details.