From cf65048747c434351f3b44493ba3fce307d1bdc4 Mon Sep 17 00:00:00 2001 From: Viktoria Petrova <vipet103@hhu.de> Date: Mon, 20 Jan 2025 11:43:56 +0100 Subject: [PATCH] add data study and protocols --- isa.investigation.xlsx | Bin 2507 -> 7441 bytes studies/Data/README.md | 0 studies/Data/isa.study.xlsx | Bin 0 -> 7491 bytes studies/Data/protocols/.gitkeep | 0 .../protocols/DataOverviewAndPreprocessing.md | 5 +++++ .../protocols/FilteringFlaggedSequences.md | 3 +++ studies/Data/resources/.gitkeep | 0 7 files changed, 8 insertions(+) create mode 100644 studies/Data/README.md create mode 100644 studies/Data/isa.study.xlsx create mode 100644 studies/Data/protocols/.gitkeep create mode 100644 studies/Data/protocols/DataOverviewAndPreprocessing.md create mode 100644 studies/Data/protocols/FilteringFlaggedSequences.md create mode 100644 studies/Data/resources/.gitkeep diff --git a/isa.investigation.xlsx b/isa.investigation.xlsx index f0b490a88535575f2a3a8668373309b9575e80a0..cad7afb6cf4d6a73b0a960fe2627d7f4925c598e 100644 GIT binary patch literal 7441 zcmai3bzGEd(_W;z7NonokrG9eWl3q45@A8QI|NDTMp_V%M(J*l?na~=q#Na1<)G*F zocDY0Klb<hcIUchc4qFGd9GVg1|9(o005u>c2xKvkuy!Gns5NX9s&S>5BpZ@k(H%A z)Y4v0#l;$Gr^DfFVO|&`XW7Dq<990dc$poHp`)VuxFw^khcNdDFybu1MmBTsrg6pz zk13CZD(@LSYl7geTWTAzLPe<bI1xxBDfp=X5OsQ(+2U^XS#o1`I88c=nzs$1-N#3G z(TkR%b1s>_N$e{bVMF55($G*$#Z^&AkqQSN&ba_vW`Hd!ddz9-f)86t=nDf9E$pWu z?=_mc+m@j{;inJpkl=M4>Q0u}!hIptkyx(L);bP|qCg>bawN$se-DT5D=Phhbqy|{ z!6{XC5G|V6_Y4U$e|fw!6upOYEp7@LjnAy3x`7wVs*YJ{xD5X)gM?dvVQahq&I0+{ zmAq_;))PT2+oRns+;fkm@2(!c{n^MM8{-H|v6KF?aP3DAj&b2BLz0lsc03lMb@ZfH zzMFVqD)(E&WR=c~?&D&fUF&>LZmtkHHut68?0F}KH+8Z3@wo`A-zI{Vqj<q0sp4Oe z0Q`&u%=y0-66{IO7HV$Cc`I`ht%Ee_=?&O#GIsy~^j|U@KO_sIn&n%$Fdpo>KVH5} zv;?7~XVp?=&NA@bZ=$m#Ef0Oy=JR#;b;JXVP!r@Zo9#8@o~=I7(9KY#6EY08d=emH zPk%svQ_pu38*CPIqySoE9_vhuq*bGDvb5mbuKI9I0#GNCYvkEmhYB&dH>_d6ZiOBb z3Fd*(jX=kyu%o=9UHgkJuT|gNiv9=FHS{^(xQF@8Ew!vRB6svNsEi?3-AsnE<gdM5 zMt76_)mPJ56bILI)Fx`L#NMZZ2GieGB)`B_&Xr&6fiyah&FG-64TQ)@R9;t;CHlRh 
zb<#vNxZ)XR6dZI1+<N4nUW12u#M%6p%l;{VdL!^}p7o{ssP@CN)iBSJ{9Vk+%GT_; zm6h3#lW^;3(^f7V$=z>gA5S5U5HNG39kaK#)+xMiS%s{*^b0|YZ>OrHATpG6<g~Ve z$%Id!t-jNHfeNyt$i;72z&Pu2AL@63?tC+s#yD-$T4D&{Xbg0dqfT3bWPEGnwjOkm za-z)Ez59>BJXlh4j|0~a<w%EfgnCB1a{6?gnAD-?bR(bPH_;V7mZA;!t5GE~FTSjM zG;@Na=OYxdeh&7fOZD&NBuNoAM3mAoq>PwYQkUtl`_gWAktT|juZ4jftTt~iaoNu- zzE}79{EV=7DT+&vdZ&x}fOwhj;&I=nTw<5~rA^BJ?r5wV2mNr3-3utx{?9a}6q?os z!J=^w8vr2shwM)X_YZ%@#_L!maRL2zPg!wJF(=*WB&G9;G$*8!_$*Q!ckrS$5JN(u zi!P5vsAcT8pO-g0Xz^V5D0F|fSy%hoWOKermD8{mql*p;{!5%K?eq@M)v>h`Ke~@@ zsbj0@)B*IY<#NyUSbv{wOu2DM8yWE85?VD<lO*+2@2hj`#p`WnbTGYYD<9vOz@|w1 zl#z38E-lF?mr<@Ef5sN=`T**DaEv>9k4LMP$15gXUd)kQr;M$EvTS$kBK=L1Ii1*& z4ZuE=X}TWL3nIRitDcJ3AhGeish6PcZeF(+buEG!Dkbr;serH7l6o)ga0zfD)Ph63 zIPX0XQ>=sd3mST+MQ}mb{R4LWr9IV~{ngF&wkQwIQ-gQpTS=h$RHrqB?AP7cbx0<} z9Y%A71py8E*@t~SiKp~!X2eEQ&_MnX^K!)lA+Jt+jg>I+iOAlrPPctuBd(2uvu#XK zfR0!hcD^$%xms|JFS_DgisOZ+ZL~{uZ2|FA6y$4J_v?WBQ8v1(tvxuL<ZCPU>&W&? zY`y^Z*R|@H7gz3ndF#<OZSP)?FrdNPmZWK*Ur0omzxP_%`>;2%itfZ#A|x{GGkqH) z6+&&WA}d|?Cg{oH%Khh)1CilvJzfYR5%-j$GIjNv8R#6<67y&|rC0f>9GO2VkEoD$ zN!u2~eGD}*$9~gjfy8qRp)-eWcF!i<#S@7W26OHour+6Ah55eM&CZ%E8;#rpCOMuk z;LSuCjZY`!#nx743-gba=R!=*x>V4W>v>1px^o7y!Lr9+RLEtdbJZiP)1~o5K>mr- z4|zs=v7*vf9V0LtOp&`3jbRVvAAEerM#~8fSW<xT{i6%u&2o?Qs(aZ*Sx3{&iU#OQ zwpkMl{)>0wYqU317nq8ybC(1q>siR!2{`)5LB(iNHhwbX*F|mlE%a3_S-i-@W&>&S zt`XDXZ&6a1gez9&-b@EoV4Lb<QfZ(#2-jo4Dnu)};n2HLh``!77M6@Qxz;luFxbOV zlJToT(WwqqtQRVlaRYQxI*S0GMA!d&Bc8A8Eb8Rq2o3uIGrJc$vNn4+ZhC;c=jY>x zZ7k`O%T^zUY8OQL_&5WKmM}7U(w&a6Z%^t?lfZ0gvnqp?u{jg~0C;oy>|W^GLJgJe zZA~nV?S7oR6t%vvX)c_GQ|5wXef)}Rf)8Nog256m&}5zzBXJ=8r|b%!oIJ2c;C*>3 zkLdf-0lr0(bfg)n)<z8p&a0P364$E{wk60}2|)OKrxgtZNjd|D3V1Ve*_vIp3`_#< z6{V)83$*DP;)ZoI3lx7eu@86#j+EwFQmC0KdR^owN9LK}+-9lr@?&|pP=`Culq~f` z`bkByWHsob61d>E90q6-p6T%rrC@~dZ<`5hCoMduU^5Dy8EDfhP^S$qsC!*`uu5BO zm83s+{}tLZ01JIXuL^>pyz+|!1tO~>aUiD2tB{9MK|8XT=E5)cuAaGWA&s=xEdt#J z{Vg`Qqht+m{4I!bc)o_cCUQQVuJ#;Drg@eP=~a4|42Itv2v3WQf;b&x(I6#K5pV|| 
zfU2qI9Fx%w$qRV`@C`{2BNTBLd|t6H+8<sfk?bR^8|!zM^&8~mYM(PZC-@~lHbf-L z@t^jZ9#5`JWt*Th9(n3{GE?_;-yNGfA#5ff<n9Ph)tdHk@+s#|gfz+Sjt8jO#m&3> zFvijdFSj$qM42-bD*=#^kS=V@t_ZS<u@A(iZckQOHWCM)?dpLn0t#ZiKUWC%m%b9? zmAVQs3c`zTh3j+f#%;1rc5%YU39ouYkpn11pJPM`pVCI46*QFoYIjH!%Z4~DH1as3 z_GD~4=**?cDyNHMy(Eb7{Ho?Df*;zNCW#tyu}<6wVUxuuTQlxgPA8yt6y;aas|0GR z^4k1obtPeF9X+d39X(-o)2t1QzVY=!D^P=n&iO42vSP|p@BCR+uas@3Y*cKf)S|4X zJd~`a+_UF{^2-B*@9YwDU^MSFTt%@31F*%t2qerRdl1Cyc4}m?3#~0f9JU-!?FC6Y zi0kWS70zOp2RZD*u8JcI+9ojHg<dQAzl6KZ^8eb`?O#AGpugJx&0&b@=VuIojZ<w@ zH}frQKDZJ4`T4U8?Pw}k%?JRmSzdiDGSoUqduyrR<2WK<@~UHJkd}6Y&;ZN_VtM@( zNHjgd0gq(g$^q{<qfeyv3{h>n1RMWt`?yD-nDlqrX_cf3lOcT@KvJgX?*6A!3QE1U z{0W6erep>jD8Ue;^B@b7dk0eyl1p(eR)d`UvX4MOOk60gWdbi-@76?720QUQ4k;D+ zYxYqdc;@$7arEv)PZrUV7Ew||iLfJmIViOrg~>lK0~H2YOcdZmcJk2xUk%p}1)fb3 z(XsSp9Z>icgMtj3-d{B-828F#PA+67Uj-)YF$sp@S|o4{M5JKs<?>+$`T8QWkI0K> zJM83@12ak<2ry14h}%_HH(4u5ubiD(Gq2w(Qa0l}R_FPaRHa+5zBLwip#(Va&@j<_ zSE{?8EF^&3ok%h}>skp>+Y|aWhUq>d_Z{f&;O8W;_pPC_L?0`Le-^WD#>BBE&(q1T z*kXRmv36^ez#R>Z=`{oCi<v^z`gYU!2;6rw5a$txi{oyx1_{mT_3mX_O`sXI@bfxs zuhD8yRA)pfGX*uJHATj91inE)2s}z1bI=<v{o3cJJ1p*uFXgAfVnG#L-O|R78YYRr zXGAS)hAdjE?1QGORGfNs_uQ_ow)c096;hMMzB}AM&HgM9)ZtP+A?|TFfUQUJ^7>?H z>;2s3K27+`;~w?m?`MZ{n^_WDwZ_G%;%?{PCSHEKQh~R!vwstS4GQ$LK~5ub(nLWY zmG#0tua(BQBMC$W0+islyB@%ONubVab@*V~a!JPHhviIrU#W@$K{QShzK+}%%{rcY zT3@?1d8SMIe5GwjLtH5Qu&Ne?d~5Y&oa}S(dz=Hw_rvHPB<#{&RSpqJX^?dB3N&lb zo_OK37uB|{$$p6BqriFR6F5!>Z_9KnF++#Y*lUl_JQif(xJyd@#9_1cOyhA&#RHZ1 zc*^Y|f{JWUbD$v-iBeywh}S_By;a3T+MvB7B4wi^e+`tBUWznkd#gwOL$x09{SpLS z^a<+`B7q-8OL`b|y;^ePcwfH~J;{FaAdto&O`07w`)fodw&#PV>9xD~Y;zV`b3PcE z*b&Rpy$)J;cpb$1*HQFS4g|1cx2IPnNy$^33w%$oI)Eu}sXIFAlY~W+c*Hie;0$9e z3QGIfGdoO&!#{ZOFz1@_h0#l(PQi;BA-X#no4q#n&%oKI>2qQ*?L3-~8Oa_N+f;~A z0XvtN9Mhdh2KHw;h(7>p(0inZPY{QHaR~SB9dhfOcZ&0F^R%47z-hnmBBo;Evfm(- zd|nSKNm2T!!78XExYLa}4ynhFS}$xLfJ`lUCYG$dBM`l-w+YOOIll8A<&=`mtcEG< zggcooUbPRZdib)yH0b02%5JCGqdm@(RuHEd%c=IAT_KO2&3%QJCWfMuxLt&X-5p{3 
zn?5Ddh&)S~dn+I2TcI@h=1?K<43hByI*pGHKu$l=R=8E8S2f&^8+r%KSgPm*acy7V zAT@foa<G`ifI?MG5-@DWjpdiv`^4;cY^FBrqD+v`<&I_E(G{I-bS_;U3EuJMIVff) z`1#Q2hZiw(Z7tm{h**s9^j57pj0Ok`NYrAoS210jb#4o(96Z@I+d)Pya%(N3g%O>8 zNno10{%~XzTdjAuw~Ul`H-qAy)bDe<TUe|jk>rXkA)IKBA@bi(tmwtP8bRo@M!S%V z%Ecj<VjqdOd(^04lweC;3ql}f^N1-Fr^B=JiNbj<wu?kkSQX3aaCt~SQ6?_lqq!yD z@ecO{+T59kKEOji{hks<UYHGO7`b_`?OrB#V{ntAWqmS{HON;RVh3{Os8UzcmQg<} z%+anWs9?$%0rV-vxbqCmo483cR))oFT;#E(pkjae7JScPj-xD&4;?GQRiE(EIlw(k zRdt|yY_-Oy)$@>fe*sw{4Dv{!Fz>yZy`%QChY?9dYR@Xc>4M^Wd>Y{pR(194`cN$k zM`wh{IX_eWY$Zd^Px9$kCxoYwnhYSRyHhJFB;1Vbhv_>N<i(4w^tB5oVEM6+GMJS* z?BzO@kgwIy+L>I@JMTKF*lMRfdz=}k^{>XCY@BtpdsH!%7oV|}X+V&*a;;r$^ooFb zpGgte4nL@E2<G?67mgNGT++XvF^NgkEmYnduXC-OJO2En^tzKTO!>n|k*j>+1k+e| zs=BSIqeSBn+gJ)XYiD#OG<C$2uadvOeZQ8=s(Wj?@wUg^G%rLEFI`92BnJ;`V3>bR za`rCf(4Qk*R?M%LC$>3dcucm}UZy5ND!EB4{LzkEUeW~@@WdmLa+kXzNO*V-)7dB8 zpNhlN4lj7CH!>e^_VQ`x1CiD5KanM<SvHMdEKd5IWH73c2WBaI&V2s^(wre>R0jou z+cd8cE{;m{Vtx5>R7Cb9uXGZt4lg2YtUwfGJ(1kRwDrYQm9$dC;Hd!(v$``C8+2bB zN4n2W%gry2zudhCu`(dlFOn_pR3r;?7R`DZkr(ooBT&w{U{Izw2&79iyWXt}7qgzX zi0HQC@Q~1#5Fhn}6)hs}eyje1rjiMiRApUFbJF9jNI;KqR%8BWOd*@j?kEX#iwHlZ z-Yx`tmg;+N@3JV738xFTAU(954AQ;gMtLYMHDu7jPT+1<h2N!^ZbdS0OD$g{A&R1T zhtx9RsgV63krfvIo2S<NNU~|C;oB+c0aPLV3`dxwv!==|s>vryOwiZQV*OD4cTbAS zYZxpCA(QKD%tEs!kNqA|7>B?;LuY7<t}UL+Lw8iC9_!YgzT|1Q-k;qJrzRq4h2ym+ zyG|DPFc<n*$DvC+{ZiuYQFPE}>Fl#$H`3=4Z!LQKRyMKyy)_-9Cp<@MKG|v_64B?s zVBKEjd$Y*=y%+qYdoSj=Q?HD?H2bVMMFpppq@WZ#bVqBHBy};(DL*{;Sck&Z?j`@J z(FEH8;gIRh<I<Ti*A&jH!NzY6C#G}zMxO$ZBA~QXJt4v5ed2PEpaEf@Z=TmpC)KBq zi7wMJnobYQ#P^K2Q``>EgbxmwP$u@x9qz6SC7iB@Hu@^Odd5S^&K@so-OjON)vtPw zZv8tHci9=w@S>q~?lwXHeF!tOGElU&vi_3~Pzt`=5(pz1IbfUs4lIv<iRoKg|9l%! 
zYPOu=!l>V6#h4`(npMQ`E_e<SJQ~b^i>mnaD%wOVhaHv1jPCNH3u{r>OrcgR3R^)T zTxH_2sLwUTu%5NeHZm9^Wk1Wofttdpk)t6vgh(5;&`Bc@KB{V5Z<OJ}E0SF)4oF6j zp^wGN+{HHu$38rnCB0&RoZPJ;qo+DPA1Y`?E*ux{;z@5aunuP9Y@uTc3(8NuRxdj; z+LKbKeFSoad_zeuN4n>sG^d3xb{<rk=H^ovb(jv8(BiWiI(WDS?w2X&A6yhh-9m#% z(9e->66$xF!wW>Ii50u@Z5uo}Q2_5K88&)be^3sj$UywG2ofQrwS>1B9k}11<kOkC zJW95ZEb{3jNjJ<Fq5rilPt;y@d&&yoM*8%FWd$<MwJ<5{=c_|_bs7`29}7Rt-`5Zm zyBOd#n?tl_mJv#KvsPScN*PRXC_W-`{EkGVpr0l<R^c58ogOF)$xXM6-kQX^a)2-D zIA}Y+M<l<r;r!jzrM$Co_1qcveQMNhe9`rt+u+_*AA^tcqPnoCEx{@f?%$(kU}X!v z30n`0a>S`SB#7M2jaE^?$;w5@VPv6xim+(3ZDE!;8COr=H@763SeC(PDI!w3<Z(}n ztcT_dZYxvG4o`sw&ub9^Do=^7Jm#sR^?sC%z*_c7D7QKCv1hbRw5`QLyD2;&eN_`F zbG$7tqljd*(&iZ!c$GmnU9@_r!f-u(R;l$Y)xw+MDc?JKDtruL3yep+_oBWi?KKaR znGcbm$ftYZ0fk8vo(r4enr{>lmh~o-#AuLKv4}<bT&yos737&J$!X>8if2>iuT40P zD&`rfl7UiDQ*F^P%KPR<bvMA!Besb>rjK5Aa;NFxpBJ`B?>TP3ooRS2sgWExZFUCx zihrWm1Zypbx_KwcaQhZkI7o$d`n(d-KDV=?3k-1GKIV))uH7*L;GU&q&LS!FULNh6 z)YG}((O2pN1EkBs-r2jRF_aEuuLWNWeb}ZvuUzVSo3^{CxB5pb2y$P|e+CPNJ1iJ5 zR_zDR`FCbAG)l&*g$u)L_mbf0*Fi!ea{~6^GPy+_n`gE1)DF3ib-_v$*PfP|jB?sl zd~(_^zi-=D3!Alv&BH^K_so-Rk)BMW^)<|=&72P58&jhxqe3x(S9iRNy8~9bE?WDi z;orIAqkU;xKcGV`fn#r(6y^fv7EaI%kMBm$iw3gr0iNa?Mr9vLqVrvWJHLrxXiYFJ zjCHH|yjBlHrl*+-K+++40r}!H(n1Llj8YSk*jCi9J5uZ~>z!~jsGnrsyhs+qGIXtf zXcqGYN$V4Lx+lHWMc*)8PZ4{34+(+0&ij5}J4y*BRiH80_e@RSH*a1=c?$nZWUtW8 z<mu3=OQO7s()Zc6j#4z!p}Z>kd~bOt;j}Ex4&vC5c|nSB3ElOoo}HbR-R_IEn$73E zM^A{_1QThQ=Nl+(uX!bd$hfIO{e5jSF&c}E$hn9Em6)WT%nsw<)fCn|(!*$2fiWy_ zcYuKZer*Da$lun#T%!Em;GdQA2it$MeiX+bSd?!W>i<;wC;NYk0sxABVSZB7|DUoS zB>Bzyp{xm(PCqI0TTTD8?gwdpvwi>;ZUFyo;GZP=Ef99u1XBysa|8Q(;{H#2e{jt= z>xX(~*frH(ssD?4zBT=(YW>MQ-=YBk)qh}rGts|2cC*W$9NO<)Mv#A6@DE1q_YVKd zh1IY$g8heEw&Fj{|C6uyt$6<Lu;1*(Z@vFMGydKn1y;midhLEqkiVh-%jEko9`RSx z{tdg)_-A+jtr7Zn*k8T=PxPN%<~N#@@NY_Qd(NNmKkMRec+TBlsd9UNqbP&;^Q_^( OrjQO;M)DK?5dS}-ws<lC literal 2507 zcma)83p7+~8y;ygMTjnP$utrn!;BaP^NU<^X@*3GT*ffN+?p974#%O83Xx-!ZVF>a z6e-Cq_u~@MMJ^G3_sfJd)W6j_%b&lq*1z}K-@D$mzxVm}`+o2DJR55*ym31S1c87g 
z&gh?%d4fULfIuL)5C|jz*dPi#kk0h?qSO5~gQ%2C@s)Njw8YH9mFLoJM^pT`@21X) zI)#DGb&>O^6KXA*E1`!?Ca!j0tdqq&9yzEnd>&_25?N;|v?(|tF=(tmZMk}L%WtKJ zZc4!wVY`wRPhQK5)W26$a3XqlWJhFfy*^(m()w*}!<g*LO9o1r58f22`pL;_hb~S! ztT>!^s}Q&BLC<bI^zx&<`G$ve`2-(fX}QLM$(XFCxfcq~&fQlV_x*hC!nX*iX>!G` z-S>s`SFlH17yJ!6ay%$~@6fl)Fn!@d*q<^s_o~+$6b6B^wt+x0YZ<dhL?W97|8BHb zQ!jJwp^c#=!*;}t7hBnclU^;vR3F==umwVL3L#&)?R*VYuPcA|)lq?@05vl8tiia! zPRa$Wa+z25=2}83XPRYrgI@dO8`ILOP=p(#w#TLNTptWtr86)vdAq1%oNqVYG`R9% zDnu>ZNe*P&W|Cc6ZR(#fztF-zUJ?<-jbn>$_=e9yc*-gcUCSz#a!xK)F0lCR+dFHS zU3p?!ij&CZLJ>}wiqt6HjLN|V1@8KI!c+Hs&FPgkdzWE?pl;M#U*XuzFr~QBre}Ik zekDUo(dKMq2wZ?JJ-@W$wDnBqhGSblE7wP9w5N+j*f?)!&Zdg`l?nya6GX=Df#(Z_ z=8GT%<jhs@=NzFD3AN-<UV8uQoiqNzp*q0|8YvlBQ%s$79n;=qD0%Z-ZPfM<Y+88% zCB=JtL|bUH$&Fiz>E&rHeI+qa<1FJc$_?8(#5KD?N{Xx8Y}n11QN1KdW$vb>R`>8| zgyWbib??VpkWBj*)$*EdNtj}7=`Y9RZCR`;v#8ioPuL5Q&2&;$n)^_f6tBD1Y3b7m zNGKj5xU?~@NA1JWAy*r8;Xa7QEX1Y(rOcT;p)JuM@8z*2$87oox1uD_(!+sMk<3Zj z2YvWw{vZxs?v9e7T#!z^;;`7RxY#P0i0d;6AmiN~q9=~8jAn;XZDT6RA<Nyxj|UFK zE%b)fRFt32TewR7BlE^vl{Y=3wrUNj>TVdEbPhHSiP-p}KaFcBNRjH$d4D|-oGXLS zPLzOG*)H=%GXK3hI8k=4K>t)~l+kO>O6uf3r5+h^)WyIjV1@>7Hds-sH3K=<xKY({ zc~LzV@kmp4a7r0DcU@u;H)|g;sO<!9)2n;u*4=d<_PVR+p*kM?F+91rzcyuWvt!1x zk7)C!mYR-&S)JFBY7z}iR6ViV3C|lePwp<_HPRerZk?>6PSknDHz#LYDm;MBQx(Kk z%|D=yFQkYEWDu>UlV2)ax!f%A^`iECj%~LfGuLA~*VVV1_hyguvz8Kf?XQ*s1HMVk zvs_5Pp6Z;s*u$MmhK`}kp={*JJujE+Sf%5gJ9Cb5d7`fpnrfXnP(>zQQL|&&^!&K6 zPnz$5tD<^Gd)xRI2FK1qC_g@l?0RM43&~O5n77o~g~nP;W1}WGgjOi(zTZTGxgWX$ zYcYbA+&Xhz1}#HT**&JZ0<NYcZLtWnqSgUkLfuNUxOga@bLTR`u|H7hy>jCY5q#{O z%TfU!o%4hL;*|De6?>T})MWQ_vb0gXcd6>=$zkCmoDb#6Z;K~+W&#H=_r<vCgWd*< zZTX*?hi2XXCI*1|04h9#c5zb$XmIrr1&G0Νpgnyce<X}KW=-~{LyB9ld?(~Mx+ znp!X@kw&0<lWD$2umHBtK@<#QY-mrUc(Q>REE1W)f&w8lmJy7^W-|=na2A0?q<XS6 z=?o$bc;iE7Qa#y#%k+gaJPH1uzC<`eOG_6Hthj~&`xz5@JeWcJe-P<DK4b#Xj7|uk z5^3yT0K<Qt0fQd*Wcm`>Mlkp~4S+$dyp3S?hrINVT8F%~brD2@4oaU0gTjps;XmbT z{C7crXp7OXNc=OPkQzWiWq&a3kDC6s!hEmqrB<|fm6q71ex6w`*ZRPYTbUXfxsGtj 
zdv^Gj`*BjId<p%Qt`RZd1esI~!DCEAe0SOVQ<;hB^-Gl`asEEAxl7uaHic*c_Tdf+ zuP_rARq{bld3NWqM<Z>~9(pmS7IZe>l(9D0-&kT5_2#K~UV_NWZ9Zg|TLPW_Ztvz- z?U*|_zl2A~IaH!J#=}r9aK;D+>Vmw+_5c3otkJ?}HU7TFiiLyGXH9sprVd?qwM5?< zS(y{cVJogi>(piNaJ<1ZkZTW+vov7wrgRz`*Z}v{ePg+;3QJz15umA(=kmRt9GUe# zAbHrT*qIdZhp2N4s-P=;4rPs!16vGi2`W<s-!Zu(`<CA;@R?{`jHi_wG})!rZ|viz zlk`pVt4Pb8eC6@sl<DZHp@Xmj*nB5ne<VocUw-B)R#7p}*=$R@VWy3=8TT^ta0#tj z4qNH_=}E(gm@#`Xd&mnJ;@mV3#!p;V(0q!_89|u)C;xSCs|x1p?>Z*_84NMM(={;F z{mDwrF>g=ec#QVXJXx`FgtW^SxlbYLxa*Fl4`Uzgz0s(&9|;RdIe<CpH0Mz?Bimje zZia2LTb?e&f7OL_w#q&SVXcKWY#09LKMtTs-><t7t7cvAtq)!=-akS?pt})sYr+2` z<!dOvKIv;HqTrv-{aWC!J9;h9b<_86|A+owkF&l@)^M;Qzw8;TH5ga`1Ofy15umjg HkX7$LMF$*| diff --git a/studies/Data/README.md b/studies/Data/README.md new file mode 100644 index 0000000..e69de29 diff --git a/studies/Data/isa.study.xlsx b/studies/Data/isa.study.xlsx new file mode 100644 index 0000000000000000000000000000000000000000..39686b0a2b8ac5b7eef038d817bc4949180470ea GIT binary patch literal 7491 zcmai3bzBr$)26#wI;3;yR6r0!y1P528>FPW1*IEFkxpr(yJSJS!KEAhR(aKXz4!gz z^9Mi9GS8XA%$%9oXB1^%;ZUHUppc*rRJpa|LG8#|&`?l^a8OXVkWY2Q?Co64>|6{~ zJsr%P^;kV@ZOW76>^j-8f-i)`*IA^|^i<WvJ9BFW@rzHPCOt%fq@d&1Z6J3Xh7u-< z5<^_(H+<k1ncYMRb<t8YgpvYjkxIM(<oO9kTY|>R^tQrSsvIQsmyY<(@5OKuSM3Cs zJoAInST=HF#)YM%%%ag1HwCpTR9U&Pu6Tj@p-#xCNf%u!fxwJtOA}%pOr@wI&5nWY zb+f_PizhI|IQ=L3b5%~zpGovY*PC>8&O#H&k%-*gh)ZgVpizSar7W4ZphG{nXUdMD zBoYN(BA}P9&-6v34zg_}e=<wNWz<vK!Fk23fnIOA4jYh5%)v{yH&X^}i+Fb{FI%Pa zj1R-<6x@$}<+JwH%O_~05K+?6BF;|ee558;S4`v#8}>m|8sg=F&q|`6fz-xVD}QvA z5!<AK>Se(rZ1l@-dSBB!>IBYgf+%+f^Mr6dUGKg(7GMtEh0}2pE?Xs0{EsJ~es}_6 z{9g+J@|%H^nT<2sPnmmf9cM|+??C=00|Nzx`b&oOyJUGnhkO@1nh4lieElZXP7*1n zpqV0nk&gRuJB=MlZFF9D;1_UsoCsR96=ICz{+7ky-mqZwZnW|_DH^bp7yv&w5<1d8 z_|?i0lL-|elp2xKAs;Pm)BH}BTDrKuHC78xvJb&4{xaLOPDt)Ga|~cWVbDsHadc`Y z%&k4<w4?&;a^3I08B|<1A~N4Z`}U4wg2%>A$9^aNz$ll(Li=`r!Bmzk{H5m<I6XvT zGlxlWY)emlw)s}5C{uDQC%Z1)5?iHMesxg0&6O0ShrBf!B_mq@t&ub}IDpz+3)$qB 
zbAq04%p2;bMgD0uScpYDY<`*Sp90AD0{>>&Ae#5;-!0n+u`Kal#oX<ko*UcSKmYy` zej3`kiyiAJ_zva$g|?fvG-JFo<4awg3)rBVI$0YjOFrAX3pEl+QluQjtnRY8Hy<VU zhA;TT6lA9m-(@pNV{OY-wt@lP+@Ko^tnPUoA-Gr+I-0pD_dPyR?kzGWADVbMK}Lt6 zBXMa?3<){$u<a8$l8Ltu2dDbq4(q!!Xqa8mOfJCgqAI+vMwu8<r$}X7^=lCWog)|o zK8)JFk`AKD3>ji0&Uj=BFQI4pAZ~U|L#EgHbC2_NmLNu{4jN{p`m&4Y%?P9LVN3Yc zCH~M_0=og_K|kd&(K`3F`0&SKBG03>-3R~O&=~gy`feI$OEWW<Kf`oQOWvgq;*B7D zC@8{z$o_zEe7EPTx$&4ZegMMq&13jgUj%QPM5&~FYk?AND~*ySy@T4&*WFBprs=X4 zS-xcE^pAENrZOWX=RzI?-oYPUAH-G+%kxH!T%_qPBzv0Q%!75?me&3JZakb*EKIoX zMruCPr``6)T!9YK&hjvqJ_gou$UxsU^kZ^hl;#rd1s)7}>>+}Rl6yl(x>eC`j7bA{ zns>4*Sh9hoxm*>aS-9#e?wsC}4fCIGuID~IejEx1N`2iVOuoSWA>Z{3-0e!|EsVie z3l0GsEo&2G6^;q+?M`p`fCD`(C(XWL4NS3pQdv>+-r8N3Sn37=F8m}63{I*v@fod} zKR3F-Z#8GagKUo7jn6(XoftP~x_#w$J(lTntK;~{>w6ZPRR(%sOiZ0&@s&_#79_mi zUsr6Em@@17z*}HiPDt+{^UHx|(M@fC?T7T$&rhrM5Wl@H5MY~Nz}ys!F{U+-VYYu# zBtN_5vs6H)JpFp0$hMJH9^|T2WDwD`4m}WPumL@Qo6=HzAY>PK<t!z?L_3gp;u(=x zq?nmHI+-pQeSGSz-_u{JsLzvCzZ^9Uv%%s^%)H>n=T{*hzXVqN?DW{azh7t7^O>x! 
zQWZ@SzcN<5psr>LMPUn+`Z^xF@eC?H^-D5j<5kpTnwN~o#$QnJ>0a_68(*U))4zNY z4v*bvC~nLb;7_gT1XoCnBoGiRVJIYWcTjx_H3J(&Zny(ohKMB8J^`x{OCgF``RAva zm{?x^1&Ed86sJ%!m=NSPbTWCnm;V?nP{{vGQZ&|dCtfp_=2SA7Jp4>@*EO<0!dNI_ zjN0>^g9g|kh`fJ-`aQYSc05oBXD|N$-6{o<_e|tfl8bLE0fkWZer<i;Rh}}GNnm>c zz$Dwd)L~<JnqGjJJS3rr0tq9bh=1QA3#ArFm<Xkg<!O0w&*c5FEynasCXYU|6ph_v zq0%j<!t&yliS6~`&XK4-wJejs{oA&P>Qogdgg2vdS5_oa>Kyn9BrG@A>^r0ll6|sM z5xn)P-AdrvXkJ?E>$1t|qRfur@JNU=<M5~6n@eSeOy_S^NBpLb8>Tv}?pzWvA9ow- z$|4qB`}tT~f)U+)ty!k4wA(p1GHt>tGOY+z3WVNe=MO&Dz1{rg*zECo2ES<44Ekgt zaJ9<6<qq>_rnoOs?eVx!Hjp%W4Cyie_pjgC(#XloRK>-~%Fe?1`^(QrR*KDK$NIPg zR^e{R^0=vN)J;MU^`cbb@K4z{306u=nt47b{^9PG$!w))<bLA()iD1F*k%6HU~Tp` z8<G@%n!pmB3L^BNhy0w?m;K4<SujQdN;Rp<h~XDiiVOnqpsaLo4RfONR@u}>=-$9S z8hkRYFLWV`1S?9}se+@^hB(y729XHfluxRB)*=k<W_DDH>~O{wJ>sjVTSm$T77s3| zmN&-<E^QtejRR?oXPuy2rW2IlhwS1paNVswFY&%yNQLK)*=zd`!8J0JKNXHk3JvJf zX3YQAj%B7n&d{%j2B?HXZJ+{-;S-QmPnp9^lt(&*cPDz~r0N#8xKT~7X7_L+``PZr zQee1S0qA>S`onW-nqK31AS$Bdiufi!Q8}ywV(4X{+aa@wBQWS!T;qN|@T=%*SLm{Q zE?KLDJ^Mn*Z%HJkvxmJYtkbBEZTbyr4V^N2SF*2_NqA>TQmm6nH;*T#Rm(A~Gkuvu zVGW$m;srE&y)-JJirh!rw`jsrw;E!2ACekGNqk16p@TEwlf7>~aTJ57G)pY63ga?} zif&cW>AXT?MyvVSx4GQ`rCxC$hc(g{0(*{0&z{uqlDTx?jq=x?uA*~9o<kG4HZ{eF zQ7!5sfFns7W93YCGA<Vv+F_?KP|Fi0_Nkp}w?`SAz+Wp_q1jlk%iE{KwCc7I&o5%W zsv5O&glsBe*{TRB$?27;P58C-UaOfFH|D566%4Qq`*1?21gA3DGvP0Dmz=9^>^^fg z@7%dqFAn;$)i_2Bdb1X<7{IcXvtlgYLB1o*VQnwq>vOZSD^Ys!D?0ycKj30%W^4AV zZMYBOgb^P4C`cRDO>y5BL;8w)u^&HwRIdXq1$z)L;2TrGmkLvz<E(5uqd~VxUaNrK zgE4CANqiG&Zb_zaB>>_4Br7a}OBXAw8_0-I-4I@VzX}sKyJyBHOi1c0^}K3Yoz=LJ zBUD<xFZk%=1^EMm?$TKWF>6v2R-{O6^Q#D3V*cY#aZlHhJ?+QXcx1&S0qEFf*miHY zfJ1w;6}c=#%UC27WZ^7RoUn{VI?1%&gwIw{(pHf&qX{wNgIFKvh{ec@JeMqwu$?W# zitpp50t8I7j)z^&5z;UX7aWrZzmtqGZ7;fQSFjk8$)8)vPrnU&bI8CKgKhhUeKalu z?XZ{|Jt8Owk!4a|xX|^Wq!y4{CBjQTt03&$*x2r%EVXfY>A<+nU!n4x?M#F7F0Dbo zRby{D`C1w3*hkY!E3aDrDE%QX;y^0#;-Xi*w))}2yJ>W9kla^*w`;IF@8R8t`YI!g zB%Vd|7Lb)&Q;Dy8@H^Y(J-6n)DLii!6ozm*5=&6IT5FGWN*s0`NZVu5_4;gp^n<8Y 
z<MzNhwHDwx<s;)3O#i7y$%MYRYDRL(2M*-9>v6ayp;53%EsPPb{f(N9FAkWz>HXyA zrG+Bu*axI6#6HXsN?#IoZkaOaY_g15Z&I)swD7;Uz1=?oTPS3v3w?Ecd{MZ-8`108 zI4kUPGKy(H?Dy?_ZLer)_lPRi?`%-x-Pg+#x!nTMy=IGdnZhrw?q>b&ZdGCJon2ms zVoHVuJ0fNgx@#e!PRaUXUNuXh!8`>Z1E7>)_xnYlKfj?Y>2j^K?z|!8494)FeyrR; z4kwuW6t;yd2*n|VW8O%&IUUrmd$rL$t||O5_N1X1iEMB4e1>!(vIy(=Y0(5~rKodO zK>awOgeGx67jK6q^|?P*PepV0mTYA_H#t^bVAu=|tP{hTD2N8GZO8?#V>-gt4NOAz z%yqZ<Qd7LMPDHf`N2Mo<PZ6l})+|aiRpJW;(Y7S{P{TVyUCF~!LKXAV5KW|vA@VFn z7kjag@n)Zt5mCH;+Be&A0%7k3s|M-x{X2`3xxxbk&kJ9Rgi)DfNwFXoeu>M+^c7Lc zX$Ip0muz*G0@3m@<JP5yTy<c$T!lxrk&H5qc`;w@&u=~@A<OV63p&T>1!QDX_V%`> zJrYdg6xz{&HhpDVRz1v;-)lV)Tj|fqSp1wjhE^2$6Re;)ytkXh^KgrhT&yFiVRt&~ zzSHHT$-)VtU4<l7X^$$aGn(_KVIu{u!Xna|v_3hpvqZ6<U1MJkjlbwyc29oU?P~`@ z!|J*AC!%0rciF*zYTPPWm7)AzlbKJM?_dCZM!OX^VY|F%RQuDKZ)(+69-bh9!7iX6 z=?ta_>EZ$Kc@sm-IY&B8irTQ5+KFG8b;S9x8H=;lpzaK3R#~#vD>n77ED9yGK<^DM zswDD0q8<S%7H_!yJEI2(lk!Y8-d)`2*$=biJE9*-gAgo^QK<q0q2!EGogQ^*4ynZk zbC|(M(^o6HYkT#qu#%YPZ5*!_&>>MY5{FJa=fDU~9eVcsY#P*Da9zWP?+Ig9a(YYS zlvqqtLX2~^dnK845NSL<RcV>D)ZIDY36DV!OKabyM{feRf<P%Gdz;kH*79N{la;fu zX+Of;Q*NtMusp6WI8B<0AOxC}e6NdOf6rV6yc?1Hto4Y)+tzjyfw)*`4enfb8lLBI zYTXd_?Ihf=1IqQ&gkmf*36{weXR$T~^EXbE&603LK%b;?VHzCgzyvH~Aus}QdBZDa z*P9dC*&1QtL9IRc-aPDcvyQ$J)KN~_`Jx9%@{fQB6Nnv$-Te6+ZISJYcCG1z4w6B- z+Rl<5tPL9Kx-uFk<!^QC%IX+$C!vNFlDs)bm#tn%(bvZ$?Oc}tGmtSq-bM1eF0s}m zbE9I!c^To~c!YY#sHu$(Om8-scln+$9<3mX#%PNvl$R8#ySV8ZK8Z`KP&cfX&fyb2 z<kpPUX4cRsY>n2jb@PCWUkbM7DO5IP`zW7tdyan*uSF**LGWoqm6(H`<s|2zj_lp4 z7j5&(xwQQBdl~e4J(gO%dhIWbX3d~tL72Qg3Sjf61)occtdY%>^PS7y9-jt=+IN@0 z8cl6Poni+sM}rE0!2$^!@T5|Ghp%*4zI>{z?uNDqWR;YvU#_w{)8bXXbhhxU`dc4& zj7sHXg_nH!EW`9briPQXn`qlOa5_V};9v?AojK{tUC&eIebmfuKd?97_V*49Ucg`9 z4bp+ZLJAoAe>yN1PaCrztye*ko_!iSAOw8D3|vxyMF)oaTU$k_7N;@sB)aYSOO;*2 z5>3X--GBuUaBy7b3(p5WzKhK|x#nuz$roW8;?^w%AZk2*CX3g!Zk@9FE^Q&rWJ<F{ znyJQ^@o^=>lIeqlUUIk>^IYcGSgMJut+i(fafNeSQfbV3T=3MdcoVd@Q^~BXyDUF7 zNGZpSU6@cYYIslp&5nezq=ub!UIZue)OruX*N0{X#{-iu1Pag>Q5@#sxwNxc!{j{5 
z#$?__Na_<VZV%`~CvBIk!oN6heS#l^kBeMsPYsWK)Md1yrEFzJqPnfFHRqEp5ISg4 z&{n#D{?M^+AVCz>HZE9ss2|RSsgXaMfJvG3Q4U`x!V{;t2>n|Qq$k1><0hRfc;3$& zaQhW=?1`71DCH|e1(CF1NbKGyJ#-l(w8!9it>nOiAe(g&yPuI0N)a_ecZxo>Xsyzz zmVUm*U>0upDi}Egd|px8L}xdqJ+}>He7I;O9xO&~5e03CO4ps({BEfP)lGwPdO&yn zhO@`vXmK}|l90Fyn#+asTRLy$Qna|9Yrk;LjVQrsV#I<};br6t5@XS9+ri+CUCfY| zT5gH6zEe#fowVQyX-h4c_cys;uQGlelKwn!m~_@>P(xN-c-fJmiq%Y9R*e~bpfg3B zxtir(8XI}0NABh9$8%vm3p~ajw>}WB22Fcqu-%Te-MOAyFCCeG3`K}Dqox>)iX<Br zmeY<HeH3`-`>p-F@j{&NCM&o7;`q7np*cs!i<8So$Hxpvvqv_r1RLXTF1DlFf)oM_ zIUlgFq{uq-upZctsPWTme>KBi_rNi|{!qR2GeG~n2{W}fQFO9*_>)Iiy<0U1hLDqV z5H0}=634&9j2s+(e2YkUVh>_R3kKidg?x!luV4rfQRm1f@zQ(`B?U6+7;piqCZol3 zB-~y1W2`>1RcIATz#PqDEJ^cOtHPlEw2aD;wLv9jFKTsHhlQE8L?Hb18QKRp0zv#$ ze;T_}zS@YKo<4j=8RSe9C9Zl@#{Q`}sNspZA}Ks$*z|#hS2-nR9UwzDbg$$TPhS?t z(QO$Z+dPLNDJ}t-A5RTh;a1+q;T$<n?K>MC3v`ETDIa?|Gyv0x?*aNF+9I)60W+3v z<Ht|7q~&F5d6ymD1h~S-DE8NfG>L7y*TIlTmBtW|5o2&r->}&pKvI&(+!`P^(6XbN ze?F!sTTh1cOeO4{Dm5n~q_KK<CSTh+ancdj-KZ2pA(cgK2NcL}n?|AX6BW$D!9+9e zv_#Gfci&inUEMJOucqW!brg{eC@lttvYdZbx_kfi75The1AOmf!f2Yco>(_$>)<7l ziP$3T{I=J(DulM4<Mk{4ZH3h%k8>|yp~1GTD+lDF%mjOaif^z#J$v6bo4j8Z)Q7li z4U&Ga|LQUmdndCWK2siq@RV3BCwTGT;zU(dtb$^=xAaVuN^q;@`?k+h=aO4#hnLm_ zQ)_bR?F0m>*L?VONC&Adq4)CDopBVXaQs)bC2KPh>(4k_s9%Z|!ZFtbJS^^r7xztc zOmwnc>9K~zr)_8_VN7x2q8E6YsJwfLA-zebpCi~fUT3<Uv#8v4nQ8lyPKi5@mI4=z z$QDhEi$CGB@?pmWsm(Ysl6;Oo4&V{7g7G73Y@3}5{F<RRRY{s84NOAuf!EvX6lEpW z%5plzVBx|CrCYOZQ;H?#YNV2x$eB*4Xtl#jQ~Ep7W~ac}Lx%VMEpit*u?s7EB>ZkW z(3hJ2YwE<O?z?@FLBb!&ccpb!1Yf*NmEi~~uXB}%?h6cfsC(sXPZJjEwSUI;>a6*| z918m~BYzP=Vd&=c$f}jb6Nk3m2oNe&6O-rSl|@%QTDTQ?Jzly0;HrMDKRXM&YOwi7 z9Y7J}7T1F~!v*3D%=<d<S2{B~LC(ID9W4a%e3ZV7;S<{6u}sv+t#UdVHp^2wT0Q-c zlrQq_Zow^s+i?Zwqx|h<ZQ|Kx3Esm%%!$+h^T{L}_5~EH82Gz|Hib+E<anZvBp$^v ze3-=~%)Q97qQF}EfGavcP0aX)6pjaaYH+cu;bDH2esx%dFkOeIA=MO=QmJWj;ows) zt8J0KJ0Ub3wS=zHjc5d1#4yAjsuIjv7o&VDUrc!}KkJ1codYb#IO*QHP~}|D4?@Iy zpPm@a&h9(Xteqp2(_Zk-G09_}F&`H$0p+S}EE{z-lsjhGT|3in44X=G8I*wCe3aB- 
z0mgD>Y=>;iBo`Bs!6PSv8I$?9FTykEVVGv#Hb|&ZlP$S4N>)eV0Es%lErML1jT;6J zrh^S+u3ed%xQxBTIV-_fB;)5<6vpeo!0!0$INepbHoQoTq$rvZfCO#Ej#Mcyqx_mA zZ{!$JQ3e_Y0QKJoEf9D7W&O+8<@W*q%$eWk1NZBDT8x0W_$R0Rccp*M3x1-YpelZ0 zevsP#pR(_?`u+N@tQ`_dKS=hUn*Mpa?=<`U`VLsR2mJd5|D@f20wLomh+2rAd)Qy2 z_J7{@J3D>9zN=@147MQW{Ee&rdG-60^(R~X6AcBW_7CiD?)taI{+;#ueToXwj~o1h z1N(i#|1#kbB#eF$5Wivn&Q$zXOb8hf{eu1GE&e><&u;Pe0n?BS4mrU2SMT^6`oFI1 z24VhwmFwTI`z!xh*MDow_$%x`RsK)(pC#ruS`jjD_@(q`!TA&ZXHxtPUnKbP@BTH| W6lLIlyfrLH|IrJHNDHFx;{OMsrJB$H literal 0 HcmV?d00001 diff --git a/studies/Data/protocols/.gitkeep b/studies/Data/protocols/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/studies/Data/protocols/DataOverviewAndPreprocessing.md b/studies/Data/protocols/DataOverviewAndPreprocessing.md new file mode 100644 index 0000000..0f48f0e --- /dev/null +++ b/studies/Data/protocols/DataOverviewAndPreprocessing.md @@ -0,0 +1,5 @@ +## Data overview and preprocessing + +The entire dataset consisted of 25 plant genomes, for 17 of which genome-wide ATAC-seq data was publicly available and for 21 of which genome-wide ChIP-seq (H3K4me3) data was publicly available (see Table 1 and Supplementary Table S2). A wide variety of tissues and treatments were used in these ATAC- and ChIP-seq experiments which are listed in Supplementary Table S3. The NGS data was downloaded from the sequence read archive (SRA) using the SRA-Toolkit 3.0.0 (https://github.com/ncbi/sra-tools/wiki/01.-Downloading-SRA-Toolkit). The reads were trimmed with Trimmomatic 0.36 (Bolger et al. 2014) and quality controlled using FastQC 0.11.9 (Andrews 2010) and MultiQC (Ewels et al. 2016). If the reads passed quality control, they were mapped to the reference genome using BWA 2.1 (Md et al. 2019). Conversion to bam files was performed using SamTools 1.6 (Danecek et al. 2021). The Picard Toolkit (Broad Institute ed 2019) was used to mark duplicates. 
The duplicates, unmapped reads, non-primary alignments and reads not passing platform quality checks were removed with SamTools. Plots for quality control were generated using deepTools 3.5.3 (Ramírez et al. 2016) and the necessary genome annotations were generated using Helixer v.0.3.1 (Stiehler et al. 2021, Holst et al. 2023). ATAC-seq data was deemed of high enough quality if the average coverage enrichment ±3 kbp around the TSS showed the expected peak and the average peak read coverage was at least 2.5 times the background coverage. The quality control for ChIP-seq data was performed using the same criteria. A detailed data preprocessing documentation is available at: https://github.com/weberlab-hhu/Predmoter/blob/main/docs/data_preprocessing.md. The plant genome fasta files and final NGS data bam files were converted to h5 files using Helixer (Stiehler et al. 2021, Holst et al. 2023). The ATAC-seq reads were shifted +4 bp on the positive strand and −5 bp on the negative strand to adjust the read start sites to represent the center of the transposon binding site (Buenrostro et al. 2013). A detailed documentation of the h5 file creation and architecture is available at: https://github.com/weberlab-hhu/Predmoter/blob/main/docs/h5_files.md. + +The species used in the development of Predmoter are separated into the four domains algae, mosses, monocots, and dicots. The availability and usage of the species dataset for ATAC- or ChIP-seq is indicated by a check mark. \ No newline at end of file diff --git a/studies/Data/protocols/FilteringFlaggedSequences.md b/studies/Data/protocols/FilteringFlaggedSequences.md new file mode 100644 index 0000000..0e4cdeb --- /dev/null +++ b/studies/Data/protocols/FilteringFlaggedSequences.md @@ -0,0 +1,3 @@ +## Filtering flagged sequences + +A naïve filtering approach was used to reduce the noise in the dataset. The ATAC-seq data showed high coverage for non-nuclear sequences.
The transposase cuts primarily open chromatin (Buenrostro et al. 2013) and as such also the chloroplast and mitochondrial genomes. When the organelles were not completely removed before the experiment, the data contained noise in the form of notably higher coverage in these regions. Unplaced scaffolds were also observed to contribute to this noise during the data quality control steps (Fig. 1a). Therefore, unplaced scaffolds and non-nuclear sequences were flagged during later development stages (see Section 2.2 and Tables 2 and 3). Assemblies on scaffold or contig level, *Bigelowiella natans, Eragrostis nindensis, Marchantia polymorpha, Oropetium thomaeum, Pyrus x bretschneiderii*, and *Spirodela polyrhiza*, were not flagged. The flagged sequences were filtered out (Fig. 1b). The information about the assembly accessions of the unplaced scaffolds and non-nuclear sequences was extracted from the sequence report jsonl files available at the NCBI’s RefSeq or GenBank and added to the h5 file (under “data/blacklist”) via add_blacklist.py in “side_scripts.” The flagged sequences reached around 7% of all genome assemblies used not counting assemblies on scaffold or contig level. \ No newline at end of file diff --git a/studies/Data/resources/.gitkeep b/studies/Data/resources/.gitkeep new file mode 100644 index 0000000..e69de29 -- GitLab