From ed09846983acc36387ed23b708faafe82934ce4b Mon Sep 17 00:00:00 2001
From: Viktoria Petrova <vipet103@hhu.de>
Date: Mon, 20 Jan 2025 12:07:44 +0100
Subject: [PATCH] add architecture and models study and protocol

---
 studies/ArchitectureAndProposedModels/README.md  |   0
 .../ArchitectureAndProposedModels/isa.study.xlsx | Bin 0 -> 7510 bytes
 .../protocols/.gitkeep                           |   0
 .../ArchitectureAndProposedModelsProtocol.md     |  11 +++++++++++
 .../resources/.gitkeep                           |   0
 5 files changed, 11 insertions(+)
 create mode 100644 studies/ArchitectureAndProposedModels/README.md
 create mode 100644 studies/ArchitectureAndProposedModels/isa.study.xlsx
 create mode 100644 studies/ArchitectureAndProposedModels/protocols/.gitkeep
 create mode 100644 studies/ArchitectureAndProposedModels/protocols/ArchitectureAndProposedModelsProtocol.md
 create mode 100644 studies/ArchitectureAndProposedModels/resources/.gitkeep

diff --git a/studies/ArchitectureAndProposedModels/README.md b/studies/ArchitectureAndProposedModels/README.md
new file mode 100644
index 0000000..e69de29
diff --git a/studies/ArchitectureAndProposedModels/isa.study.xlsx b/studies/ArchitectureAndProposedModels/isa.study.xlsx
new file mode 100644
index 0000000000000000000000000000000000000000..252260d95ac6aa250af6a437a7cd09d076d1dcad
GIT binary patch
literal 7510
zcmai3Wmpv45(eo`>F!vhLtzC8>F)0C4k_u*6_joy1f)Y!K)Op70g*;r8YJ&3SN$&E
zckewv_Ten^o;lCVoSE5olw{%HQDI<UkYQT2cytoM9VptcFfe=YFfe$~Z*|2S9o)<v
z+zi#coy=YJA9&f@y#dKPba7yZo&qIS*<{f5)zl@r@@fYON{(PAy+i?I;DeX#U{748
zQWmOGBRtku{E%l^-NcIZvC=a{QiAEx%6#}J^ApVWgiU7|?M3l4xyTyNoe5muis2@&
zI0!9y7lfp<t>wjy3rkCz$6_e03+Ys-J>bFq&Ic$6cR@h|oxcAZ49JYNG9}T)QjRIs
z>Ky1^H6M&WeR7WkxBt*!uG$55kyKxFwOLR1I6R30nb^~Vq_nOW7A-_b+KP1pHoVO<
zOKuD`nK<MO5u<E%rY{z4kbNWNqj@qOv%dNk?h95;j0Uq+xS%`|PCka6nQ~Zrq^nB>
zxoX{~{Fp9Bkba!+eqXMA{6a>GkffX~6C8jiBen5*Vj{;laFj9WNN2l#pOf_srPr>l
z12I%b>_LS!%R>7&7-u*7*BPDlg2#3t)Z2skK-`b#+iy(-Swpwsbv=a3S4frq?FpEl
zo`4$v_d<mJW$0pV=gR&|=GI#W+0ye{(ErKagMmT&E%V@q<eQ{Uh4&ojA`oAR)r&L-
zDdgP37OH|z3_OoJ=p0DvV)MI$zd<4sM9^cck>Z@cZdeZP3=73>$Euu=p##cD@DT<_
z!bds=udSW2SkMr|X_2^`3eeNnEw1EfWlH*6<F)Tg^&$Esp5?gL1La?`#^Dbr4qA&c
zk4|kxcyz=al~zF9&iez`LrUsLMCP06Ute)f@Y*@(I&LNI8s|}2>Rb*mnaPnyKKGu2
zWQ1w1=dvh`ZRl&vwp;>>v!upybLum!a8yeaRt9z2-O0fEC>x_OvZ4()O=M}ILA0LQ
zD5jTO6O8<0zA(Qm@^`DjK`r8C_uFKD7eKic_$SMT(7o08Vc90AWl8=j=IQ8SYvSl=
z`{O12GPKQm4s3DA73$kl9S<ED=0sQK=X$!Qa3Qtza(2>I{PtI;>ZDR+$hk<_-Q{zy
zK1l5hp9(}M%1t3v=CH_MZ^~D-LhyZgz!#R--SfIY_;^$Xy16OO9ey&N4RRMhx<q*)
zW~ZTj2^lU-NqLEg%|m(8iPsMYr}|$H8+bBlnt!L8T!7m~Q+!*4Ix(U_mBzdh@J<YT
zf@m20FlO_+ObA_8*bqBO<|8u%NqsZQgxN2evc0a0J+9~3LYQT`=vdJj%Wk3<Bh12k
z?;^jS5e$7v;xMG%?WaB<UgbHL82(T~?7jbGoASRK8uQjbKTP9lWp3_vH%z~1E4cMR
zy%9nH14HyT*`E;3ANG7PHy)QRfRDI*@fcynAJNw?SsGMeBUq|qty$WvzgrjnvYW-o
zEJMyR+n?Nm@zIvkR93XqTo`5W6~f;6ZhXbC0$<F?X}aD*inqnZJVdX3X*D3=!pk+)
z(v;_Fq_(Xg{jxvqJ9sz!I3H{2LvS6ZEbLWdKNcruSsu|&@a~Y;4idOHr8j(}TMhlf
zge-`+Wh<wGEeBAR$6Ya+ji>S1lgoFqaendQeD34p$Kmkcw3p4o6bl?}1@6a?Zg&P>
zVNCuyNErV9vJPQ((U{QA_Vk7yB-q<(()<Q{V2b^t>SuM|jqPR0rEW0(X@I1m;bFBF
z0kd_>Vv`%fMoSg~*zUm7<hYIH(4-~H<66M|K(^1Lp7R5r|8abFIhfLfgf`Rinn-sR
zEc~^<zQj5?b=IBIS8!S$sK1-_ZC9)KqOQNLEn{U-yhb1C=4GKE`veo#x=@@6y+s_W
z<C9{A*$uy?LUNVqmjlK2O%D{n?#ji6QO&Ec1Hp!CumgCh?@D%o4#D4Dr4^Rw2a*rH
zqmql2veHH;GlXIfj(iPz`pc9Ic(WUpW2Wz|vH6p*E_m<<R0t|8L6jC<9y|8;>#le|
zmD5wMrUMD6U?&RcX{AyXy@SzMz0YAXgGNC6oE*hu1ucc{IWvmMH#7o<=e#H;=V&R6
z&!0ph;4~RYnD7S$(rUTD7ttaM28Bu*0Y$EMYmQ)M;6f;jwqVPVkbxZ&aGLQ{qF7aT
zztzIR_6aORs-mDef|137A~#`EC^~!s$KU|Kz+)-Vc(biUt$4a4sT7LHW2tTT=t4;o
zVA2?^ccqgi#3_WLe}eWch4f}30EoMj_&>MGArw6m`BfAWo5}zn>dx=2UuD%1Q@JFL
z4?cu+bDK6|EMLn9e<mLp7*!}~42=5cEebHLP|_5bHkPmL!!whAXIp~hpF$CPY$Y1M
z&PJp6h6dY*M>f9KhbLFE=E$mC67N5@#WY810O0)@)vNMikuuldk6>Z>xfcH+9kAS!
zt%}f%7ai7u=f?9gV%N*2N1tT3jD|<TT$zVI_FkN+FlD*kRPPIzJ#L)pw7zmn#(Las
zq$h_|d>-IuV+BF-^tWM~uGZ<`TFbJFsK~M=QY{2l<`fLt>RqniIJbDcoFOP)F^4@|
z2wtfUe0PQQD^uJSseSSU?Eq*RrGr)|{M*;>YGv$VZl>zyV(nn*`s3wirYJk6b6|g1
zI%Nhd5;i|pn*<e@6ue4E3JjFn&bNKgne_IRebFj-6krWPv=hx1-GdB$TJqdTe@w{|
zb{(Xqw)g7P{$QDUSsWh(v0UUc^uPK>UC7DKfk$Sshyywmy5Ulnp*0<t9%8Ml2smq_
zp;yzA7spsA{~A(W#TT%)s96WE_ZnA3b11;q%j<wut|>gisxix3F`mLk;A&FD%bPQs
z<x%#wcZwrnrK;g*vH!)&r3!x!js${}yeIzZt`C*8G%xliDP*riDP=T-9v%D+wHu3v
z)9B}^1UV4}b$qzTaN36&%F{9t1X{Wt(rNrpQDmOOYAnN~f?(&DOKZx9xDeI^^AAh9
zl2e2fFy+V}Wm~O@N2dbL>#PI1cw`O)hRYKtfs)_f*2^AZ=8&0%-V<Fe=q@T*Y-Tj&
z92BQT$vHCr2yv>uYKrQjieHKCnNEJ!2rM}9#z2T7c)v*#7nYkdUYKAd@q!|ZJkl-J
zoMfjrDO%qlrjc)lIyUu*)o2Jk!@Sg3Y6>OHvr;`+)4Z*h7@;_3+Iu#UG)kzH<Ma4A
z%$WmjSmEtgUx{Y*1qJdNtB0S1^f=xMV?Oc;jytXK3?6r2K;Nl9ZgH{=7M%4V*>EPR
zv<rO@N+8Z5`M?0X#fmaPd3u_c40p?f%PUc@&)&vX7Sauw%mj+iYmvODHHIYaL|w<w
zG^~5>;6|15O(M*CdV5POIk)j272n(WWcnH_bf1Brd^)A)DgEL1`+Tk-+I76Tf&&UW
z`QgkDB}H?5s~Mey+<u&li691p2E|4}UA;;q3F_Eh2aAUM7!yfgMxz>s6XYnH$i1Gt
zrS0%&_@&?t(yxI1uPuU`mASq7?@r=2u9HT18DpTGSvS>fs|;;5ZpD6n{w!#_+KP@~
zKKvV&pl=msx(C@g4#tBXlYG`ey}M(yw37s;GCWc&k;?c)^OFzY5Z&HCfb#$w6KNPB
zXnd{4!prHI@rwXTU(?R3rPo`J8#}|K7x+W=Kb%rf8g`e>DvH^VnLa>{*0J~=Wlth-
z@G(LBONzJS7(1_=m=rz+jyaCQD{jEh&TK^<8}TwWDHVAn+Y}cZbFpp;y)V(z71Z<<
z<g8dCti+H9l)7SZ3L>^rZ=&pH%dr#tcxdp0CR)cM&gO{dScVG^C_*cxqRcvqFFO=1
zhhz)pJ{M$MM!ec%;*Z0zf5kDHkcqxm!h;bN5`x4wsUTeBzFS&{pI0ry$2hAf?Ap}S
z;iMwHc6R2(yeUwjYRi7C$#s?9Xwa&;Go5m-0(0P}Wv!iGW3Zp`kPm4fjpWlOp9UR`
zy@yxR7`|ZnYkXh#P*1+StG0$}V@we5CyaMsYmeqqf6vfL`{f;vmYu2lzNn~7kqo3(
z;5X{6JvONcIQd{5uSxgw;{mcZQSGM9fmK><d|T>AChxEUr<$aa`VwlGDX1x(DDqYl
z@Jz#F;E>-jN4*R*ZZSFCW$|SUP?(niM%8l+NLz}v%@N6*5p`{tvFNU|joGYIu^YY<
zcy@XDbsu7>n3Vy%c7J?Yw7?hD>)kXf>~}bdWk?cmbMj@Ucxii|CO+VJP_y#->`;EY
zP;{rovNB8f+4rm2fU8S2I7e5vm*H4a5uwgV*+ib&$Y@h?fmq*Lq|xt*<D=lisK9;g
z7lB=TMP2&dy~?KRf{ZH^(~I`8N+SilP>ML*JMs`zr&P{)W4)FPaKGO7weE2(;fL{u
zjV;LJJL@MiWDC*7*azap6KGYUuGv8i<3y5LB>micom#Xff!IA2E!`V(Rf#+l*!jT`
zGjwn+Ovj>NI{5Y>H~7x!D0>eGDfv_P?UpkwiLQDPwPIY=o)~^5fbwhe7|}GzZ&bva
zQWQgtl|*_{dq+g77Dr)P$eBYF+01T^VqxPgeyJm(_xtHzZ6*jtycMb*WH1QqDoNpv
z3=%pidMOe?W120^hEnt`p#aNYL^-zwf(KZ#*If!mFThGzl^$}}y~ph?JhF*woO!^9
z_2TRNx;QC$rdN5$31%;TW)5|4Z)^G^p>!_bmM*N>3;Xh#VYY%^n~C_UKrZGITb?+2
zQIwBxLKX<V9+tL|mSK6=`!vIz3^sj7%b>}k3E;LONKMA8+WMI8L_A`o&|O$WMvLAr
zH-45le$hSt`Ox^YzGcsp=iUAeV07%B^FU%MCJwhP0&$a8sp?FXw_2?HD*U?x7&AJp
zcuAXYdPa3Ve(_JM-pIc%L}<8;UkEzBSB!j039xNuiaX)Vpi5OBHdj9kD7T3^IWT8)
z)gIKF;mR&g(SE_Mam}V!N)PZ|<E8;o^bz+6(y;l$f4wrMM4VJ$sr7x&gOT$vTcI=d
zp$r(&@&Jt{I2cCWIL+nJd#xe$_)t#sdoqkQN}f7CJ)a+tTI8=CtQ0aJQ#Fx<PuOx|
zhNcZYwLP8&w-lb&@)LO9b0|H!q;p9wp(`c9J>LE<1=@`^8K0`M0xflS4R|A9GQ!b2
zzSn0oh5w954V1eC^|QZw_BrbTS5foVC<|}-jV_@#34NjIGBku?uw)cF?+L%|Sg1m_
zqf(x>?sNLu+pi;%lmNfLpXf~^@IFqfAHunugdcW7Jr_?Z!6uhvn@n{TYge>*<wD&e
z1y2m{1HBQZ!*vZ#!Zray5J}!NzF>8~IHaGg6&4=U-cjhy$2l?Y>?=hZ<)WW2rbJeF
z1VEfX>fGxVDBx_5?oe`Q%^-4;3enSXmGXMfsHveRt9kh5wO)OBJyYH!%&;QJmuqy{
z`k6FiLmX)9ycCd$g7x7lTEKnjL2U{T8fJozF~NmbxNn@g`sl#)db7oQ|3l{e&q$(i
zI%0}%N{cnzJoJp7B&1hp7&XY`@(b_rXvOQWYHAj>#_HO8c)=$wh1&2IshF{UP{_SJ
zAvjIcW{{F3{J5q@!pX>Xn7dm~Ub*5!-}3oHMq&D`EJlMqTb+J`&bKD>7I2Bsz5G5Z
zK+DGkzcb71k@eJ*t+U=9zec9I$}>Q%mJX6`iIb1BVFkY70x3M;uu5Z#zie3H%~W~)
z1${Bt8kA=6Ms<7UolnEk@xs%Zn?9a6)vCz~AB8uwOw$8dnl3gTqV3~==}eiz-6?Qv
z)}%jA18=$SehY`=z|MU8uNrq-ybuHf4LqPt8632LVf?!(bMv+{|JjWdg7l%?7=9S!
zloha~3WouR479P1QY%Sk;Z63~36w5Bha;X$l)r!oBI4q@&ljBxe5j1iK0N1c+A0uX
zAL7v~!$;D5{8a9K^QukiN@e;&y6KcwsSHc43G?GB#3eJzq+Sa6XY<?^IM`~*E3I|M
zNeM-B+|uc+`rHV#FZhylHq*$hZQfgbY?M|>7&|qkVb=7b0+{a$V@nUa>OKoi;jQx>
zL}&=l3QYv0oC+3Vd_r}aN8r}Uc@QD*RX!$L86{;v^l5Xz02Z`ax`Oa**Zm1W2mu~S
zl_M<z&i;Gj&)O>1=A>$y8rpMyIfCJXmWAzQ3m6Zb`v#Ik(d-jKRfhWE-B_9gatK*e
z$R6eLcOgD;nTs;G<V1cVEIDr4#dhD<wh^yiDc6x?*@ar6LR1J@`yQ#oE9Hl7V?>Ua
zyf2lVcoF5YPvgI4=7v+nj4&KwOntIZ?NZM;`NCu#Y4jo#B@A*>QP<4iFs3uN31EKs
z$yy>*jKVSo)(DNEJGrHDsT9palX`kUZ~lU-$7%o5c04r^$$MCCH?o@yzN)2I34Ql|
z;oJ*R!lUG<1?i%*=x3xRqB-`1p=;Y%Vb8Tal4t#=nm@Q`BM{M-S+Ray=Xtrpd_5$y
zIIssg?lY_<uPHj~%v8f}At|rHirv+nBFS3G_AHB!KGvu3aSh--wU`AQ5RBXGO4NX-
zeKOfE$J(#lPi&U<Ek1-JCYaMw4aP*14-3odM2$WQzVg56IB7bSAiBuT>o`5I72dPp
z%zSou_UPb%33+zk&Yf^={MG4ZY<q}ekP#Op8(XTJQ_q84#}Rb_y3K2IoK-Jev-7r^
zrC$O1&rO(_qp6aMqtjh#VdZMYFcivHGC=7CY-k++7BhBo`uW2m>4_tl13eUSaX;)^
zY(@oBn1}{v0jZDHTNr7uY3G0&KrID5ku&M)ydQJrk-cIoFbQiko4GXI?@Ki%?Z;&_
z&g?ZBF-KABt9op#^d&-J@u%o*@PtAHD}i(lNBnhBxjlUZ%(5t1sLI?8Xw3anb1=gb
zbH&p4P2e&H8eim=mUn`UJTQDxQoa4zoJTih0qpafN@RG1<N>_3=tUd(ABJ<~y>+ha
zbS*KQ&ZYev<<aq3M*Md%9?=(zy$G7I@)$pOvLT}&TgSKT^eV_5Ax^2kA*@+!)3g2_
zsdQN!Avp;qC+!8h{Vp^miOhY$&kMHdtPz-x>&elVB|BD)s8plnVum(W504e<S|<)W
z6S|v}<EW&wX&nH91?|(QGy$SQ*|=EfCS6u2S&^P=pW#-vOd%_&xz?S<<O9l|g2UNP
z7L~8wUcaE2mv2PqolF`{x6v2t=4u^0LpBxrL_fdjb5o7j-gB_}U0_pjW#8+>#~(P@
zzVY1&r8q0ek+9+h?w4n8Q%}Rom~I28%f3L<56)j*X6oo-{?lhFgHY-c``zLFL`Y<^
zh6HwD3H)nD7HVbq6^pO-wrO)It@Oi7Uxd>5@)#Wi1#7<e3Fwjy(wxEW6ll06iF2Ze
zII|VZkMx6mh!+z(s8xgR>vU#JE%(v1_cuy6)$f_q!A9fiEysdnn_eZ8=SefW`y$c1
zCTcP%I@Gce`&czhakN?L<|GTW71C4Tp%dGqi*XAiEvoEwPLSD+lOQYP2IAsBN)tx`
z+Tz%4RS?t;g;s;KNE=yziNWWat5oHsETu(nrcSfR;-?R71#OtrYgjPpM&T8k5W?oy
z9c?Pz3b}j|ZM{HRB8x>AW0K$Yjxoquj$nKN9nP#6=YsC5OrJ;JhBr@W2#m%lxiW`U
zP1k>W&3%w#gsFMGK#H=G`}p)cb>RGy2X`RyrNV?mFyp%4`B|+&k}s~Jg>gnmvR5XZ
zwUDgkX}0`m{&|h@N8tm<56Mr2mo^R8@6-WQAsz{Rs59K4&cM2@1Ak>WW0T|^yExFp
zpwCD7+ZX|n-F>!+TKN?&XQLJcYG-TlHc)}c&DDZOCXe%H+z$$uXLZTP>!tS}24hX6
z1zAjja5)xGt>X}`7TOiFm{1alJ3+ikV+8j;fv|R>kBfur7=yku;Mc~DuSw&2VWfqY
zxEt*iR2$SpR0uP4dK=M9!6=uRr4$W{b6anU^j!hbb=8yJm#xJj;vq#K_0W`J)wvlL
zSo>osa0l2dgy`;KJ15BW)`zR)dA9*cc0N8ao}K;bO!wsk@eTbc-yDkq&N1sj(Gozu
z+Rmy;PgA9HmczX(<HD%9ERRVU(9KUx8xdq8Z_d8QzD#;LAr(4uD3m!_aQQ4UlkpzQ
z%<D!;bz1Tzw<f8Y7+e5xC*++F_o7MT;O=y&q3pRkYcscrj|5j01e<jHB%8`)6%gE=
zn3JHlBHw;L8WR*lH-axopSdMn2FQGKPMSY*0R7bodk-Jxzt3Kv?)c04he^zz1OA>f
ze~b)n*N?Oq1$FT+n)^SM-W?(QLczdP{Kov`x&N24A58o0`k|}?8cRQU_g|X+e!Cyc
z`|bJx_<Rfa=L_Ct;(r056Dz1%sGeKcU!(T_-uDM#eY<|BXNJzYpy&Jtef{g|w<+r`
zf&B{&1Ec;o>>oP&kHv0Jxl4roIYkxu=MDZwi~Tv_|CtaTO0NATApXE^72jnl{wQYs
zE9?(%arc0`-Qu4EHlZ0Ds@L^*@AwD$|E?SfB?Et#>p!sDE8nf_e>7J87548ce;0kX
w#QZ^<LPrq4mHsL?cj0%F;ve`P;qOrSH3U+UMfmyFu%Z1&FEk>3h<}Lx3#tc~8UO$Q

literal 0
HcmV?d00001

diff --git a/studies/ArchitectureAndProposedModels/protocols/.gitkeep b/studies/ArchitectureAndProposedModels/protocols/.gitkeep
new file mode 100644
index 0000000..e69de29
diff --git a/studies/ArchitectureAndProposedModels/protocols/ArchitectureAndProposedModelsProtocol.md b/studies/ArchitectureAndProposedModels/protocols/ArchitectureAndProposedModelsProtocol.md
new file mode 100644
index 0000000..f390838
--- /dev/null
+++ b/studies/ArchitectureAndProposedModels/protocols/ArchitectureAndProposedModelsProtocol.md
@@ -0,0 +1,11 @@
+## Architecture and proposed models
+
+The model architectures were implemented using PyTorch Lightning (Falcon 2019) on top of PyTorch (Paszke et al. 2019). The models were trained with supervised learning, a method that learns a mapping from inputs to outputs based on example input–output pairs (Russell and Norvig 2016).
+
+The input for the model was a genomic DNA sequence. The nucleotides were encoded into four-dimensional vectors (see Supplementary Table S1). The DNA sequence of a given plant species was cut into subsequences of 21 384 bp. This number was large enough to contain typical gene lengths of plants while being divisible by ten of the integers from 1 to 20. An easily divisible subsequence length is a requirement for Predmoter (see Supplementary Section S1.2). As the lengths of few chromosomes, scaffolds or contigs were exact multiples of 21 384 bp, sequence ends as well as short sequences were padded with the vector [0., 0., 0., 0.]. Padded base pairs were masked during training. Subsequences containing only N bases, referred to here as “gap subsequences,” were filtered out. Both strands, plus and minus, were used. Because the ATAC- and ChIP-seq data were PCR amplified, it was not possible to determine from which strand a read originated, so the coverage information was always added to both strands. The model’s predictions for either ATAC-seq, ChIP-seq or both were compared to the experimental read coverage. The target data were represented per sample of experimental data. These were averaged beforehand, resulting in one coverage track per NGS dataset and plant species.
+
+Three main model architectures were evaluated for their performance. The first architecture consisted of convolutional layers followed by transposed convolutional layers for deconvolution (LeCun et al. 1989, LeCun and Bengio 1995). The deconvolution was added to output base-wise predictions. We refer here to this architecture as U-Net. To ensure that the new sequence lengths resulting from a convolution or deconvolution were correct, custom padding formulas were used (Supplementary Section S1.2). Our second approach was a hybrid network. A block of long short-term memory (LSTM) layers (Hochreiter and Schmidhuber 1997) was placed between a convolutional layer block and a transposed convolutional layer block. The final approach was called bi-hybrid. Its architecture matched the hybrid architecture, except that the LSTM layers were replaced with bidirectional LSTM (BiLSTM) layers (Hochreiter and Schmidhuber 1997, Schuster and Paliwal 1997). In all three approaches, each convolutional and transposed convolutional layer was followed by the ReLU activation function (Glorot et al. 2011). Additional augmentations to the bi-hybrid network included adding batch normalization after each convolutional and transposed convolutional layer and adding a dropout layer after each BiLSTM layer except the last (Fig. 2). The Adam algorithm was used as the optimization method (Kingma and Ba 2014). The network’s base-wise predictions can be smoothed via a postprocessing step that applies a rolling mean of a given window size.
+
+We examined 10 different model setups (Table 2). The best model of each architecture and dataset combination was used to develop the next combination test. The model reaching the highest Pearson’s correlation on the validation set was deemed the best model. Pre-tests showed that including gap subsequences (subsequences of 21 384 bp containing only Ns) led to a considerably lower Pearson’s correlation. The proportion of gap subsequences in the total data was 0.6%. Normalizing the NGS coverage data through a general approach of subtracting the average coverage from the dataset and applying a ReLU transformation (Glorot et al. 2011) showed notably worse results during previous attempts. Normalizing via an input sample was not feasible because few of the ATAC-seq experiments were accompanied by input samples. Therefore, the target data were not adjusted for sequencing depth. For more information about the training process see Supplementary Section S1.3.
+
+All models excluded gap subsequences (subsequences of 21 384 bp containing only Ns). For more details on species selection and exact model parameters see Supplementary Table S4. Models that excluded subsequences of unplaced scaffolds and non-nuclear sequences during training and testing are denoted with *.
\ No newline at end of file
diff --git a/studies/ArchitectureAndProposedModels/resources/.gitkeep b/studies/ArchitectureAndProposedModels/resources/.gitkeep
new file mode 100644
index 0000000..e69de29
-- 
GitLab