FactorNet: a deep learning framework for predicting cell type specific transcription factor binding from nucleotide-resolution sequential data | bioRxiv
New Results
doi: https://doi.org/10.1101/151274

Abstract
Given the large numbers of transcription factors (TFs) and cell types, constraints on time and resources make it experimentally infeasible to query the binding profiles of all TF/cell type pairs. To address this issue, we developed a convolutional-recurrent neural network model, called FactorNet, to computationally impute the missing binding data. FactorNet trains on binding data from reference cell types to make accurate predictions on testing cell types by leveraging a variety of features, including genomic sequences, genome annotations, gene expression, and single-nucleotide resolution sequential signals, such as DNase I cleavage. To the best of our knowledge, this is the first deep learning method to study the rules governing TF binding at such a fine resolution. With FactorNet, a researcher can perform a single sequencing assay, such as DNase-seq, on a cell type and computationally impute dozens of TF binding profiles. This is an integral step for reconstructing the complex networks underlying gene regulation. While neural networks can be computationally expensive to train, we introduce several novel strategies to significantly reduce the training overhead. By visualizing the neural network models, we can interpret how the model predicts binding, which in turn reveals additional insights into regulatory grammar. We also investigate the variables that affect cross-cell type predictive performance to explain why the model performs better on some TF/cell type pairs than others, and offer insights for improving methods in this field. Our method ranked among the top four teams in the ENCODE-DREAM in vivo Transcription Factor Binding Site Prediction Challenge.
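To make the idea of combining genomic sequence with a single-nucleotide resolution signal concrete, the following minimal sketch stacks a one-hot encoded DNA window with a per-base signal track, such as DNase I cleavage counts, into one row per nucleotide. The abstract does not specify FactorNet's actual input encoding, so the function name `encode_window`, the A/C/G/T channel order, the handling of unknown bases, and the example values are all assumptions made for illustration.

```python
from typing import List, Sequence

# Channel order for the one-hot part; an illustrative choice, not taken from the paper.
BASES = "ACGT"


def encode_window(sequence: str, signal: Sequence[float]) -> List[List[float]]:
    """Return one row per nucleotide: four one-hot values plus one signal value.

    Hypothetical helper: the real FactorNet preprocessing may differ.
    """
    if len(sequence) != len(signal):
        raise ValueError("sequence and signal must have the same length")
    rows = []
    for base, value in zip(sequence.upper(), signal):
        # Bases outside A/C/G/T (for example N) become an all-zero one-hot vector.
        one_hot = [1.0 if base == b else 0.0 for b in BASES]
        rows.append(one_hot + [float(value)])
    return rows


# Made-up five-base window with made-up per-base signal values.
window = encode_window("ACGTN", [0.0, 2.0, 5.0, 1.0, 0.0])
for row in window:
    print(row)
```

A matrix of this shape (window length by number of channels) is the kind of input a convolutional-recurrent network could scan along the sequence axis; further per-base tracks would simply add columns.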
Copyright
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
It is made available under a CC-BY 4.0 International license.