This section provides an overview of the ChIP-seq files used in the {Codebook, 2024} paper. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) assays were conducted for 393 transcription factors (TFs) during the course of this project. The data generated from these assays are available here. To learn more about how to work with each data type and access the files, please download "ChIP_Metadata.xlsx".

Within this directory, you will find:

1. Reads: Trimmed sequencing reads from all ChIP-seq experiments conducted in this project. The samples were sequenced in different batches at different times, and the reads are organized into sub-folders by sequencing batch. Depending on the sequencing method, reads are either single-end or paired-end. For additional information, consult "ChIP_Metadata.xlsx".

2. Genome Coverage Reads: Genome coverage tracks in "BigWig" format. The reads for each experiment were mapped to the genome using Bowtie2, and the coverage maps were generated using deepTools bamCoverage. For further details, refer to the Methods section in the paper.

3. Summits (Peaks): Peak files in "BED" format, giving the binding sites resulting from each experiment. These files represent the summits of the peaks identified by MACS2, indicating the most likely binding site of a TF. Refer to the Methods section in the paper for a comprehensive explanation. Each BED file has 5 columns: chromosome, start, end, peak_name, and peak_score.
   Note: some of the experiments are designated as "INPUTs", meaning that the library was sequenced without any immunoprecipitation and serves as a control. These data were used as the background set for peak calling and do not have their own peaks.

4. Narrow Peaks: These contain the same peaks as the summit files but retain the full region of each peak, corresponding to the peak width. Each narrowPeak file has 10 columns: chromosome, start, end, peak_name, peak_score, strand, signalValue, pValue, qValue, and peak_summit.

5. Merged Peaks: For each TF, we combined all the MACS2 peaks from experiments that showed good quality control or a high level of overlap between replicates into a single peak file. For a more comprehensive explanation, refer to the Methods section in the paper. Files are stored in "narrowPeak" format. Note that in these files the last column (peak_summit) is the absolute genomic position of the summit, not an offset relative to the peak start.

Please note that three major groups collaborated on the data processing aspects of this project:

- Toronto group (Hughes lab, University of Toronto)
- McGill group (Najafabadi lab, McGill University)
- Moscow group (Kulakovskiy lab, also known as the "GRECO" team)

You can find the data processed by each group in this directory.
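As an illustration of the 10-column narrowPeak layout described above, here is a minimal Python sketch for reading these files and recovering an absolute summit position. It is not part of the project's pipeline. The names `Peak`, `read_narrowpeak`, and `summit_is_absolute`, as well as the example rows, are hypothetical. The sketch assumes the files are tab-delimited and that the per-experiment narrowPeak files follow the standard convention in which the last column is a 0-based offset from the peak start, whereas the merged peak files store the absolute position, as noted above. Please check these assumptions against "ChIP_Metadata.xlsx" and the files themselves before relying on them.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class Peak(NamedTuple):
    chrom: str
    start: int
    end: int
    name: str
    score: float
    summit: int  # absolute genomic coordinate of the summit


def read_narrowpeak(handle: TextIO, summit_is_absolute: bool) -> Iterator[Peak]:
    """Yield peaks from a 10-column, tab-delimited narrowPeak file.

    summit_is_absolute=True  : last column is already a genomic position
                               (assumed for the merged peak files).
    summit_is_absolute=False : last column is an offset from `start`
                               (assumed for per-experiment narrowPeak files).
    """
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and optional header/track lines.
        if not row or row[0].startswith(("#", "track", "browser")):
            continue
        if len(row) != 10:
            raise ValueError(f"expected 10 columns, found {len(row)}: {row}")
        start = int(row[1])
        raw_summit = int(row[9])
        summit = raw_summit if summit_is_absolute else start + raw_summit
        yield Peak(row[0], start, int(row[2]), row[3], float(row[4]), summit)


# Made-up example rows, for illustration only (not real project data).
per_experiment = "chr1\t100\t300\tpeak_1\t250\t.\t12.5\t30.1\t27.4\t50\n"
merged = "chr1\t100\t300\tpeak_1\t250\t.\t12.5\t30.1\t27.4\t150\n"

a = next(read_narrowpeak(io.StringIO(per_experiment), summit_is_absolute=False))
b = next(read_narrowpeak(io.StringIO(merged), summit_is_absolute=True))
print(a.summit, b.summit)  # 150 150
```

To read a file from disk, pass an open text handle instead of the `io.StringIO` object, for example `open(path, newline="")`. Making the summit convention an explicit argument is a deliberate choice here, so that the two file types are not mixed up silently.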