The package of JRIM is available at http://page.amss.ac.cn/shihua.zhang/software.html.
JRIM is mainly based on Cicero package (Pliner et al., 2018) and JGL package (Danaher etal., 2014).
If you have any problems, comments or suggestions, please contact us at Kangning Dong (dongkangning16@mails.ucas.edu.cn) or Shihua Zhang (zsh@amss.ac.cn).
The main purpose of JRIM is to joint reconstruct cis-regulatory interaction networks of multiple cell populations using single-cell chromatin accessibility data and explore shared and specific interaction patterns.
JRIM could generate comparable cis-regulatory interaction networks of different cell populations.
JRIM could identify common and specific interactions through comparing reconstructed networks.
JRIM adopts a aggregation process for each cell population to deal with the sparsity of single-cell chromatin accessibility.
JRIM can estimate the main parameters, which control sparsity and similarity of resulting networks, though a data-driven parameter selection process.
library(JGL)
library(cicero)
library(JRIM)
In our paper, we use the pubilc single-cell ATAC-seq dataset of 13 adult mouse tissues generated by (Cusanovich et al., 2018), which contain ~100,000 cells and ~400,000 potential regulatory elements. Here, we use the single-cell ATAC-seq data of chromosome 6 of heart, kidney, liver and thymus as our toy data, which have been randomly sampled 500 cells for each tissue.
#Toy data
data(heart_data)
data(kidney_data)
data(liver_data)
data(thymus_data)
dim(heart_data)
#> [1] 108676 3
head(heart_data)
#> # A tibble: 6 x 3
#> peaks cells counts
#> <chr> <chr> <dbl>
#> 1 chr6_3000412_30007~ ATTACTCGAGTTGAATCAGTAATGATCGATAGAGGC_HeartA_6~ 1
#> 2 chr6_3000412_30007~ CGGCTATGTAGTAAGCCGCCAGTACTTGGGCTCTGA_HeartA_6~ 2
#> 3 chr6_3070003_30702~ CGGCTATGGAACTCGACTGTAATGATCGCAGGACGT_HeartA_6~ 1
#> 4 chr6_3071181_30714~ CGCTCATTAACGAATTCGGTTGGATCTTCCTATCCT_HeartA_6~ 1
#> 5 chr6_3082708_30829~ ATTACTCGGATCTTCGCATGCCAGTTGCAGGCGAAG_HeartA_6~ 2
#> 6 chr6_3083485_30837~ GAGATTCCGCTTGAAGAGCGCCTCTTATTAATCTTA_HeartA_6~ 2
total_cells = heart_data$cells[!duplicated(heart_data$cells)]
message("Number of heart cells: ",length(total_cells))
#> Number of heart cells: 500
The first step of JRIM is grouping simsilar cells per tissue to deal with sparsity of single cell data. Same as Cicero, JRIM holds data of the CellDataSet (CDS) class, which is derived from the Bioconductor ExpressionSet class.
set.seed(2019)
tissue_list = c('Heart', 'Kidney', 'Liver', 'Thymus')
df_list = list(heart_data, kidney_data, liver_data, thymus_data)
K = length(tissue_list)
cds_list = list()
for (it in 1:K){
message("-----Per tissue process for ", tissue_list[[it]])
temp_df = df_list[[it]]
temp_df = as.data.frame(temp_df)
#generate binary peak by cell matrix
atac_cds = make_atac_cds(temp_df, binarize = TRUE)
atac_cds = detectGenes(atac_cds)
atac_cds = estimateSizeFactors(atac_cds)
#reduce dimension by t-SNE
atac_cds = reduceDimension(atac_cds, max_components = 2, reduction_method ="tSNE",PCA_dim=50,norm_method='none')
tsne_coords = t(reducedDimA(atac_cds))
row.names(tsne_coords) = row.names(pData(atac_cds))
#create CDS objects
JRIM_cds = make_JRIM_cds(atac_cds, reduced_coordinates = tsne_coords)
cds_list[[it]] = JRIM_cds
}
#> -----Per tissue process for Heart
#> Sample cells randomly.
#> Overlap QC metrics:
#> Cells per bin: 50
#> Maximum shared cells bin-bin: 39
#> Mean shared cells bin-bin: 5.21932773109244
#> Median shared cells bin-bin: 0
#> Removing 890 outliers
#> -----Per tissue process for Kidney
#> Sample cells randomly.
#> Overlap QC metrics:
#> Cells per bin: 50
#> Maximum shared cells bin-bin: 39
#> Mean shared cells bin-bin: 5.29636591478697
#> Median shared cells bin-bin: 0
#> Removing 765 outliers
#> -----Per tissue process for Liver
#> Sample cells randomly.
#> Overlap QC metrics:
#> Cells per bin: 50
#> Maximum shared cells bin-bin: 39
#> Mean shared cells bin-bin: 5.08450704225352
#> Median shared cells bin-bin: 0
#> Removing 634 outliers
#> -----Per tissue process for Thymus
#> Sample cells randomly.
#> Overlap QC metrics:
#> Cells per bin: 50
#> Maximum shared cells bin-bin: 39
#> Mean shared cells bin-bin: 5.1978021978022
#> Median shared cells bin-bin: 0
#> Removing 553 outliers
JRIM could estimate the main parameters \(\lambda_1\) and \(\lambda_2\), which control sparsity and similarity of resulting networks.
#Chromosome length of mouse
data(mouse.mm9.genome)
head(mouse.mm9.genome)
#> V1 V2
#> 1 chr1 197195432
#> 2 chr2 181748087
#> 3 chr3 159599783
#> 4 chr4 155630120
#> 5 chr5 152537259
#> 6 chr6 149517037
sample_genome <- subset(mouse.mm9.genome, V1 == "chr6")
window <- 500000
#it will cost a few minutes
#sparsity_constrain, long_dist_constrain and similarity_constrain are requirements of sparsity and similarity of resulting networks.
parameters = estimate_JRIM_parameters(cds_list,weights="sample.size",sample_num=10,
genomic_coords=sample_genome,sparsity_constrain=0.1,
long_dist_constrain=0.05,similarity_constrain=0.6)
#> Finding w1...
#> ------win:211, -----62------w1:0.0496826171875
#> ------win:449, -----32------w1:0.1031494140625
#> ------win:456, -----34------w1:0.0206298828125
#> ------win:541, -----64------w1:0.0679931640625
#> ------win:84, -----34------w1:0.0455322265625
#> ------win:475, -----31------w1:0.0513916015625
#> ------win:392, -----38------w1:0.0450439453125
#> ------win:374, -----24------w1:0.0404052734375
#> ------win:28, -----21------w1:0.0689697265625
#> ------win:558, -----49------w1:0.0491943359375
#> ---Estimated w1:0.05419921875
#> Finding w2...
#> ------win:123, -----26------w2:0.276866455078125
#> ------win:458, -----23------w2:0.217650146484375
#> ------win:99, -----59------w2:0.698389892578125
#> ------win:215, -----41------w2:0.700806884765625
#> ------win:122, -----45------w2:0.507447509765625
#> ------win:577, -----43------w2:0.640382080078125
#> ------win:512, -----72------w2:0.657059326171875
#> ------win:228, -----34------w2:0.191063232421875
#> ------win:597, -----62------w2:0.565938720703125
#> ------win:484, -----118------w2:0.665760498046875
#> ---Estimated w2:0.512136474609375
lambda1 = parameters$lambda1
lambda2 = parameters$lambda2
message("Estimated lambda1 = ",lambda1)
#> Estimated lambda1 = 0.0264418219327927
message("Estimated lambda2 = ",lambda2)
#> Estimated lambda2 = 0.0392548870350664
JRIM aports joint graphical lasso model to jointly estimate cis-regulatory interaction networks of multiple tissues in local windows.
#it will cost a few minutes
local_networks <- generate_JRIM_models(cds_list, lambda1=lambda1, lambda2=lambda2,
weights="sample.size",window = 500000,verbose=TRUE,
genomic_coords = sample_genome)
#> Dealing with NO.50 window.
#> Dealing with NO.100 window.
#> Dealing with NO.150 window.
#> Dealing with NO.200 window.
#> Dealing with NO.250 window.
#> Dealing with NO.300 window.
#> Dealing with NO.350 window.
#> Dealing with NO.400 window.
#> Dealing with NO.450 window.
#> Dealing with NO.500 window.
#> Dealing with NO.550 window.
local_networks[[1]]
#> [1] "Zero or one element in range"
local_networks[[22]]
#>
#> Number of connected nodes: 37
#> Number of subnetworks in each class: 1 1 1 1
#> Number of edges in each class: 113 103 109 82
#> Number of edges shared by all classes: 15
#> L1 norm of off-diagonal elements of classes' Thetas: 23.05991 15.905 15.62043 16.66947
dim(local_networks[[22]]$theta[[1]])
#> [1] 39 39
head(local_networks[[22]]$theta[[1]])
#> chr6_5263954_5265978 chr6_5282204_5283247
#> chr6_5263954_5265978 7.712883 0.0000000
#> chr6_5282204_5283247 0.000000 1.8618390
#> chr6_5289117_5290024 0.000000 0.0000000
#> chr6_5311830_5312453 0.000000 0.0000000
#> chr6_5332528_5334546 0.000000 -0.7788795
#> chr6_5346399_5347106 0.000000 0.0000000
#> chr6_5289117_5290024 chr6_5311830_5312453
#> chr6_5263954_5265978 0.000000 0.00000
#> chr6_5282204_5283247 0.000000 0.00000
#> chr6_5289117_5290024 9.538669 0.00000
#> chr6_5311830_5312453 0.000000 3.62367
#> chr6_5332528_5334546 0.000000 0.00000
#> chr6_5346399_5347106 0.000000 0.00000
#> chr6_5332528_5334546 chr6_5346399_5347106
#> chr6_5263954_5265978 0.0000000 0.000000
#> chr6_5282204_5283247 -0.7788795 0.000000
#> chr6_5289117_5290024 0.0000000 0.000000
#> chr6_5311830_5312453 0.0000000 0.000000
#> chr6_5332528_5334546 1.4354306 0.000000
#> chr6_5346399_5347106 0.0000000 9.024217
#> chr6_5351080_5352064 chr6_5353140_5353719
#> chr6_5263954_5265978 0 0
#> chr6_5282204_5283247 0 0
#> chr6_5289117_5290024 0 0
#> chr6_5311830_5312453 0 0
#> chr6_5332528_5334546 0 0
#> chr6_5346399_5347106 0 0
#> chr6_5358420_5358841 chr6_5360640_5361749
#> chr6_5263954_5265978 0 0.0000000
#> chr6_5282204_5283247 0 0.0000000
#> chr6_5289117_5290024 0 0.0000000
#> chr6_5311830_5312453 0 -0.0157514
#> chr6_5332528_5334546 0 0.0000000
#> chr6_5346399_5347106 0 0.0000000
#> chr6_5363067_5364135 chr6_5386580_5386960
#> chr6_5263954_5265978 0 0
#> chr6_5282204_5283247 0 0
#> chr6_5289117_5290024 0 0
#> chr6_5311830_5312453 0 0
#> chr6_5332528_5334546 0 0
#> chr6_5346399_5347106 0 0
#> chr6_5391375_5391921 chr6_5392355_5393831
#> chr6_5263954_5265978 0 0.00000000
#> chr6_5282204_5283247 0 0.28004329
#> chr6_5289117_5290024 0 0.00000000
#> chr6_5311830_5312453 0 -0.09560891
#> chr6_5332528_5334546 0 0.05373979
#> chr6_5346399_5347106 0 0.00000000
#> chr6_5397431_5398172 chr6_5398485_5400608
#> chr6_5263954_5265978 0 0.0000000000
#> chr6_5282204_5283247 0 0.0000000000
#> chr6_5289117_5290024 0 -0.0004240701
#> chr6_5311830_5312453 0 0.0000000000
#> chr6_5332528_5334546 0 0.0000000000
#> chr6_5346399_5347106 0 0.0000000000
#> chr6_5400999_5401201 chr6_5411498_5411881
#> chr6_5263954_5265978 0 0
#> chr6_5282204_5283247 0 0
#> chr6_5289117_5290024 0 0
#> chr6_5311830_5312453 0 0
#> chr6_5332528_5334546 0 0
#> chr6_5346399_5347106 0 0
#> chr6_5420429_5421350 chr6_5432300_5432559
#> chr6_5263954_5265978 0 0
#> chr6_5282204_5283247 0 0
#> chr6_5289117_5290024 0 0
#> chr6_5311830_5312453 0 0
#> chr6_5332528_5334546 0 0
#> chr6_5346399_5347106 0 0
#> chr6_5434782_5435614 chr6_5436263_5437299
#> chr6_5263954_5265978 0.00000000 0
#> chr6_5282204_5283247 0.03608084 0
#> chr6_5289117_5290024 0.00000000 0
#> chr6_5311830_5312453 0.00000000 0
#> chr6_5332528_5334546 0.09668346 0
#> chr6_5346399_5347106 0.00000000 0
#> chr6_5445844_5447422 chr6_5448569_5449230
#> chr6_5263954_5265978 0.10980815 0.0000000
#> chr6_5282204_5283247 -0.12122034 0.1459228
#> chr6_5289117_5290024 0.00000000 0.0000000
#> chr6_5311830_5312453 -0.03984976 0.0000000
#> chr6_5332528_5334546 -0.02190746 0.0000000
#> chr6_5346399_5347106 0.00000000 0.0000000
#> chr6_5457774_5458178 chr6_5459032_5460453
#> chr6_5263954_5265978 0 -0.01673183
#> chr6_5282204_5283247 0 0.00000000
#> chr6_5289117_5290024 0 0.00000000
#> chr6_5311830_5312453 0 0.04286780
#> chr6_5332528_5334546 0 -0.02701025
#> chr6_5346399_5347106 0 -0.24249820
#> chr6_5500178_5500405 chr6_5539334_5540199
#> chr6_5263954_5265978 0 0.0000000
#> chr6_5282204_5283247 0 0.0000000
#> chr6_5289117_5290024 0 0.0000000
#> chr6_5311830_5312453 0 -0.2124658
#> chr6_5332528_5334546 0 0.0000000
#> chr6_5346399_5347106 0 0.0000000
#> chr6_5582696_5584647 chr6_5589560_5590365
#> chr6_5263954_5265978 0.00000000 0.0000000
#> chr6_5282204_5283247 0.00000000 0.0000000
#> chr6_5289117_5290024 0.00000000 0.0000000
#> chr6_5311830_5312453 -0.04375603 0.0000000
#> chr6_5332528_5334546 0.00000000 0.1249342
#> chr6_5346399_5347106 0.00000000 0.0000000
#> chr6_5595082_5595833 chr6_5596078_5596286
#> chr6_5263954_5265978 0 0
#> chr6_5282204_5283247 0 0
#> chr6_5289117_5290024 0 0
#> chr6_5311830_5312453 0 0
#> chr6_5332528_5334546 0 0
#> chr6_5346399_5347106 0 0
#> chr6_5604644_5605005 chr6_5613350_5614014
#> chr6_5263954_5265978 0 0
#> chr6_5282204_5283247 0 0
#> chr6_5289117_5290024 0 0
#> chr6_5311830_5312453 0 0
#> chr6_5332528_5334546 0 0
#> chr6_5346399_5347106 0 0
#> chr6_5675385_5677068 chr6_5689801_5690050
#> chr6_5263954_5265978 0.0000000 0
#> chr6_5282204_5283247 0.0000000 0
#> chr6_5289117_5290024 0.0000000 0
#> chr6_5311830_5312453 0.0000000 0
#> chr6_5332528_5334546 -0.1724417 0
#> chr6_5346399_5347106 0.0000000 0
#> chr6_5700102_5700641 chr6_5701814_5702830
#> chr6_5263954_5265978 0 0
#> chr6_5282204_5283247 0 0
#> chr6_5289117_5290024 0 0
#> chr6_5311830_5312453 0 0
#> chr6_5332528_5334546 0 0
#> chr6_5346399_5347106 0 0
#> chr6_5720284_5720703
#> chr6_5263954_5265978 0.00000000
#> chr6_5282204_5283247 0.00000000
#> chr6_5289117_5290024 0.00000000
#> chr6_5311830_5312453 0.03317775
#> chr6_5332528_5334546 0.04735615
#> chr6_5346399_5347106 0.00000000
JRIM assembles local interaction maps to generate genome-wide cis-regulatory interaction networks.
all_cons <- assemble_connections_JRIM(local_networks,K=4,verbose=TRUE)
#> [1] "Successful JGL cicero models: 586"
#> [1] "Other models: "
#>
#> Zero or one element in range
#> 13
#> [1] "Models with errors: 0"
dim(all_cons[[1]])
#> [1] 73144 3
head(all_cons[[1]])
#> Peak1 Peak2 coaccess
#> 118 chr6_3230226_3230576 chr6_3447965_3449030 -0.050466194
#> 170 chr6_3290027_3290443 chr6_3290601_3291724 0.002641269
#> 208 chr6_3290601_3291724 chr6_3290027_3290443 0.002641269
#> 227 chr6_3290601_3291724 chr6_3447965_3449030 -0.031529016
#> 250 chr6_3293877_3294683 chr6_3295202_3295779 0.002839968
#> 266 chr6_3293877_3294683 chr6_3447965_3449030 -0.003402168
Visualizing network of single tissue can use the plot_connections function of Cicero package.
#plot single network
data("mm9_gene_annotation_chr6")
cicero::plot_connections(all_cons[[1]], chr="chr6", minbp=142000000,maxbp=142750000,gene_model = mm9_gene_annotation_chr6, collapseTranscripts = "longest")
message("Plotting network of ",tissue_list[1])
#> Plotting network of Heart
plot_multiple_connections is a modification function of plot_connections function. It can campare multiple networks using the Gviz framework.
#plot multiple networks
plot_multiple_connections(all_cons, K=4, chr="chr6", minbp=142000000,maxbp=142750000,coaccess_cutoff=c(0.1,0.1,0.1,0.1),fontsize=6,gene_model = mm9_gene_annotation_chr6,collapseTranscripts = "longest",gene_model_size=0.8)
plot_multiple_connections could show the interactions related to a peak using the viewpoint parameter. Take the promoter of Gys2 gene (peak: ‘chr6_142421080_142422257’) as an example.
#plot multiple networks related to viewpoint
plot_multiple_connections(all_cons, K=4, chr="chr6", minbp=142000000,maxbp=142750000,coaccess_cutoff=0,fontsize=6,gene_model = mm9_gene_annotation_chr6,collapseTranscripts = "longest",gene_model_size=0.8,viewpoint = 'chr6_142421080_142422257')
#> Warning in plot_multiple_connections(all_cons, K = 4, chr = "chr6", minbp =
#> 1.42e+08, : No connections in viewpoint in tissue 1
#> Warning in plot_multiple_connections(all_cons, K = 4, chr = "chr6", minbp =
#> 1.42e+08, : No connections above coaccess_cutoff in tissue 1
#> Warning in max(sub[[i]]$height, na.rm = TRUE): max里所有的参数都不存在;回
#> 覆-Inf
#> Warning in plot_multiple_connections(all_cons, K = 4, chr = "chr6", minbp =
#> 1.42e+08, : connection_ymax calc failed in tissue 1
#> Warning in plot_multiple_connections(all_cons, K = 4, chr = "chr6", minbp =
#> 1.42e+08, : No connections in viewpoint in tissue 4
#> Warning in plot_multiple_connections(all_cons, K = 4, chr = "chr6", minbp =
#> 1.42e+08, : No connections above coaccess_cutoff in tissue 4
#> Warning in max(sub[[i]]$height, na.rm = TRUE): max里所有的参数都不存在;回
#> 覆-Inf
#> Warning in plot_multiple_connections(all_cons, K = 4, chr = "chr6", minbp =
#> 1.42e+08, : connection_ymax calc failed in tissue 4
#> Warning in plotBedpe(sub[[plot_it]], chrom = chr, chromstart = minbp,
#> chromend = maxbp, : Nothing to plot
#> Warning in plotBedpe(sub[[plot_it]], chrom = chr, chromstart = minbp,
#> chromend = maxbp, : Nothing to plot
message("Plotting tissues from top to bottom are ")
#> Plotting tissues from top to bottom are
for (i in 1:4){
message(tissue_list[i])
}
#> Heart
#> Kidney
#> Liver
#> Thymus
We can see that JRIM identify the liver-specific interaction related to Gys2 promoter and the long-range kidney-specific enhancer-promoter interaction.
Pliner, H.A., Packer, J.S., McFaline-Figueroa, J.L., Cusanovich, D.A., Daza, R.M., Aghamirzaie, D., Srivatsan, S., Qiu, X., Jackson, D. and Minkina, A. (2018) Cicero predicts cis-regulatory DNA Interactions from single-cell chromatin accessibility data. Mol. Cell, 71, 858-871.
Danaher, P., Wang, P. and Witten, D.M. (2014) The joint graphical lasso for inverse covariance estimation across multiple classes. J. R. Stat. Soc. Ser. B-Stat. Methodol., 76, 373-397.
Cusanovich, D.A., Hill, A.J., Aghamirzaie, D., Daza, R.M., Pliner, H.A., Berletch, J.B., Filippova, G.N., Huang, X., Christiansen, L. and DeWitt, W.S. (2018) A single-cell atlas of in vivo mammalian chromatin accessibility. Cell, 174, 1309-1324.