Single-Cell Data Integration Methods | Comprehensive Integration of Single-Cell Data

Published: 2024/7/5

Code: https://satijalab.org/seurat/

Underlying algorithms

CCA

Canonical Correlation Analysis | R Data Analysis Examples
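
For readers who have not met CCA before, here is a minimal sketch of the textbook form in Python with NumPy: given two sets of measurements on the same samples, it returns the canonical correlations. The function name `canonical_correlations` and the toy data are my own, assumed for illustration. This is not Seurat's code, and the variant the paper uses for single-cell data may differ.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between X (n x p) and Y (n x q).

    Rows are the same n samples in both matrices. Assumes both
    matrices have full column rank after centering.
    """
    # Center each variable so that correlations are well defined.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered data.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    # Clip tiny floating-point overshoot above 1.
    return np.clip(s, 0.0, 1.0)

# Toy data: Y is an invertible linear mix of X, so both views carry
# the same information and every canonical correlation is about 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = X @ rng.normal(size=(3, 3))
print(canonical_correlations(X, Y))
```

The QR-then-SVD route is used here only because it avoids inverting covariance matrices explicitly.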

MNN

The Mutual Nearest Neighbor Method in Functional Nonparametric Regression

Comprehensive Integration of Single-Cell Data

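I really did not expect the integration method from Seurat v3 to be published in Cell itself.

Sure enough: a leading lab + a frontier field = unlimited possibilities.

The bioRxiv preprint was posted on November 02, 2018, and the paper was formally published in Cell on June 06, 2019.

The idea behind the method was probably already in place by the end of 2017, when I had only just started working on single cell.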

Single-cell transcriptomics has transformed our ability to characterize cell states, but deep biological understanding requires more than a taxonomic listing of clusters.

As new methods arise to measure distinct cellular modalities, a key analytical challenge is to integrate these datasets to better understand cellular identity and function.

Here, we develop a strategy to “anchor” diverse datasets together, enabling us to integrate single-cell measurements not only across scRNA-seq technologies, but also across different modalities.

After demonstrating improvement over existing methods for integrating scRNA-seq data, we anchor scRNA-seq experiments with scATAC-seq to explore chromatin differences in closely related interneuron subsets and project protein expression measurements onto a bone marrow atlas to characterize lymphocyte populations.

Lastly, we harmonize in situ gene expression and scRNA-seq datasets, allowing transcriptome-wide imputation of spatial gene expression patterns.

Our work presents a strategy for the assembly of harmonized references and transfer of information across datasets.

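Highlight 1: an anchoring approach that integrates many kinds of data, across platforms and across modalities.

Highlight 2: it can also integrate scATAC-seq data.

Highlight 3: analysis of spatial gene expression patterns.

Major single-cell breakthroughs to date: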

immunophenotype (Stoeckius et al., 2017; Peterson et al., 2017),
genome sequence (Navin et al., 2011; Vitak et al., 2017),
lineage origins (Raj et al., 2018; Spanjaard et al., 2018; Alemany et al., 2018),
DNA methylation landscape (Luo et al., 2018; Kelsey et al., 2017),
chromatin accessibility (Cao et al., 2018; Lake et al., 2018; Preissl et al., 2018),
spatial positioning

The two major questions in single-cell data integration:

how can disparate single-cell datasets, produced across individuals, technologies, and modalities be harmonized into a single reference

once a reference has been constructed, how can its data and meta-data improve the analysis of new experiments?

These questions are well suited to established fields in statistical learning.

The second question is analogous to reference assembly (Li et al., 2010) and mapping (Langmead et al., 2009) for genomic DNA sequences.

identify shared subpopulations across datasets

canonical correlation analysis (CCA)
mutual nearest neighbors (MNNs)
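
To make the MNN idea concrete, here is a minimal sketch in Python with NumPy: a cell in dataset A and a cell in dataset B form a pair only if each is among the other's k nearest neighbors. The function name `mutual_nearest_neighbors`, the Euclidean distance, and the toy coordinates are my own assumptions for illustration; the paper searches for such pairs in a shared low-dimensional space and then filters and scores them, which this sketch does not do.

```python
import numpy as np

def mutual_nearest_neighbors(A, B, k=1):
    """Return (i, j) pairs where A[i] and B[j] are mutual k-nearest neighbors.

    A is (n_a x d) and B is (n_b x d), both already in the same space.
    Brute-force distances, so only suitable for small examples.
    """
    # Pairwise squared Euclidean distances, shape (n_a, n_b).
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    # For each cell in A, the indices of its k nearest cells in B.
    nn_ab = np.argsort(d, axis=1)[:, :k]
    # For each cell in B, the indices of its k nearest cells in A.
    nn_ba = np.argsort(d, axis=0)[:k, :].T
    pairs = []
    for i in range(A.shape[0]):
        for j in nn_ab[i]:
            # Keep the pair only if the relationship holds both ways.
            if i in nn_ba[j]:
                pairs.append((i, int(j)))
    return pairs

# Toy data: B[2] has no counterpart in A, so it ends up unpaired.
A = np.array([[0.0, 0.0], [10.0, 10.0]])
B = np.array([[0.1, 0.0], [10.0, 10.1], [4.0, 4.0]])
print(mutual_nearest_neighbors(A, B, k=1))
```

Requiring the neighbor relationship in both directions is what lets a population present in only one dataset stay unmatched instead of being forced onto its closest neighbor.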

Problems for the second kind of integration:

only a subset of cell types are shared across datasets
significant technical variation masks shared biological signal.

This paper addresses three problems:

reference assembly
transfer learning for transcriptomic, epigenomic, proteomic,
spatially resolved single-cell data

Core idea

Through the identification of cell pairwise correspondences between single cells across datasets, termed “anchors,” we can transform datasets into a shared space, even in the presence of extensive technical and/or biological differences.

This enables the construction of harmonized atlases at the tissue or organismal scale, as well as effective transfer of discrete or continuous data from a reference onto a query dataset.
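
As a rough illustration of transferring discrete data from a reference onto a query, here is a sketch in Python with NumPy. Each anchor carries the label of its reference cell, and every query cell takes a weighted vote over the anchors, with closer anchors counting more. The function name `transfer_labels`, the Gaussian weighting, and the `sigma` parameter are assumptions of mine, not the weighting scheme from the paper.

```python
import numpy as np

def transfer_labels(query, anchors, ref_labels, sigma=1.0):
    """Predict a label for every query cell from anchor pairs.

    query      : (n_query x d) coordinates of the query cells
    anchors    : list of (query_index, reference_index) pairs
    ref_labels : label of each reference cell, indexed by reference_index
    sigma      : width of the (assumed) Gaussian weighting kernel
    """
    anchor_q = np.array([q for q, _ in anchors])
    anchor_labels = np.array([ref_labels[r] for _, r in anchors])
    classes = sorted(set(anchor_labels.tolist()))
    # Squared distance from each query cell to each anchor's query-side cell.
    d2 = ((query[:, None, :] - query[anchor_q][None, :, :]) ** 2).sum(axis=2)
    # Nearby anchors get weights close to 1, distant ones close to 0.
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    # Sum the anchor weights per class, then normalize each row to sum to 1.
    scores = np.stack(
        [w[:, anchor_labels == c].sum(axis=1) for c in classes], axis=1
    )
    scores = scores / np.maximum(scores.sum(axis=1, keepdims=True), 1e-300)
    predicted = [classes[i] for i in scores.argmax(axis=1)]
    return predicted, scores

# Toy data: two query clusters, one anchor in each.
query = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.1, 5.0]])
anchors = [(0, 0), (2, 1)]   # (query cell, reference cell)
ref_labels = ["T", "B"]      # hypothetical reference annotations
predicted, scores = transfer_labels(query, anchors, ref_labels)
print(predicted)
```

Continuous data could be handled in the same spirit by replacing the vote with a weighted average of the reference values.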

Some single-cell basics

false negatives (“drop-outs”) due to transcript abundance and protocol-specific biases

expression derived from fluorescence in situ hybridization (FISH) exhibits probe-specific noise due to sequence specificity and background binding

Results

Identifying Anchor Correspondences across Single-Cell Datasets

Basic assumption: we assume that there are correspondences between datasets and that at least a subset of cells represent a shared biological state.

Constructing Integrated Atlases at the Scale of Organs and Organisms

Evaluates how accurately different tools integrate data across platforms and across subtypes.

Leveraging Anchor Correspondences to Classify Cell States

Moves on to integrating case and control samples, and cell states.

Projecting Cellular States across Modalities

Integrating scATAC-seq

Transferring Continuous and Multimodal Data across Experiments

Predicting Protein Expression in Human Bone Marrow Cells

CITE-seq, predicting protein expression

Spatial Mapping of Single-Cell Sequencing Data in the Mouse Cortex

Spatial mapping in the mouse cortex

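What's my problem?

I realized long ago that this was an important and valuable problem, but I was working alone. I never really distilled the problem, never thought about it or understood it in depth, and never tried to bring statistical thinking to bear on it.

The leading labs clearly saw the value of this problem early on. They gathered people to discuss it and think it through, proposed a systematic solution grounded in statistics, and, on the strength of their ability and reputation, published the result in the very top journal.

What is holding me back and keeping me going in circles?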

Reposted from: https://www.cnblogs.com/leezx/p/11244731.html
