
AlphaGenome

VERIFIED

## Connections

genomics deepmind variant-scoring dna-prediction

## What It Does

AlphaGenome predicts 11 output types from a DNA sequence:

  • ATAC — chromatin accessibility (ATAC-seq)
  • CAGE — promoter activity / TSS
  • DNASE — DNase hypersensitivity
  • RNA_SEQ — gene expression levels
  • CHIP_HISTONE — histone modification marks
  • CHIP_TF — transcription factor binding
  • SPLICE_SITES — splice site predictions
  • SPLICE_SITE_USAGE — splice site usage levels
  • SPLICE_JUNCTIONS — splice junction predictions
  • CONTACT_MAPS — 3D chromatin contact maps
  • PROCAP — nascent transcription

## How to Use

### Installation


        pip install alphagenome
        

### API Key

Use a Google AI Studio key (the same one as for Gemini) and set it as the ALPHAGENOME_API_KEY environment variable.
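
A minimal sketch for reading the key with a clear failure message when the variable is missing. The helper name `load_api_key` is hypothetical and not part of the alphagenome package; only the variable name comes from these notes.

```python
import os


def load_api_key(env_var="ALPHAGENOME_API_KEY"):
    """Return the API key from the environment, or fail with a readable error.

    `load_api_key` is an illustrative helper, not an alphagenome function.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before creating the client")
    return key
```

The result can be passed to the client constructor used in the scoring example, e.g. `dna_client.create(load_api_key())`.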

### Python — Score a Variant


        import os
        from alphagenome.models import dna_client, variant_scorers
        from alphagenome.data import genome
        
        # Create the client; the key is read from the environment
        dna_model = dna_client.create(os.environ['ALPHAGENOME_API_KEY'])
        
        # Define variant (GRCh38, genomic strand)
        variant = genome.Variant(
            chromosome='chr15',
            position=43600551,  # STRC E1659A (GRCh38)
            reference_bases='A',  # VEP confirmed
            alternate_bases='C',  # c.4976A>C at genomic level
        )
        
        # 100KB window (131,072 bp) centered on the variant
        seq_len = dna_client.SUPPORTED_SEQUENCE_LENGTHS['SEQUENCE_LENGTH_100KB']
        interval = variant.reference_interval.resize(seq_len)
        
        # Score the variant with every recommended scorer
        scores = dna_model.score_variant(
            interval=interval,
            variant=variant,
            variant_scorers=list(variant_scorers.RECOMMENDED_VARIANT_SCORERS.values()),
        )
        df = variant_scorers.tidy_scores(scores)
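
To find the strongest effects in the tidy table, one option is to group high-quantile rows by output type. The sketch below works on plain row dicts (for example `df.to_dict("records")`) so it runs without the API. It assumes the table has `output_type` and `quantile_score` columns and that quantile scores can be negative, so it thresholds on the absolute value; check both assumptions against your own `df.columns`. The function name `top_effects` is hypothetical.

```python
from collections import defaultdict


def top_effects(rows, min_abs_quantile=0.98):
    """Group rows with |quantile_score| above the threshold by output type.

    `rows` is an iterable of dicts; the column names `output_type` and
    `quantile_score` are assumptions about the tidy table layout.
    """
    hits = defaultdict(list)
    for row in rows:
        q = row.get("quantile_score")
        if q is None or q != q:  # skip missing values (None or NaN)
            continue
        if abs(q) > min_abs_quantile:
            hits[row["output_type"]].append(row)
    # Strongest effects first within each output type
    for output_rows in hits.values():
        output_rows.sort(key=lambda r: abs(r["quantile_score"]), reverse=True)
    return dict(hits)
```

Usage: `top_effects(df.to_dict("records"))` returns a dict keyed by output type, which makes it easy to see which assays carry the high-quantile rows.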
        

### Sequence Lengths

  • 16KB (16,384 bp) — fast, limited context
  • 100KB (131,072 bp) — good balance
  • 500KB (524,288 bp) — wide context
  • 1MB (1,048,576 bp) — full context, slower
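
The labels are rounded; the actual lengths are powers of two. The sketch below maps each label to its exact length and picks the smallest window that fully contains a region of interest around a variant. The helper names and the centering rule are illustrative assumptions, not AlphaGenome's API (the library's own `resize` may place the window a base differently), and any coordinates passed in are up to the caller.

```python
# Window labels from the list above, mapped to their exact lengths in bp.
SEQUENCE_LENGTHS_BP = {
    "16KB": 2**14,   # 16,384
    "100KB": 2**17,  # 131,072
    "500KB": 2**19,  # 524,288
    "1MB": 2**20,    # 1,048,576
}


def centered_window(position, length_bp):
    """Return a 0-based half-open (start, end) window around a 1-based position.

    Assumed centering rule for illustration only. The start is not clipped
    at 0, so positions near a chromosome start give a negative start.
    """
    start = (position - 1) - length_bp // 2
    return start, start + length_bp


def smallest_window_covering(position, region_start, region_end):
    """Return the label of the smallest window containing the whole region.

    Region coordinates are 0-based half-open. Returns None if even the
    largest window does not cover the region.
    """
    by_length = sorted(SEQUENCE_LENGTHS_BP.items(), key=lambda kv: kv[1])
    for label, length_bp in by_length:
        start, end = centered_window(position, length_bp)
        if start <= region_start and region_end <= end:
            return label
    return None
```

For a gene-level question, passing the gene's start and end as the region gives a quick check of whether a shorter, faster window is enough before falling back to 1MB.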

### Organisms

  • Homo sapiens, Mus musculus

## Verified Status

VERIFIED — tested 2026-04-08 with STRC E1659A. Got 11,854 variant scores across all output types and 19 scorers. API key works.

## STRC Research Usage

  • Scored E1659A variant across all tissues
  • Top effects in CHIP_HISTONE (heart tissues, quantile >0.98)
  • Splice junction effects moderate (quantile ~0.99 for the nearby gene FRMD5)
  • STRC itself does not appear in the gene annotations of the results (possibly a complication of the pseudogene region)

## Critical Limitations for STRC

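  • No inner ear / cochlear tissue data: the training data come from public consortia (ENCODE, GTEx, 4D Nucleome, FANTOM5), not ENCODE alone, and include no cochlear tracks
  • STRC is on the minus strand: complement the bases when converting cDNA→genomic (A↔T, C↔G), and reverse-complement any multi-base allele
  • Pseudogene region: STRC/STRCP1 may confuse gene-level annotations
  • Gene not found: with the 100KB window, STRC is missing from the gene annotations; the 1MB window may be needed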
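
The strand conversion above can be sketched as a small helper. The function names are hypothetical; the logic is the standard reverse complement, applied only when the gene sits on the minus strand.

```python
# Translation table for DNA base complements (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def cdna_to_genomic_alleles(ref, alt, gene_strand):
    """Convert cDNA-level ref/alt alleles to the genomic (plus) strand.

    Plus-strand genes are returned unchanged; minus-strand genes are
    reverse-complemented. Illustrative helper, not an alphagenome function.
    """
    if gene_strand == "+":
        return ref, alt
    if gene_strand == "-":
        return reverse_complement(ref), reverse_complement(alt)
    raise ValueError(f"gene_strand must be '+' or '-', got {gene_strand!r}")
```

For a minus-strand gene, `cdna_to_genomic_alleles("A", "C", "-")` gives `("T", "G")`. It is worth checking the converted reference allele against the genome (for example with VEP) before scoring.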

## Results (April 2026)

  • Splice analysis DONE: the AlphaGenome splice outputs showed no significant splice junction changes at the E1659A position. Quantile scores were ~0.99 for nearby genes.
  • Enhancer identification: Ensembl found 2 enhancers + 1 promoter in STRC region (see Ensembl REST API)
  • WT vs mutant attempted: dependency conflicts prevented direct comparison. Single variant scoring completed (11,854 rows).
  • Next: batch scoring of ClinVar STRC variants, mouse model prediction, contact maps
