Rentrez

Getting NCBI data into your R sessions

David Winter (@theatavism), Arizona State University

The NCBI

The NCBI has a lot of data... like a lot

genbank_35

genbank_CDROM

(thanks @yokofakun/wikipedia!)

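library(devtools)              # provides load_all()
load_all("~/src/rentrez")      # load the development copy of rentrez
all_dbs <- entrez_dbs()[-47]   # drop the 47th database (there's always one...)
# Number of records in a database, read from its Entrez summary
how_many_recs <- function(db) as.integer(entrez_db_summary(db)["Count"])
nrecs <- sapply(all_dbs, how_many_recs)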

The NCBI

dotchart(nrecs)

(Figure: dot chart of the number of records in each NCBI database)

The NCBI and entrez

Via the web...

The NCBI and entrez

...or an API
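
To make "an API" concrete, here is a minimal sketch of what a request to the NCBI's E-utilities looks like as a URL. The helper name `build_eutils_url` is hypothetical (it is not part of rentrez), and the search term is only an illustration. The code builds the URL but does not send it.

```r
# Base URL of the NCBI E-utilities
eutils_base <- "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"

# Hypothetical helper (not part of rentrez): build, but do not send,
# an E-utilities request URL from a utility name and named parameters.
build_eutils_url <- function(util, ...) {
  params <- list(...)
  # Percent-encode each value so spaces and brackets are safe in a URL
  values <- vapply(
    params,
    function(p) utils::URLencode(as.character(p), reserved = TRUE),
    character(1)
  )
  query <- paste0(names(params), "=", values, collapse = "&")
  paste0(eutils_base, util, ".fcgi?", query)
}

# An esearch request: look up to 5 PubMed records matching a term
url <- build_eutils_url("esearch", db = "pubmed", term = "Drosophila", retmax = 5)
print(url)
```

Every utility (esearch, esummary, efetch, ...) follows the same pattern, which is why a thin wrapper in R is practical.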

The NCBI and entrez in R

  • Lots of packages use the API for a few cases
  • The genomes package in BioC is pretty complete...
  • ...and there is rentrez, which I'm going to talk about

Demo
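
The live demo is not captured in these slides. As a stand-in, here is a minimal sketch of a typical rentrez session: search a database, then pull summaries for the hits. The function name `find_titles`, the search term, and the defaults are assumptions for illustration. It needs the rentrez package and network access to the NCBI, so the example call only runs in an interactive session.

```r
# Hypothetical helper: titles of the first `n` records matching `term`.
# Requires the rentrez package and network access when called.
find_titles <- function(term, db = "pubmed", n = 5) {
  # Search the database; the result carries the matching IDs and a hit count
  hits <- rentrez::entrez_search(db = db, term = term, retmax = n)
  if (length(hits$ids) == 0) {
    return(character(0))
  }
  # Fetch summary records for those IDs and pull out the title field
  summaries <- rentrez::entrez_summary(db = db, id = hits$ids)
  unlist(rentrez::extract_from_esummary(summaries, "title"))
}

# Example call (illustrative search term), skipped in non-interactive runs
if (interactive()) {
  print(find_titles("Drosophila"))
}
```

Full records, rather than summaries, would come from `entrez_fetch()` with a suitable `rettype` such as `"fasta"` for sequence databases.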

What's next

A stable and feature-complete release

A little help

Ensembl REST in R