JeUtils - Access NCBI from Java


JeUtils is an open source command-line utility provided by Algosome to access the National Center for Biotechnology Information (NCBI) databases. Its purpose is to automate queries to NCBI and to support custom parsing of NCBI output to retrieve particular information. Automation and custom parsing require some knowledge of scripting or programming.

A simple use of JeUtils would be to search a database for a particular gene or keyword. The workflow consists of:

  • Construct an EntrezSearch instance
  • Configure the EntrezSearch object with the appropriate parameters (keywords, database, etc.)
  • Perform the search query (doQuery)
  • Construct an EntrezFetch instance from the completed search
  • Configure the EntrezFetch object with the appropriate parameters (such as the return type)
  • Perform the fetch query (doQuery), passing a parser that handles the returned data

The following code snippet shows this workflow using the JeUtils API:


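// Needs java.io.BufferedReader, java.io.IOException, java.io.InputStream and
// java.io.InputStreamReader, plus the JeUtils classes used below.

// Search the nucleotide database, retrieving at most one record.
EntrezSearch search = new EntrezSearch();
search.setDatabase(EntrezSearch.DB_NUCLEOTIDE);
search.setTerm("ubiquitin");
search.setMaxRetrieval(1);
search.doQuery();

// Pause between consecutive requests to NCBI.
try{
    Thread.sleep(1000);
}catch(InterruptedException e){
    e.printStackTrace();
    throw new RuntimeException(e);//the original 'throw Exception(e)' does not compile
}

// Fetch the record(s) found by the search.
EntrezFetch fetch = new EntrezFetch(search);
fetch.setRetType("gb");//GenBank
fetch.doQuery(new InputStreamParser(){
    @Override
    public void parseFrom(int start){}//do nothing
    @Override
    public void parseTo(int end){}//do nothing
    @Override
    public void parseInput(InputStream is) throws IOException{
        // Print every line of the response to standard output.
        BufferedReader br = new BufferedReader(new InputStreamReader(is));
        String line = null;
        while ( ( line = br.readLine() ) != null ){
            System.out.println(line);
        }
    }
});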

The code above searches the nucleotide database for the keyword 'ubiquitin', limiting retrieval to a single record, and then fetches that record in GenBank format. The implementation of InputStreamParser simply prints the result to the command line. More complex implementations could parse out the data of interest and/or write the output to a file or database.
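As an illustration of a more selective parser, the sketch below collects only the record names from GenBank flat-file text by looking for lines that begin with LOCUS. The class LocusNameExtractor and its method are hypothetical helpers written for this example and are not part of JeUtils. Within JeUtils, the same loop could presumably be placed inside parseInput.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of JeUtils): pulls LOCUS names out of GenBank text. */
final class LocusNameExtractor {

    private LocusNameExtractor() {}

    /** Reads the stream line by line and returns the name found on each LOCUS line. */
    static List<String> extractLocusNames(InputStream is) throws IOException {
        List<String> names = new ArrayList<>();
        BufferedReader br = new BufferedReader(new InputStreamReader(is, StandardCharsets.UTF_8));
        String line;
        while ((line = br.readLine()) != null) {
            // Each GenBank record starts with a line such as "LOCUS       NAME   1234 bp ..."
            if (line.startsWith("LOCUS")) {
                String[] fields = line.trim().split("\\s+");
                if (fields.length > 1) {
                    names.add(fields[1]); // the second field is the locus name
                }
            }
        }
        return names;
    }
}
```

The returned list could then be written to a file or database instead of printing the whole record.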

Note that this utility depends on the NCBI eUtils API, so its usage is subject to NCBI's restrictions. Most notably, please allow a few seconds between consecutive queries using the Thread.sleep() method, especially during periods of high usage (the work week, during US business hours).
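When several queries are issued in a row, the pause can be centralized rather than repeated around every call. The following is a minimal sketch of one way to do that. QueryThrottle is a hypothetical helper written for this example, not part of JeUtils, and the interval passed to it is an assumption to be chosen according to NCBI's current usage guidelines.

```java
/** Hypothetical helper (not part of JeUtils): enforces a minimum gap between calls. */
final class QueryThrottle {

    private final long minIntervalMillis;
    private long lastCallNanos;
    private boolean hasRun = false;

    QueryThrottle(long minIntervalMillis) {
        if (minIntervalMillis < 0) {
            throw new IllegalArgumentException("interval must not be negative");
        }
        this.minIntervalMillis = minIntervalMillis;
    }

    /**
     * Returns immediately on the first call. On later calls, sleeps until at
     * least minIntervalMillis has passed since the previous call returned.
     */
    synchronized void await() {
        if (hasRun) {
            long elapsedMillis = (System.nanoTime() - lastCallNanos) / 1_000_000L;
            long remaining = minIntervalMillis - elapsedMillis;
            if (remaining > 0) {
                try {
                    Thread.sleep(remaining);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // restore the interrupt flag
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
        }
        lastCallNanos = System.nanoTime();
        hasRun = true;
    }
}
```

A single instance, for example new QueryThrottle(3000), could be shared by the code that issues requests, with await() called just before each doQuery().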




Comments

  • Brij Brij   -   February, 4, 2015

    How can I use EInfo to retrieve the list of databases from NCBI? Thank you

  • Greg Cope   -   November, 30, -0001

    Please see EUtils Quick Start for information on how to list the databases

  • Akash kumar   -   February, 19, 2016

    Thank you for your code, you helped me a lot.

  • Diana Lemos   -   March, 2, 2016

    I want to search in ClinVar but I can't find this database in the options.

  • Greg Cope   -   March, 2, 2016

    Diana, this database isn't hard-coded. Presuming that database is available via eUtils, you will have to specify the string name of the database as a parameter to the setDatabase method, something like setDatabase("clinvar");

  • Diana Lemos   -   March, 14, 2016

    I'm using this example to access dbSNP but I can't find a way to access the snp details. All I can find is the summary.



© 2008-2017 Greg Cope