JeUtils - Access NCBI from Java
JeUtils is an open-source command-line utility provided by Algosome to access the National Center for Biotechnology Information (NCBI) databases. Its purpose is to automate queries to NCBI and to allow custom parsing of NCBI output to retrieve particular information (automation and custom parsing require some knowledge of scripting or programming).
A simple use of JeUtils would be to search a database for a particular gene or keyword. The workflow consists of:
- Constructing an EntrezSearch instance
- Configuring the EntrezSearch object with the appropriate parameters (keywords, database, etc.)
- Performing the search query (doQuery)
- Constructing an EntrezFetch instance from the completed search
- Configuring the EntrezFetch object with the appropriate parameters
- Performing the fetch query (doQuery)
The following code snippet is an example that uses the JeUtils API:
    // Requires java.io.BufferedReader, java.io.IOException, java.io.InputStream and
    // java.io.InputStreamReader, plus the JeUtils classes used below.

    // Search the nucleotide database for a keyword.
    EntrezSearch search = new EntrezSearch();
    search.setDatabase(EntrezSearch.DB_NUCLEOTIDE);
    search.setTerm("ubiquitin");
    search.setMaxRetrieval(1);
    search.doQuery();

    // Pause between consecutive queries to respect NCBI usage restrictions.
    try {
        Thread.sleep(1000);
    } catch (InterruptedException e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    }

    // Fetch the records found by the search.
    EntrezFetch fetch = new EntrezFetch(search);
    fetch.setRetType("gb"); // GenBank
    fetch.doQuery(new InputStreamParser() {
        @Override
        public void parseFrom(int start) {} // do nothing

        @Override
        public void parseTo(int end) {} // do nothing

        @Override
        public void parseInput(InputStream is) throws IOException {
            BufferedReader br = new BufferedReader(new InputStreamReader(is));
            String line = null;
            while ((line = br.readLine()) != null) {
                System.out.println(line);
            }
        }
    });
The code above searches the nucleotide database for the keyword 'ubiquitin' and limits retrieval to a single record. The implementation of InputStreamParser simply prints the result to the command line. More complex implementations could parse out the relevant data and write it to a file or database.
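As a rough illustration of such a "more complex implementation", the sketch below pulls a few header fields out of GenBank flat-file text instead of printing every line. The class GenBankHeaderExtractor and its extract method are hypothetical names and are not part of JeUtils. The sketch assumes the fetched stream is GenBank text (retType "gb") in which keywords such as LOCUS start in the first column and a line beginning with "//" ends a record. It keeps only the first line of each field, so multi-line DEFINITION values would be truncated.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of JeUtils): extracts header fields from GenBank flat-file text. */
class GenBankHeaderExtractor {

    // Keywords to capture; chosen for illustration only.
    private static final String[] KEYS = {"LOCUS", "DEFINITION", "ACCESSION"};

    /** Reads the first record in the stream and returns the first line of each captured field. */
    static Map<String, String> extract(InputStream is) throws IOException {
        Map<String, String> fields = new LinkedHashMap<>();
        BufferedReader br = new BufferedReader(new InputStreamReader(is, StandardCharsets.UTF_8));
        String line;
        while ((line = br.readLine()) != null) {
            if (line.startsWith("//")) {
                break; // assumed end-of-record marker: stop after the first record
            }
            for (String key : KEYS) {
                // A keyword starts in column 1 and is followed by whitespace and its value.
                if (line.startsWith(key + " ") && !fields.containsKey(key)) {
                    fields.put(key, line.substring(key.length()).trim());
                }
            }
        }
        return fields;
    }
}
```

Under these assumptions, the body of parseInput in the snippet above could call GenBankHeaderExtractor.extract(is) and then write the returned map to a file or database rather than to the console.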
Note that this utility depends on the NCBI E-utilities (eUtils) API, so its usage is subject to NCBI's restrictions. Most notably, please allow a few seconds between consecutive queries, using the Thread.sleep() method, especially during periods of high usage (US business hours during the work week).
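When more than two queries are chained, scattering Thread.sleep() calls through the code becomes easy to get wrong. One possible way to centralize the pause is a small throttle object, sketched below. QueryThrottle and awaitTurn are hypothetical names and are not part of JeUtils; the interval is whatever the caller chooses (for example 3000 ms, an assumed value in line with the "few seconds" suggested above).

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of JeUtils): enforces a minimum pause between consecutive queries. */
class QueryThrottle {
    private final long minIntervalNanos;
    private long lastTurnNanos;
    private boolean firstTurn = true;

    QueryThrottle(long minIntervalMillis) {
        this.minIntervalNanos = TimeUnit.MILLISECONDS.toNanos(minIntervalMillis);
    }

    /** Blocks until at least the configured interval has passed since the previous call returned. */
    synchronized void awaitTurn() throws InterruptedException {
        if (!firstTurn) {
            long remaining = minIntervalNanos - (System.nanoTime() - lastTurnNanos);
            if (remaining > 0) {
                TimeUnit.NANOSECONDS.sleep(remaining);
            }
        }
        firstTurn = false;
        lastTurnNanos = System.nanoTime();
    }
}
```

A caller would create one instance, such as new QueryThrottle(3000), and invoke awaitTurn() immediately before each doQuery call. The first call returns at once; later calls wait only for whatever part of the interval has not already elapsed.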
Comments
- Brij Brij - February, 4, 2015
How can I use EInfo to retrieve the list of databases from NCBI? Thank you.
- Greg Cope
Please see EUtils Quick Start for information on how to list the databases
- Akash kumar - February, 19, 2016
Thank you for this code, you helped me a lot.
- Diana Lemos - March, 2, 2016
I want to search in ClinVar but I can't find this database in the options.
- Greg Cope - March, 2, 2016
Diana, this database isn't hard-coded. Presuming that the database is available via eUtils, you will have to specify its string name as a parameter to the setDatabase method, something like setDatabase("clinvar");
- Diana Lemos - March, 14, 2016
I'm using this example to access dbSNP but I can't find a way to access the snp details. All I can find is the summary.