About e-Species
This is a pure Python
CGI-based implementation of a taxonomically intelligent species
search engine. It searches biological databases for a taxonomic
name. The search is done "on the fly" using web services
(SOAP/XML) or URL APIs.
Classification
Synonyms and higher taxa for a taxon name are retrieved using
the Catalogue of Life
Web Service.
Description
Text snippets are fetched from Wikipedia
articles. A link to the original article is also displayed.
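As a rough illustration of this step, the sketch below builds a request for the introductory text of an article and pulls the snippet and article link out of the JSON reply. It is not the e-Species code: it assumes Python 3, the public MediaWiki Action API with its `extracts` property, and hypothetical function names (`build_extract_url`, `parse_extract`, `fetch_snippet`) and User-Agent string.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the English Wikipedia endpoint of the MediaWiki Action API.
WIKIPEDIA_API = "https://en.wikipedia.org/w/api.php"


def build_extract_url(title):
    """Build a query URL asking for the plain-text intro and the page URL."""
    params = {
        "action": "query",
        "prop": "extracts|info",
        "exintro": 1,        # only the text before the first section heading
        "explaintext": 1,    # plain text instead of HTML
        "inprop": "url",     # adds the "fullurl" field to each page
        "redirects": 1,
        "titles": title,
        "format": "json",
    }
    return WIKIPEDIA_API + "?" + urlencode(params)


def parse_extract(payload):
    """Return (snippet, article_url) from a decoded reply, or (None, None)."""
    pages = payload.get("query", {}).get("pages", {})
    for page in pages.values():
        if "missing" in page:
            continue
        return page.get("extract"), page.get("fullurl")
    return None, None


def fetch_snippet(title):
    """Fetch and parse the snippet for a title (performs a network request)."""
    # Hypothetical User-Agent value, used only for this example.
    request = Request(build_extract_url(title),
                      headers={"User-Agent": "e-species-example/0.1"})
    with urlopen(request) as response:
        return parse_extract(json.load(response))
```

Calling `fetch_snippet("Panthera leo")` would return the snippet together with the link to the original article, so both can be shown on the results page.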
Keywords
Keywords are extracted from the contents of Wikipedia articles using the Term Extraction service from FiveFilters.org.
Genomics
Queries to NCBI are performed using the Entrez
Programming Utilities. The ESearch
tool is used to look up a taxon name and, if the name is found,
the ESummary
tool is called to get basic statistics on what NCBI holds for
that taxon. Links to external information resources for the taxon
are retrieved using the ELink tool.
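The ESearch, ESummary and ELink sequence described above can be sketched as follows. This is a minimal illustration, not the NCBISearch class itself: it assumes Python 3, the `taxonomy` database, the ELink `llinks` command for external resources, and hypothetical helper names (`eutils_url`, `esearch_url`, `parse_ids`, `follow_up_urls`).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base address of the NCBI Entrez Programming Utilities.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def eutils_url(tool, **params):
    """Build the URL for one E-utility, e.g. tool='esearch'."""
    return EUTILS + tool + ".fcgi?" + urlencode(params)


def esearch_url(name, db="taxonomy"):
    """Step 1: look up a taxon name (the database choice is an assumption)."""
    return eutils_url("esearch", db=db, term=name)


def parse_ids(esearch_xml):
    """Extract the identifiers listed in an ESearch XML reply."""
    root = ET.fromstring(esearch_xml)
    return [node.text for node in root.findall("./IdList/Id")]


def follow_up_urls(taxon_id):
    """Steps 2 and 3: only needed when ESearch actually found the name."""
    return {
        "esummary": eutils_url("esummary", db="taxonomy", id=taxon_id),
        # cmd=llinks asks ELink for LinkOut (external resource) URLs.
        "elink": eutils_url("elink", dbfrom="taxonomy", id=taxon_id,
                            cmd="llinks"),
    }
```

If `parse_ids` returns an empty list, the name was not found and the follow-up calls are skipped; otherwise the first identifier is passed to `follow_up_urls`.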
Maps
Distribution maps for a taxon are retrieved from GBIF using code inspired by the Species
Distribution Widget written by Tim Robertson and Dave Martin.
Images
Wikimedia Commons is used
to find up to five images for the query term.
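One way to express the five-image limit is shown below. The sketch assumes Python 3 and the MediaWiki Action API on Wikimedia Commons (a search restricted to the File namespace, combined with `imageinfo`); the function names are hypothetical and the actual e-Species method may work differently.

```python
from urllib.parse import urlencode

# Assumption: the Action API endpoint of Wikimedia Commons.
COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def build_image_search_url(term, limit=5):
    """Build a search for files matching the term, capped at `limit` hits."""
    params = {
        "action": "query",
        "generator": "search",
        "gsrsearch": term,
        "gsrnamespace": 6,     # namespace 6 holds files (images and media)
        "gsrlimit": limit,
        "prop": "imageinfo",
        "iiprop": "url",
        "format": "json",
    }
    return COMMONS_API + "?" + urlencode(params)


def parse_image_urls(payload, limit=5):
    """Return at most `limit` image URLs, ordered by search rank."""
    pages = payload.get("query", {}).get("pages", {})
    # The "index" field carries the rank assigned by the search generator.
    ranked = sorted(pages.values(), key=lambda page: page.get("index", 0))
    urls = []
    for page in ranked:
        for info in page.get("imageinfo", []):
            if "url" in info:
                urls.append(info["url"])
    return urls[:limit]
```

Applying the limit both in the request and when parsing keeps the result at five images even if the service returns more than was asked for.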
Documents
Documents are retrieved from PubMed. The script retrieves up to ten references for the query term.
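The ten-reference limit can be illustrated with the Entrez utilities in JSON mode. This sketch is an assumption about how such a lookup might be written in Python 3, not the PubMedSearch class: the function names and the citation layout are hypothetical, and the field names (`uids`, `authors`, `title`, `source`, `pubdate`) follow the ESummary JSON output.

```python
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"
MAX_REFERENCES = 10  # the page shows up to ten references


def pubmed_search_url(term, retmax=MAX_REFERENCES):
    """ESearch against PubMed, capped at `retmax` identifiers."""
    params = {"db": "pubmed", "term": term, "retmax": retmax,
              "retmode": "json"}
    return EUTILS + "esearch.fcgi?" + urlencode(params)


def pubmed_summary_url(pmids):
    """ESummary for the identifiers returned by the search."""
    params = {"db": "pubmed", "id": ",".join(pmids), "retmode": "json"}
    return EUTILS + "esummary.fcgi?" + urlencode(params)


def format_references(summary):
    """Turn a decoded ESummary JSON reply into one citation string per record."""
    result = summary.get("result", {})
    references = []
    for uid in result.get("uids", []):
        record = result.get(uid, {})
        authors = ", ".join(a["name"] for a in record.get("authors", []))
        references.append("{}. {} {}, {}".format(
            authors, record.get("title", ""), record.get("source", ""),
            record.get("pubdate", "")))
    return references
```

The identifiers come from the `idlist` field of the ESearch reply and are then passed to `pubmed_summary_url`.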
Related Projects
Rod
Page wrote the original iSpecies
taxonomically based search engine, which also uses web services. David Shorthouse has
written an iSpecies
Clone that uses JSON.
Source Code
The e-Species search engine was originally developed on an IBM-PC
compatible machine running Ubuntu Linux 8.04 (Hardy Heron) and
Python 2.5.
The e-Species source code is released under the terms of the GNU General Public
License, and is available from SourceForge.
News
- Version 1.00, 29th Jun 08: Initial public release
- Version 1.01, 6th Jul 08: Added spelling suggestion from Yahoo!
Spelling Suggestion service to provide a suggested
spelling correction for a given name.
- Version 1.02, 10th Jul 08: Improved handling of synonym
status and fixed a bug in spelling suggestion.
- Version 1.03, 11th Jul 08: Added a method to class
COLSearch to check for the existence of a taxon name.
- Version 1.04, 31st Jul 08: Added automated tagging from Yahoo!
Term Extraction for Wikipedia snippet.
- Version 1.05, 1st Aug 08: Added a method to class
NCBISearch to return a list of external information
resources for search name.
- Version 1.06, 11th Aug 08 - Added a function to strip out
markup tags from Wikipedia snippet.
- Version 1.07, 05th Sep 08 - Fixed a bug in handling
Unicode characters in the author of a taxon name returned
from CoL
- Version 1.08, 09th Sep 08 - Renamed class
YahooSearchImage to YahooSearch and added functions
spellingSuggestion (renamed to spellCheck) and
termExtraction (renamed to termExtract) as new methods.
- Version 1.09, 21st Oct 08 - Removed the dependency on the Set
module, using tuple instead, and fixed a problem with the
display of image thumbnails from Yahoo search.
- Version 1.10, 19th Mar 09 - Rewrote class
GoogleScholarSearch to remove the dependency on the BeautifulSoup
module, using an HTMLParser instead, and included a
default value for class YahooSearch number of results
- Version 1.11, 25th Mar 09 - Improved handling of returned
references from Google Scholar and rewrote class
WikipediaSearch to make use of Dapper
- Version 1.12, 20th Jul 09 - Added some JavaScript for
client-side search form validation
- Version 1.13, 21st Jul 09 - Added stylesheet for better
form display and minor fixes
- Version 1.14, 14th Apr 10 - Adjusted for changes in CoL
webservice calls
- Version 1.15, 29th Jul 11 - Removed Yahoo search class
because of useless calls to deprecated Yahoo web services
and replaced GoogleScholarSearch with a new
GoogleSearch class to search from both Google Scholar and
Google Images
- Version 1.16, 2nd Aug 11 - Improved Google Images search
- Version 1.17, 14th Jul 12 - Fixed a bug in the query string
variable in class CoLSearch and updated URL to the latest
Annual Checklist version website
- Version 1.18, 20th Sep 13 - Added new routine for retrieving
images from Google and other minor improvements
- Version 1.19, 16th Aug 14 - Substituted a FiveFilters proxy
webservice for deprecated calls to Yahoo! Search
- Version 1.20, 22nd Sep 14 - Fixed the FiveFilters webservice URL
- Version 1.30, 30th Jan 19 - Removed class GoogleSearch and
replaced it with a PubMedSearch class to get bibliographic references
from PubMed. Added a method for fetching images
from Wikimedia Commons to class WikipediaSearch. Restored dependency on
BeautifulSoup
Todo
- Make use of synonyms in the searches, merging the results
from searches using different names, and present those
together (as suggested by Rod Page in the iPhylo
Blog).
- Allow searches using common names
Acknowledgements
Thanks to Rod Page for his implementation tips on the iSpecies Blog and
overall inspiration, to Edinaldo Nelson dos Santos-Silva and Projeto Biotupé for continuing support, to Eduardo
Dalcin for crash testing and pointing out several flaws, to Flávio Coelho and other
members of the PyScience-Brasil
discussion list for support and constructive comments, and to Douglas Soares de
Andrade for providing patches and creating an svn trunk for
the e-Species source code.
Contact
Send comments and suggestions to Mauro J. Cavalcanti