BioPortfolio web search engine Research Report March 2001
By Silico Research (www.Silico-Research.com)
Executive summary
Product: BioPortfolio search engine
Date: 2nd March, 2001
Background. Silico Research has
been engaged by Bioportfolio to conduct an independent analysis of
Bioportfolio’s web search engine (http://www.bioportfolio.co.uk).
The search engine is designed to be used by pharmaceutical and
healthcare executives and scientists for sector-specific searches.
The search engine is built upon SmartLogik™ technology developed
by Muscat. Muscat is a subsidiary of Bright Station, a UK-based
company.
Bioportfolio takes a different approach from other search engines.
General search engines typically index all sites submitted to them.
Bioportfolio’s search engine indexes and classifies 4,000
sector-relevant web sites selected by Bioportfolio’s staff.
Bioportfolio plans to increase the number of sites indexed to 6,000
this year.
Bioportfolio also differs from other search engines in that it uses
a pharmaceutical and scientific thesaurus to classify and index
web-pages. General search engines typically index web pages without
reference to a thesaurus or taxonomy. Bioportfolio uses its taxonomy
to suggest words to add to the search and to allow users to refine
their searches.
Methodology. For the purposes
of our analysis we conducted four word-based searches on
Bioportfolio, Google (http://www.google.com) and Northern Light
(http://www.northernlight.com). Google and Northern Light were used
as comparisons because the evidence is that they are among the
search engines most heavily used by the scientific and pharmaceutical
communities. Search words included ‘proteome’, ‘bioinformatics’,
‘data-mining’ and ‘BLAST’.
Conclusion. We believe that the
Bioportfolio search engine is one of the most valuable tools
available to pharmaceutical executives and scientists who search
the web on a regular basis. We found that Bioportfolio typically
generated search results that are likely to be of higher value to
biopharmaceutical executives and scientists than those from either
Google or Northern Light. There were fewer expired pages in the
Bioportfolio search results and fewer low-value pages. We classified
pages as low value where, for example, they were student-orientated,
were predominantly advertising-related or were themselves directories
of other web pages.
Executive summary of findings
Bioportfolio’s main beneficial features
We found that Bioportfolio had a number of features that we believe
make it a more useful search engine for pharmaceutical and
scientific users than Google or Northern Light.
- Document types. Bioportfolio indexes HTML pages, PDF files, MS
PowerPoint files and MS Word documents. Whilst it has recently
begun to index PDF files, Google does not index PowerPoint files
or Word documents. Northern Light does not index PDF files or MS
Word documents. As many pharmaceutical and scientific web sites
include PowerPoint presentations or Word documents, the ability
to access both is a very useful feature of Bioportfolio’s
search engine. On average, 8% of the documents returned by
Bioportfolio in our test searches were in PowerPoint or Word
formats.
- Stemming. Bioportfolio searches for the stem of the words
entered. So, for example, entering the word ‘bioinformatic’
returns a result based on the occurrences of the stem
‘bioinformat’. Google and Northern Light search strictly on the
word entered.
- Searching. Both Bioportfolio and Northern Light return a
probabilistic result based upon the occurrence of the words
entered. This means that a number of words and alternatives can
be entered in order to generate a more precise search.
- Suggestions. Bioportfolio suggests additional words to search
for. So, for example, a search for ‘bioinformatic’ and ‘BLAST’
will return a number of additional terms to search on including
‘waterman’, ‘genbank’, ‘fasta’, ‘proteom’, ‘smith’ and ‘genom’.
We found this a very useful feature for refining searches.
- Dating. Both Bioportfolio and Northern Light return a date for
the document and allow for searches to be limited and sorted by
date. Google does not have this feature.
- Highlighting. Bioportfolio highlights the occurrences of the
search terms in the returned documents. This is a useful feature
in long or complex documents.
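The stemming behaviour described in the list above can be illustrated with a small sketch. This is a toy suffix-stripping stemmer written for this report's examples only; it is an assumption for illustration and not the algorithm used by Bioportfolio or SmartLogik. The suffix list and the function names `stem` and `matches` are hypothetical.

```python
# Toy suffix-stripping stemmer: an illustration only, not the engine's real algorithm.
# The suffix list is an assumption chosen to reproduce the stems quoted in the report.
SUFFIXES = ("ics", "ic", "es", "e", "s")  # checked in this order, longest first


def stem(word):
    """Lower-case a word and strip the first matching suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so very short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query, document):
    """Return True if any word in the document shares a stem with the query word."""
    target = stem(query)
    tokens = (token.strip(".,;:()'\"") for token in document.split())
    return any(stem(token) == target for token in tokens)
```

With this sketch, `stem("bioinformatic")` and `stem("bioinformatics")` both give `bioinformat`, so a query on either form matches documents containing the other. An engine that searches strictly on the word entered would treat the two as different terms.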
Suggestions for improvement
We believe that the Bioportfolio service would benefit from a
number of features:
- Accessibility. We would recommend that Bioportfolio seriously
considers reducing the number of screens that need to be passed
through in order to access the service. The user passes through
four screens, including a password screen, in order to execute a
search. This is likely to act as a deterrent to many users and
so to reduce the usage of the service.
- Site addition. We would suggest that Bioportfolio considers a
service allowing site owners to submit their site for inclusion
on the service. This is standard in other search engines and
would aid the task of making the service comprehensive. We
believe that there will be a high level of participation and a
low percentage of irrelevant sites because of the specialised
nature of the users of the service.
- Hyperlinks. We would suggest that Bioportfolio examines whether
sites that are hyperlinked from the indexed sites could be added
to the database automatically. This would significantly increase
the number of sites in the database without necessarily reducing
the quality of the sites.
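As a rough illustration of the hyperlink suggestion above, the following sketch collects the host names linked from an indexed page and reports those not yet in the database. It is a minimal example built on the Python standard library, not a description of how Bioportfolio's indexer works. The names `LinkCollector`, `candidate_sites` and `indexed_hosts` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def candidate_sites(page_url, html, indexed_hosts):
    """Return host names linked from the page that are not yet indexed."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    hosts = set()
    for link in collector.links:
        parsed = urlparse(link)
        # Ignore mailto:, javascript: and other non-web links.
        if parsed.scheme in ("http", "https") and parsed.hostname:
            hosts.add(parsed.hostname)
    return hosts - set(indexed_hosts)
```

The returned host names are only candidates. They could be queued for the same staff selection that the report describes for the existing sites, rather than being indexed unseen.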