The automation of Science
To better understand the reasons why the
identity of the genes encoding these enzymes has
remained obscure for so long, we investigated
their comparative genomics in detail (16). The
likely explanation is a combination of three com-
plicating factors: gene duplications with retention
of overlapping function, enzymes that catalyze
more than one related reaction, and existing func-
tional annotations. Adam’s systematic bioinformatic
and quantitative phenotypic analyses were required
to unravel this web of functionality.
Use of a robot scientist enables all aspects of a
scientific investigation to be formalized in logic.
For the core organization of this formalization,
we used the ontology of scientific experiments:
EXPO (11, 12). This ontology formalizes generic
knowledge about experiments. For Adam, we
developed LABORS, a customized version of
EXPO, expressed in the description logic lan-
guage OWL-DL (17). Application of LABORS
produces experimental descriptions in the logic
programming language Datalog (18). In the course
of its investigations, Adam observed 6,657,024
optical density (OD595nm) measurements (form-
ing 26,495 growth curves). These data are held in
a MySQL relational database. Use of LABORS
resulted in a formalization of the scientific argu-
mentation involving over 10,000 different research
units (segments of experimental research). This
has a nested treelike structure, 10 levels deep, that
logically connects the experimental observations
to the experimental metadata (Fig. 3). This structure
resembles the trace of a computer program
and takes up 366 Mbytes (16). Making such
experimental structures explicit renders scien-
tific research more comprehensible, reproduc-
ible, and reusable. This paper may be considered
as simply the human-friendly summary of the
formalization.
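The nested, treelike organization of research units described above can be illustrated with a minimal sketch. This is not LABORS or EXPO: the class `ResearchUnit`, its fields, and the level names (`investigation`, `study`, `trial`, `growth_curve`) are hypothetical, chosen only to show how a tree can link an observation back to the metadata of every enclosing unit.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ResearchUnit:
    """One segment of experimental research (hypothetical structure, not LABORS)."""
    name: str
    metadata: Dict[str, str] = field(default_factory=dict)
    observations: List[float] = field(default_factory=list)
    children: List["ResearchUnit"] = field(default_factory=list)

    def add(self, child: "ResearchUnit") -> "ResearchUnit":
        """Attach a child unit and return it so calls can be chained."""
        self.children.append(child)
        return child

    def depth(self) -> int:
        """Number of levels in the subtree rooted at this unit."""
        return 1 + max((c.depth() for c in self.children), default=0)

    def count_units(self) -> int:
        """Total number of units in the subtree, including this one."""
        return 1 + sum(c.count_units() for c in self.children)

    def path_to(self, target: str) -> Optional[List[str]]:
        """Chain of unit names from this unit down to the named unit, if present."""
        if self.name == target:
            return [self.name]
        for c in self.children:
            sub = c.path_to(target)
            if sub is not None:
                return [self.name] + sub
        return None


# Illustrative values only; none of these are taken from Adam's data.
investigation = ResearchUnit("investigation", {"organism": "S. cerevisiae"})
study = investigation.add(ResearchUnit("study", {"aim": "hypothetical example"}))
trial = study.add(ResearchUnit("trial", {"medium": "defined"}))
curve = trial.add(ResearchUnit("growth_curve", {"unit": "OD595nm"}, [0.05, 0.09, 0.17]))
```

Calling `investigation.path_to("growth_curve")` walks from the top-level unit down to the unit holding the observations, which is the kind of logical connection between observations and metadata that the formalization makes explicit.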
A major motivation for the formalization of
experimental knowledge is the expectation that
such knowledge is more easily reused to answer
other scientific questions. To test this, we investi-
gated whether we could reuse Adam’s functional
genomic research (16). An example question
investigated was the relative growth rates (μmax)
in rich and defined media of the deletion strains
compared with those of the wild type. What
was observed, in both media, was a skewed distribution,
with a few deletants having a much
lower μmax than that of the wild type, but most
having a slightly higher μmax. These observations
question the common assumption that wild-type
S. cerevisiae is optimized for μmax and provide
quantitative test data for yeast systems biology
models (19).
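As a rough illustration of the kind of reuse described above, the sketch below estimates a maximum growth rate from a single optical density curve and expresses it relative to the wild type. The text does not say how Adam's growth rates were computed, so the method here is an assumption: the steepest least-squares slope of ln(OD) against time over a short sliding window. The function names, the window size, and the example data are all hypothetical.

```python
import math
from typing import Sequence


def max_growth_rate(times_h: Sequence[float], od: Sequence[float], window: int = 3) -> float:
    """Steepest least-squares slope of ln(OD) vs. time over `window` consecutive points.

    Assumes strictly positive OD values and strictly increasing time points.
    """
    if len(times_h) != len(od) or window < 2 or len(od) < window:
        raise ValueError("need equal-length series with at least `window` points")
    log_od = [math.log(v) for v in od]
    best = float("-inf")
    for i in range(len(od) - window + 1):
        t = times_h[i:i + window]
        y = log_od[i:i + window]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        # Ordinary least-squares slope for this window.
        num = sum((a - t_mean) * (b - y_mean) for a, b in zip(t, y))
        den = sum((a - t_mean) ** 2 for a in t)
        best = max(best, num / den)
    return best


def relative_rate(strain_rate: float, wild_type_rate: float) -> float:
    """Growth rate of a strain as a fraction of the wild-type rate."""
    return strain_rate / wild_type_rate


# Synthetic curves (not Adam's data): pure exponential growth at known rates.
hours = [0.0, 1.0, 2.0, 3.0, 4.0]
wild_type_od = [0.1 * math.exp(0.5 * t) for t in hours]
deletant_od = [0.1 * math.exp(0.4 * t) for t in hours]

wild_type_mu = max_growth_rate(hours, wild_type_od)
deletant_mu = max_growth_rate(hours, deletant_od)
ratio = relative_rate(deletant_mu, wild_type_mu)
```

A ratio below 1 corresponds to a deletant that grows more slowly than the wild type, and a ratio above 1 to one that grows faster. Real curves include lag and stationary phases and measurement noise, so a fixed three-point window would likely be too fragile in practice.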
It could be argued that the scientific knowledge
“discovered” by Adam is implicit in the
formulation of the problem and is therefore not
novel. This argument that computers cannot
originate anything is known as Lady Lovelace’s
objection (20): “The Analytical Engine has no
pretensions to originate anything. It can do
whatever we know how to order it to perform”
(her italics). We accept that the knowledge
automatically generated by Adam is of a modest
kind. However, this knowledge is not trivial, and
in the case of the genes encoding 2A2OA, it
sheds light on, and perhaps solves, a 50-year-old
puzzle (21).
Adam is a prototype and could be greatly
improved. Its hardware and software are “brittle,”
so although Adam is capable of running for a few
days without human intervention, it is advisable
to have a technician nearby in case of problems.
The integration of Adam’s artificial intelligence
(AI) software also needs to be enhanced so that it
works seamlessly. To extend Adam, we have de-
veloped software to enable external users to pro-
pose hypotheses and experiments, and we plan to
automatically publish the logical descriptions of
automated experiments. The idea is to develop a
way of enabling teams of human and robot scientists
to work together. The greatest research
challenge will be to improve the scientific intelligence
of the software. We have shown that a
simple form of hypothesis-led discovery can be
automated. What remain to be determined are the
limits of automation.
Tags: analysis, biology, data, Research, results, science, work
This entry was posted on Wednesday, March 4th, 2009 at 12:22 pm and is filed under Articles.