Introduction
The problem with combining marine geological data from years of sampling
and research is its diversity. The data have been collected by geologists,
engineers, surveyors and ecologists, all with different techniques and
project priorities.
As a result, the usual relational database (RDBMS) type of aggregation is
not very useful: the data are distributed too sparsely across the
sample/parameter matrix. Instead we advocate a data mining technique which
pre-processes the data before incorporation into RDBMS and GIS (Geographic
Information Systems).
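The sparsity of the sample/parameter matrix can be illustrated with a minimal sketch. The records, site codes and parameter names below are hypothetical, invented only for illustration; they are not taken from auSEABED. The function reports what fraction of the cells would be filled if every sample were given a column for every parameter seen in any dataset.

```python
# Hypothetical records: each dataset reports only what its authors measured.
records = [
    {"site": "A1", "grain_size_mm": 0.25, "carbonate_pct": 40.0},
    {"site": "B7", "shear_strength_kpa": 12.0},
    {"site": "C3", "description": "muddy sand with shell fragments"},
    {"site": "D2", "sound_speed_m_s": 1650.0, "grain_size_mm": 0.5},
]


def fill_fraction(records, key="site"):
    """Return the fraction of filled cells in the sample/parameter matrix."""
    # Every parameter name seen in any record becomes a column.
    parameters = sorted({name for rec in records for name in rec if name != key})
    if not records or not parameters:
        return 0.0
    filled = sum(1 for rec in records for name in parameters if name in rec)
    return filled / (len(records) * len(parameters))


print(fill_fraction(records))
```

Even in this toy case most cells are empty, and the fraction tends to fall as more datasets with their own parameters are added.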
Australian EEZ
Australia's EEZ occupies 1/12 of the globe's area, and the country now has
a working database of seafloor characteristics, named auSEABED, that is
used by engineers, ecologists, researchers, defence, policy makers and the
community at large for mapping, statistics, query and input to models. The
database holds over 120,000 attributed sample sites from over 289 datasets.
A parallel structure (usSEABED) has been built for the US west coast in
collaboration with the USGS, and another independently for SE Asia, all
using the core software and data structure, which is bundled as dbSEABED.
The benefits of aggregating the multiple datasets that come from present
(and past) seabed sampling activities are manifest. In the Australian and
US cases we have applied the outputs to these issues:
i) planning offshore marine parks and conservation areas
ii) tactics planning in naval mine countermeasures
iii) sonar prediction, both for naval systems and for whale communication distances
iv) pipeline route planning, including the question of self-burial
v) fisheries habitat assessments involved in regulation of activities
vi) research cruise planning, for example for stratified sampling
vii) seabed stability modelling, under wave, tide and current regimes
viii) input to ocean and inshore nutrient modelling
ix) guiding sonar searches for wrecks and seabed obstacles.
The data structure
The data, more or less in the terms and units of the original authors, are
held in a mineable set of Data Resource Files. Although these are ASCII
files, they are not flat in structure; they are more like an XML-type tree
structure. The ASCII format is important: it creates a vendor-independent
legacy of data that can be worked on scientifically with both commercial
and research software packages.
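As a rough illustration of a tree-structured (rather than flat) ASCII file, the sketch below parses indentation-nested "key: value" lines into nested dictionaries. The layout, the field names and the parse_tree helper are assumptions made for this example only; the actual Data Resource File syntax is not described here and may differ substantially. Values are kept as strings so that the original terms and units survive untouched.

```python
def parse_tree(text, indent=2):
    """Parse indentation-nested 'key: value' lines into nested dicts."""
    root = {}
    stack = [(-1, root)]  # (depth, container) pairs, innermost last
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        depth = (len(raw) - len(raw.lstrip(" "))) // indent
        key, _, value = raw.strip().partition(":")
        value = value.strip()
        # Climb back up to the container that owns this depth.
        while stack[-1][0] >= depth:
            stack.pop()
        parent = stack[-1][1]
        if value:
            parent[key] = value  # leaf: keep the author's text and units
        else:
            child = {}
            parent[key] = child  # branch: open a new nested level
            stack.append((depth, child))
    return root


# Hypothetical file content, not a real auSEABED record.
SAMPLE = """\
sample:
  site: A1
  position:
    lat: -33.85
    lon: 151.30
  texture:
    description: muddy sand
    grain_size: 0.25 mm
"""

tree = parse_tree(SAMPLE)
print(tree["sample"]["texture"]["grain_size"])
```

Because each sample carries only the branches its authors recorded, a file of this kind avoids the empty columns a flat table would need.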
A wide range of data is held: lithological, textural, compositional,
geochemical, petrological, geotechnical and geoacoustic data, sedimentary
structures, and benthic biota. Although we concentrate on point-sample
data, polygon and polyline formats can also be treated. Down-core samplings
and analyses are also handled appropriately. As time goes on, the resolution
of the database increases by the addition of more data and construction
of extra data mining