Marine Environmental Sciences Program
The Galeta Oil Spill Project
In April 1986, a
major oil spill polluted an area of coral reefs, mangrove forests, and
grassbeds along the Caribbean coast of the Republic of Panama. The area
affected included a biological reserve of the Smithsonian Tropical
Research Institute (STRI) where baseline biological and environmental
data had been collected for the previous 15 years. Shortly after the
spill, a grant to study its effects was received from the Minerals Management Service (MMS) of the United
States Department of the Interior. Data were then collected from May
1986 to October 1991, and the results of the study were published by the MMS as a technical report and as an executive summary.
The final report contains the documentation of the collection
methods for the files submitted here. Most of the documentation in this
document was taken from a late draft of the final report, although in
some cases this document contains additional documentation for
particular files. The excerpted parts are indicated with quotation marks;
the other documentation was written by Karl
Kaufmann, the data manager for the project.
The study was carried out by the Smithsonian Tropical Research
Institute under Minerals Management Service contracts 14-12-0001-30355
and 14-12-0001-30393. All of the data collected under these contracts are
presented here. They are also available from the National
Oceanographic Data Center (Accession # 9400033).
The study continued in part for an additional year under another
grant, but those data are not included, nor are data collected before the
spill or data collected by the STRI ESP program (such as the urchin data
from Galeta). Most of the data on the chemistry of the oil are presented
in tables in the final report and are not available in digital format.
Organization of the data from the project
The project was divided into 8 subprojects to study the chemistry of
the oil and 7 different environments (listed below) affected by the
spill. Each subproject was headed by a scientist-in-charge
and produced 4 to 16 different sets of data. The data were
originally kept in Dbase files, but have been translated into
comma-delimited files (format described below).
Each subproject has a three-letter code, called the study ID (SID),
which forms the first three letters of the name of each file pertaining
to it. Each subproject has up to 16 different sets of data, and each of
these sets is assigned a single letter, which is used as the fourth
letter of each file name. Most of the files have the characters '_M' as
the next two characters, indicating that the file is the main data file
for that data set. Files whose names begin with the three-letter SID and
end with "_S" contain the species list for that subproject.
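
The naming scheme above can be decoded mechanically. The following is a minimal sketch in Python, not part of the original project software. It assumes that a main data file name is exactly four letters followed by "_M", that a species list name is exactly the three-letter SID followed by "_S", and that any file extension can be ignored. The function name and the example file names are hypothetical.

```python
import re

# Assumed patterns: SID (3 letters) + data set letter + "_M" for main files,
# and SID + "_S" for species lists. Real names may carry other suffixes.
MAIN_FILE = re.compile(r"^([A-Z]{3})([A-Z])_M$")
SPECIES_FILE = re.compile(r"^([A-Z]{3})_S$")

def parse_data_file_name(name):
    """Return the SID, data set letter, and kind of file, or None if unrecognized."""
    stem = name.rsplit(".", 1)[0].upper()  # drop an extension such as .DBF or .TXT
    match = MAIN_FILE.match(stem)
    if match:
        return {"sid": match.group(1), "dataset": match.group(2), "kind": "main"}
    match = SPECIES_FILE.match(stem)
    if match:
        return {"sid": match.group(1), "dataset": None, "kind": "species_list"}
    return None

# Hypothetical file names, used only to show the decoding.
main_info = parse_data_file_name("ABCD_M.DBF")
species_info = parse_data_file_name("abc_s")
other_info = parse_data_file_name("README")
```

Files that follow neither pattern return None, so they can be listed and inspected by hand.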
In the following discussion, columns in a table are referred to as
"fields" and rows are referred to as "records" in accordance with the
convention used by Dbase.
Much of the data was collected on a regular schedule, at intervals
ranging from monthly to yearly. Each collection of data, corresponding
to a particular month, quarter, or year, was assigned a collection ID
(CID), unique to that particular data set. For example, if quarterly
samples were taken, but it took a week to do all the sampling, then all
the samples taken during that week were given the same CID. These
numbers are sequential, but do not necessarily start at 1, and some
numbers may be skipped for various reasons.
Each site was assigned a unique four-letter acronym, which is listed in
Table 1.4 of the final report. Site maps can be found in the final
report in the chapter pertaining to each subproject.
of data files
The arrangement of the data in the tables is designed to permit cross
tabulations, which can be used to arrange the data in any manner desired.
We avoided the use of different columns for the same kind of data, such
as a separate column for each species in which each column contains the
count for that species. Rather, there is one column for the species name,
one column for the count, one column for the CID (or date), and one
column for the site. If a table is required with the CID (or
date) across the top and species on separate lines for time series
analysis, or with species across the top and the date on separate lines
for making a graph, it can be produced by doing a crosstabulation on
the appropriate fields. All species abbreviations and site names start
with a letter, contain only letters, numbers, or underline characters
('_'), and are 8 or fewer characters long, so that they make
legal Dbase field names.
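
As an illustration of the crosstabulation described above, here is a minimal sketch in Python using only the standard library. The field names (SITE, CID, SPECIES, COUNT) and the records are invented for the example; the actual field names differ from file to file and are documented with each data set.

```python
from collections import defaultdict

def crosstab(records, row_field, col_field, value_field):
    """Sum value_field for every (row, column) pair found in records."""
    table = defaultdict(lambda: defaultdict(int))
    for rec in records:
        table[rec[row_field]][rec[col_field]] += rec[value_field]
    return {row: dict(cols) for row, cols in table.items()}

# Invented records in the one-row-per-observation layout described above.
records = [
    {"SITE": "SITA", "CID": 1, "SPECIES": "SPEC_A", "COUNT": 4},
    {"SITE": "SITA", "CID": 2, "SPECIES": "SPEC_A", "COUNT": 6},
    {"SITE": "SITB", "CID": 1, "SPECIES": "SPEC_A", "COUNT": 1},
    {"SITE": "SITA", "CID": 1, "SPECIES": "SPEC_B", "COUNT": 2},
]

# Species on separate lines, CID across the top (time series layout).
species_by_cid = crosstab(records, "SPECIES", "CID", "COUNT")

# CID on separate lines, species across the top (graphing layout).
cid_by_species = crosstab(records, "CID", "SPECIES", "COUNT")
```

The same records serve both layouts, which is the reason for keeping one record per observation rather than one column per species.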
Missing data are indicated by a -1 for a numeric field or a blank for
a character field, unless otherwise indicated. Logical fields are all
false by default. If an animal was specifically looked for and not
found, it is entered as 0 in the appropriate place. Usually, however,
animals that were not found and censused were neither written on the data
sheet nor entered in the file. Whether an animal that is absent from the
file was actually present in the sample depends on the thoroughness and
purpose of the sampling method. However, if none of the target organisms
were recorded in a given sample, at least something is entered, even if
it is a name such as 'empty', to record the fact that that quadrat, core,
root, etc. was sampled.
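
The missing-value conventions above should be applied before any arithmetic is done on the data, otherwise the -1 codes will distort sums and means. The following Python sketch shows one way to do this. It assumes the default convention (-1 for numeric fields, blank for character fields) applies to the field in question; fields documented as "otherwise indicated" need their own handling. The function names are hypothetical.

```python
def clean_numeric(value, missing_code=-1):
    """Return None for the missing-data code, otherwise the value unchanged."""
    return None if value == missing_code else value

def clean_text(value):
    """Return None for a blank character field, otherwise the value unchanged."""
    return None if value.strip() == "" else value

def mean_ignoring_missing(values):
    """Average the values that remain after missing codes are removed."""
    present = [v for v in (clean_numeric(v) for v in values) if v is not None]
    return sum(present) / len(present) if present else None

# A true zero (looked for, not found) is kept; -1 (missing) is dropped.
example_mean = mean_ignoring_missing([4, 0, -1, 2])
```

Note that 0 and -1 carry different meanings: 0 is a real observation, while -1 means that no observation exists.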
Care must be taken when doing a crosstabulation to distinguish
between cells with no data and cells where the total count in the cell
is 0. For example, if a crosstabulation is done to determine the total
number of corals of a given species that were found at each of 12 sites
for 5 different years, and a cell with a count of 0 results, it must be
determined by some other method of analysis whether there were no corals
of that species found or whether that site was not sampled in that year.
Generally, a crosstab with a count of the number of samples will reveal
if a site was not sampled, since there is always at least one entry for
each sampling unit.
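
The distinction between a true zero and an unsampled cell can be sketched in Python as follows. The sketch relies on the rule stated above, that every sampling unit has at least one entry, even if only a placeholder such as 'empty'. The field names and records are invented for the example.

```python
def totals_for_sampled_cells(records, species):
    """Total count of one species for every (site, year) that was sampled.

    Cells that were sampled but held none of the species get 0.
    Cells that were never sampled are left out of the result entirely.
    """
    sampled = set()
    totals = {}
    for rec in records:
        cell = (rec["SITE"], rec["YEAR"])
        sampled.add(cell)  # any entry at all shows that the cell was sampled
        if rec["SPECIES"] == species:
            totals[cell] = totals.get(cell, 0) + rec["COUNT"]
    return {cell: totals.get(cell, 0) for cell in sampled}

# Invented records; 'empty' marks a sample in which nothing was recorded.
records = [
    {"SITE": "SITA", "YEAR": 1987, "SPECIES": "SPEC_A", "COUNT": 3},
    {"SITE": "SITA", "YEAR": 1988, "SPECIES": "empty", "COUNT": 0},
    {"SITE": "SITB", "YEAR": 1987, "SPECIES": "SPEC_B", "COUNT": 2},
]

table = totals_for_sampled_cells(records, "SPEC_A")
```

Here the cell for SITA in 1988 is a true zero, while SITB in 1988 does not appear in the table at all because it has no entries.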
Species names
Names of species, higher taxa, or other categories counted as a species
are given 8-letter abbreviations (except during data entry, when they
are entered as a 1- to 3-letter code and then expanded to the longer
abbreviation by the computer). The data abbreviation (DATAABBR) is the
abbreviation for the name that is actually written on the data sheet.
Because it is not uncommon in this sort of work for the name applied to
a given organism to change as more information is gained, a second name,
the present abbreviation (PRESABBR) is used to indicate the currently
used name. The DATAABBR is never changed - it always matches what is
written on the data sheet. The PRESABBR is updated periodically using
the current name in the species list. The species lists also have the
genus and species and/or other full taxonomic description. The species
lists, then, may have more than one entry for a given PRESABBR, depending
on how many different names were used in the past. For each PRESABBR,
however, only one entry has the same DATAABBR and PRESABBR, and this
entry also has the most recent genus, species, and description.
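
The relationship between DATAABBR and PRESABBR can be sketched in Python as follows. DATAABBR and PRESABBR are the field names used in the species lists; the NAME field, the abbreviations, and the taxon names are invented for the example and stand in for whatever descriptive fields a particular species list contains.

```python
def build_species_lookup(species_list):
    """Return (dataabbr -> presabbr, presabbr -> current full name)."""
    present = {}
    current_name = {}
    for entry in species_list:
        present[entry["DATAABBR"]] = entry["PRESABBR"]
        # The entry whose two abbreviations match carries the current name.
        if entry["DATAABBR"] == entry["PRESABBR"]:
            current_name[entry["PRESABBR"]] = entry["NAME"]
    return present, current_name

def resolve(dataabbr, present, current_name):
    """Translate an abbreviation from a data sheet to the current name."""
    presabbr = present[dataabbr]
    return presabbr, current_name[presabbr]

# Invented species list: one organism recorded under two names over time.
species_list = [
    {"DATAABBR": "OLDNAME", "PRESABBR": "NEWNAME", "NAME": "Genus formername"},
    {"DATAABBR": "NEWNAME", "PRESABBR": "NEWNAME", "NAME": "Genus currentname"},
]

present, current_name = build_species_lookup(species_list)
resolved = resolve("OLDNAME", present, current_name)
```

Looking up either abbreviation leads to the same current name, while the DATAABBR stored with each record still shows what was written on the data sheet.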
All of the data files here have been updated to the name used as
of the time of writing of the final report. In some cases where the
species are few and well known (e.g., urchins and mangroves) this system
is not used. Many files have fields named TAX1, TAX2, GROUP, etc. These
were used during analysis to classify the taxa into various groups, and
the contents changed depending on the needs of the analysis. The changes
were made by looking up the value for one of these fields in the species
list, where taxa are classified according to one or more criteria. The
criteria used can be determined by examining which taxa are grouped
under each code. It should not be assumed that a classification called
TAX1 in a data file corresponds to a classification called TAX1 in the
species list.
Format of data
The data are available in either Dbase III format or as comma-delimited
ASCII files. The format of the
ASCII files is as follows:
- Each record ends with a CR/LF.
- Each field is delimited with a comma.
- Text fields have double quotes (") at the beginning and end of
each field.
- Numbers have no quotes; decimals are included where needed.
- Blank text fields have only two double quotes - "" - delimited
with commas.
- Blank numeric fields have only adjacent comma delimiters.
- Logical fields are formatted as single letters, with T for true
and F for false.
- Date fields are formatted as 8 digits - yyyymmdd.
The following two records contain two text fields followed by an
integer field, a date field (October 8, 1987), a text field, a blank
numeric field, a blank text field, two decimal fields, and a logical
field (false).
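
As an illustration of the format rules, the Python sketch below parses one invented record laid out in the field order just described. The values in the record are made up for the example, apart from the date of October 8, 1987. Because a date such as 19871008 cannot be told apart from an integer by inspection, the sketch assumes that the type of each field is known in advance and passed in as a list; this list and the function names are assumptions, not part of the original files.

```python
import csv
from datetime import date

def parse_field(raw, kind):
    """Convert one raw field to a Python value according to its declared kind."""
    if kind == "text":
        return raw                      # blank text ("") stays an empty string
    if raw == "":
        return None                     # blank numeric field: adjacent commas
    if kind == "int":
        return int(raw)
    if kind == "decimal":
        return float(raw)
    if kind == "logical":
        return raw == "T"               # T for true, F for false
    if kind == "date":
        return date(int(raw[:4]), int(raw[4:6]), int(raw[6:8]))  # yyyymmdd
    raise ValueError("unknown field kind: " + kind)

def parse_record(line, kinds):
    """Split one comma-delimited record and convert each field."""
    fields = next(csv.reader([line.rstrip("\r\n")]))
    if len(fields) != len(kinds):
        raise ValueError("expected %d fields, found %d" % (len(kinds), len(fields)))
    return [parse_field(raw, kind) for raw, kind in zip(fields, kinds)]

# Assumed field types, in the order given in the text above.
KINDS = ["text", "text", "int", "date", "text",
         "decimal", "text", "decimal", "decimal", "logical"]

# Invented record ending in CR/LF.
line = '"SITA","SPEC_A",12,19871008,"X",,"",3.5,0.25,F\r\n'
record = parse_record(line, KINDS)
```

The blank numeric field becomes None and the blank text field becomes an empty string, so the two kinds of blank remain distinguishable after parsing.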
Executive Summary:
Keller, B. D. and J. B. C. Jackson,
eds. 1993. Long-term assessment of the oil spill at Bahia Las Minas,
Panama, synthesis report, volume I: executive summary. OCS Study
MMS 93-0047. U.S. Department of the Interior, Minerals Management
Service, Gulf of Mexico OCS Region, New Orleans, La. 129 pp.
Technical Report:
Keller, B. D. and J. B. C. Jackson,
eds. 1993. Long-term assessment of the oil spill at Bahia Las Minas,
Panama, synthesis report, volume II: technical report. OCS
Study MMS 93-0048. U.S. Department of the Interior, Minerals Management
Service, Gulf of Mexico OCS Region, New Orleans, La. 1017 pp.
Extra copies of the report may be
obtained from the following address:
U.S. Department of the Interior
Minerals Management Service
Gulf of Mexico OCS Region
Public Information Unit (MS 5034)
1201 Elmwood Park Boulevard
New Orleans, La. 70123-2519
Tel: (504)736-2519
This documentation was prepared by:
Karl Kaufmann
Marine ESP Data Manager
Smithsonian Tropical Research Institute
Box 2072, Balboa
Republic of Panama
March 4, 1994, edited February 6, 1998