Data Set Technical Details for the S5 and S6 datasets, 2005-2010
(For O1, 2015-2016, see here)
This page contains technical specifications of released data,
associated with a variety of different sets of information.
In many cases, information contained on this page is not needed to use released
data sets for scientific investigations.
Much of the information on this page is primarily
of interest to LIGO Scientific Collaboration members.
Many of the links here are to internal wikis and web sites which are password
protected - these are indicated by a black diamond (◆).
If you are interested in password protected information, please
contact the GWOSC team.
In many cases, we will be able to provide the requested documentation.
GWOSC data downsampling and repackaging
GWOSC builds files from standard LIGO h(t) frames using
this code for S5/S6.
We have chosen to repackage the data
to make it more accessible to casual users both within the LVC and outside.
- We start with the frame files
(e.g., for S6, from the files in /archive/frames/S6/LDAShoftC02 on the CIT cluster).
However, frame format is unfamiliar to people outside the GW community,
and a "lightweight" frame reader is not readily available and
we don't want to have to support one.
So, we convert to HDF5, to eliminate the need for a frame reader.
HDF5 is a popular format (easily readable in Python, MATLAB, Mathematica, C, ...),
and will be readable for many years.
We also release frame files (repackaged as described below),
in case the user already has frame reading software.
- We re-sample the strain data from 16384 Hz to 4096 Hz.
Almost all LVC searches do this already, in pre-processing,
because of the increased shot noise and the dearth of astrophysical source targets
at higher frequencies.
The data quality is less well studied above 2 kHz, and
the strain calibration is valid only up to 5 kHz.
This resampling reduces the size of our data by a factor of 4
(to 4 TB for S5, all three detectors),
making the downloading easier and easing disk space needs for our users.
- Advanced LIGO data are not calibrated or valid below 10 Hz or above 5 kHz, and the data sampled at 4096 Hz are not valid above 2 kHz. In most searches for astrophysical sources, data below 20 Hz are not used because the noise is too high.
- We use a python wrapping of the LAL routine used by the CBC group,
ResampleREAL8TimeSeries, which applies an acausal downsampling filter.
This resampling has been carefully studied and reviewed by the GWOSC review team.
It "leaks" into the 60 ms before and after each data block.
Assuming the data block in question has valid (passing CAT1 veto) data,
it has a tiny effect on the amplitude of the strain data for those short periods.
- Our HDF5/frame files are of fixed duration (4096 s) and boundaries.
This effectively eliminates the need for users to employ gw-data-find to "find" the data.
GWOSC also presents a user API to get the data and load it into Python,
giving users access to a list of data segments.
This approach is now also adopted for aLIGO frames.
- We have Timelines
and My Sources
to aid the user in finding data (including DQ and HWinj info) from a particular time,
effectively eliminating the need for segDB queries.
From Timeline, you can
see multiple DQ and Injection flags, zoom in, and download segments.
- The DQ and HW Injections are summarized in 1 Hz DQ vectors,
in both the HDF5 and frame files.
This approach is now also adopted for aLIGO frames.
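
As an illustration of how the 1 Hz DQ vectors described in the last item can be used, the sketch below turns such a vector into a list of time segments. It is a minimal example, not the GWOSC implementation: it assumes each 1 Hz sample is an integer bitmask, and the bit positions and flag names (`DATA`, `CAT1`, `HW_INJ`) are hypothetical choices made for this example. The real bit assignments should be read from the data files themselves.

```python
# Hypothetical bit layout, for illustration only; real files define their own.
EXAMPLE_BITS = {"DATA": 0, "CAT1": 1, "HW_INJ": 2}


def segments_with_flag(dq_vector, bit, gps_start=0):
    """Return [start, stop) GPS segments during which `bit` is set.

    dq_vector is a sequence of integers, one per second (1 Hz),
    and gps_start is the GPS time of the first sample.
    """
    segments = []
    seg_start = None
    for i, word in enumerate(dq_vector):
        is_set = (word >> bit) & 1
        if is_set and seg_start is None:
            seg_start = gps_start + i          # a segment opens here
        elif not is_set and seg_start is not None:
            segments.append((seg_start, gps_start + i))  # segment closes
            seg_start = None
    if seg_start is not None:
        # The flag was still set at the end of the vector.
        segments.append((seg_start, gps_start + len(dq_vector)))
    return segments


# Eight seconds of made-up DQ samples starting at an arbitrary time.
example_vector = [0, 1, 3, 3, 1, 0, 3, 3]
cat1_segments = segments_with_flag(
    example_vector, EXAMPLE_BITS["CAT1"], gps_start=1000
)
```

Returning half-open segments keeps adjacent segments from overlapping and makes their durations simple differences.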
This repackaged data is also on our LDG clusters (in /archive/losc) for LVC use.
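
Because the files have a fixed 4096 s duration and fixed boundaries, the file covering a given GPS time can be located with simple arithmetic. The following sketch shows one way to do that, along with the sample counts implied by the 16384 Hz to 4096 Hz resampling. It assumes that file boundaries fall on integer multiples of 4096 GPS seconds; that alignment, the function names, and the example GPS time are assumptions made for illustration, not something this page states.

```python
FILE_DURATION = 4096    # seconds per file (from the text above)
SAMPLE_RATE = 4096      # Hz, after downsampling
ORIGINAL_RATE = 16384   # Hz, before downsampling


def file_start_for(gps_time):
    """GPS start time of the file containing gps_time.

    Assumes boundaries are aligned to multiples of FILE_DURATION.
    """
    whole_seconds = int(gps_time)
    return whole_seconds - whole_seconds % FILE_DURATION


def sample_index(gps_time):
    """Index of the strain sample at gps_time within its file."""
    offset = gps_time - file_start_for(gps_time)
    return int(offset * SAMPLE_RATE)


samples_per_file = FILE_DURATION * SAMPLE_RATE    # samples in one 4096 s file
size_reduction = ORIGINAL_RATE // SAMPLE_RATE     # the "factor of 4" in the text
nyquist_hz = SAMPLE_RATE / 2                      # highest representable frequency

# An arbitrary example time, 100.5 s after an assumed file boundary.
example_gps = 4096 * 200000 + 100.5
example_start = file_start_for(example_gps)
example_index = sample_index(example_gps)
```

The Nyquist frequency of 2048 Hz is consistent with the statement above that data sampled at 4096 Hz are not valid above 2 kHz.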
Notes about the DATA flag
See the Defining the DATA Flag page.
S6 DQ flags
Note that during S6, the CBC group internally used modified DQ category
definitions that were not consistent with the definitions shown on these GWOSC pages.
In particular, the CBC group internally and temporarily
redefined "passing CAT3 checks" to mean
"passing CAT2 checks and not a HW injection"; the old "CAT3" was called "CAT4",
and the old "CAT4" was called "CAT5".
However, S6 publications used the unmodified definitions,
consistent with other LSC publications
and consistent with what was used for S5 DQ flag definitions.
The categories as applied on the
GWOSC site and in GWOSC data files
(described in the
S6 data release pages)
reflect the system described in S5 and in publications, not the
above-described modified DQ category definitions.
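
The renaming described above can be written as a small lookup table, which may help when reading S6-era CBC-internal material alongside GWOSC files. This sketch only restates the text; the dictionary and function names are hypothetical, and it assumes that CAT1 and CAT2 kept the same meaning in the CBC-internal scheme, which the text implies but does not state outright.

```python
# CBC-internal S6 category name -> meaning in the published (GWOSC) scheme.
INTERNAL_TO_PUBLISHED = {
    "CAT1": "CAT1",                          # assumed unchanged
    "CAT2": "CAT2",                          # assumed unchanged
    "CAT3": "CAT2 and not a HW injection",   # no single published equivalent
    "CAT4": "CAT3",                          # the old CAT3 was called CAT4
    "CAT5": "CAT4",                          # the old CAT4 was called CAT5
}


def to_published(internal_name):
    """Translate a CBC-internal S6 category name to the published scheme."""
    return INTERNAL_TO_PUBLISHED[internal_name.upper()]


example = to_published("cat4")
```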
Notes about Instrumental Lines
Below we provide links (mostly internal) that document the provenance of the
information summarized in the GWOSC
S5 Spectral Lines
and S6 Spectral Lines pages.
Below we provide links that document the provenance of the
information on Hardware Injections summarized in the GWOSC
data release pages.
S5 Hardware Injections
- H1 CBC Hardware Injections:
- H2 CBC Hardware Injections:
- L1 CBC Hardware Injections:
- Plots of injections with a "successful" log message:
- Plots of injections without a "successful" log message:
Automatic Log Files (ASCII Text)
Annotated Log Files
S5 Burst Plots
Injections are sorted as "successful" or "failed" based on automatically
generated log messages. Cases where the recovered SNR is not as expected
are marked in the "Flag" and "Note" columns of the annotated log files.
S6 Hardware Injections
S6 Burst Injections
The ASCII lists of injections linked from the S6 Burst HW injection page
are derived from the more detailed lists shown here:
Plots of recovered SNR may be seen at these links:
- Successful Injections: H1
- Failed Injections: H1
The S6 Burst hardware injections were mostly "coherent", meaning they
simulated a signal from a particular sky position. The detailed parameters of the injections
may be found in the
S6 Burst Injection Parameter File.
A small number of differences exist between the Burst injection lists provided by Timeline and
the injection log files. These are detailed
on this Wiki page used for the review of this data set.
S6 CBC Injections
S5 Data Set Review