Parameter tables

Display pandas tables of catalog metadata, detections, point estimates of their parameters, and summaries of posterior samples. This example also serves as a brief tutorial on using the lisacattools.catalog tools.

import pandas as pd

from lisacattools.catalog import GWCatalogs
from lisacattools.catalog import GWCatalogType

Start by loading the main catalog files processed from GBMCMC outputs

catType = GWCatalogType.UCB  # catalog type (UCB or MBH)
catPath = "../../tutorial/data/ucb"  # path to catalog files
catName = "cat15728640_v2.h5"  # name of specific catalog file
catPattern = "/cat**.h5"  # pattern of main catalog file(s) cat[T]_v[i].h5
rejPattern = "/*chains*"  # pattern of chain files cat[T]_v[i]_chains_[j]s.h5

# create catalogs object by searching catPath for files that match the patterns
catalogs = GWCatalogs.create(
    catType, catPath, accepted_pattern=catPattern, rejected_pattern=rejPattern
)

# or create catalogs object for specific file name
catalogs_specified = GWCatalogs.create(catType, catPath, catName)
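
The effect of the accepted and rejected patterns can be illustrated without lisacattools. The sketch below is only an assumption about the selection logic (the library's internal file search is not shown in this tutorial): it applies shell-style wildcards to a hypothetical directory listing, using file names that follow the cat[T]_v[i].h5 and cat[T]_v[i]_chains_[j]s.h5 conventions described above.

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing; README.txt is made up for illustration
filenames = [
    "cat7864320_v3.h5",
    "cat15728640_v2.h5",
    "cat15728640_v2_chains_1100s.h5",
    "README.txt",
]

accepted_pattern = "cat**.h5"  # catPattern without the leading "/"
rejected_pattern = "*chains*"  # rejPattern without the leading "/"

# Keep files that match the accepted pattern but not the rejected one
main_catalogs = [
    name
    for name in filenames
    if fnmatchcase(name, accepted_pattern)
    and not fnmatchcase(name, rejected_pattern)
]
print(main_catalogs)  # ['cat7864320_v3.h5', 'cat15728640_v2.h5']
```

The chain file matches the accepted pattern too, which is why a rejected pattern is needed to keep it out of the list of main catalog files.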

Get metadata of catalogs into DataFrame

catalogs.metadata
            Observation Time        parent           Build Time                                   location
aws_3mo_v3         7864320.0  aws_1p5mo_v3  2020-09-21 18:34:01   ../../tutorial/data/ucb/cat7864320_v3.h5
aws_6mo_v2        15728640.0    aws_3mo_v3  2020-09-22 01:05:24  ../../tutorial/data/ucb/cat15728640_v2.h5


Compare metadata to specified catalog

catalogs_specified.metadata
            Observation Time      parent           Build Time                                   location
aws_6mo_v2        15728640.0  aws_3mo_v3  2020-09-22 01:05:24  ../../tutorial/data/ucb/cat15728640_v2.h5
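
The metadata is an ordinary DataFrame, so the usual pandas operations apply. The sketch below rebuilds a small stand-in table from the values printed above and assumes that Observation Time is given in seconds; the tutorial does not state the unit, but the assumption is consistent with the "3mo" and "6mo" catalog names. The new column name is introduced here for illustration only.

```python
import pandas as pd

# Stand-in for catalogs.metadata, with values copied from the table above
metadata = pd.DataFrame(
    {
        "Observation Time": [15728640.0, 7864320.0],
        "parent": ["aws_3mo_v3", "aws_1p5mo_v3"],
    },
    index=["aws_6mo_v2", "aws_3mo_v3"],
)

# Assumption: observation time is in seconds; convert to days
metadata["Observation Time [days]"] = metadata["Observation Time"] / 86400.0

# Order the releases from shortest to longest observation time
ordered = metadata.sort_values(by="Observation Time")
print(ordered[["Observation Time [days]", "parent"]])
```

Under that assumption the two releases span roughly 91 and 182 days.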


GWCatalogs object

The GWCatalogs object can contain multiple catalogs (e.g. updated releases after more data are analyzed)

# Get list of catalogs' names
names = catalogs.get_catalogs_name()
print(names)

Out:

['aws_3mo_v3', 'aws_6mo_v2']

Select individual catalogs…

# ...by place in list (oldest)
cat_first = catalogs.get_first_catalog()

# ...by place in list (newest)
cat_last = catalogs.get_last_catalog()

# ...by name
cat_6mo = catalogs.get_catalog_by("aws_6mo_v2")

print(cat_first, cat_6mo, cat_last, sep="\n")

Out:

UcbCatalog: aws_3mo_v3 ../../tutorial/data/ucb/cat7864320_v3.h5
UcbCatalog: aws_6mo_v2 ../../tutorial/data/ucb/cat15728640_v2.h5
UcbCatalog: aws_6mo_v2 ../../tutorial/data/ucb/cat15728640_v2.h5

GWCatalog object

Once an individual catalog is selected, explore some of the metadata it contains

# Select the 6-month release
catalog = catalogs.get_catalog_by("aws_6mo_v2")

Get list of detections in catalog

detections_list = catalog.get_detections()
N_detections = len(detections_list)
print("\nlist of 1st 10 detections ({} total):\n".format(N_detections))
print(*detections_list[:10], sep="\n")

Out:

list of 1st 10 detections (6195 total):

LDC0081497609
LDC0081535331
LDC0081547837
LDC0081595365
LDC0081697901
LDC0081782737
LDC0081731061
LDC0081720817
LDC0081805652
LDC0081853812

Get list of attributes for each detection in catalog object

list_of_attributes = catalog.get_attr_detections()
print("list of attributes:\n")
print(*list_of_attributes, sep="\n")

Out:

list of attributes:

SNR
evidence
segment
chain file
parent
Frequency
Frequency Derivative
Amplitude
Ecliptic Longitude
coslat
cosinc
Initial Phase
Polarization
Ecliptic Latitude
Inclination

Get DataFrame of all detections, sorted by SNR

detections_df = catalog.get_dataset("detections")
detections_df.sort_values(by="SNR", ascending=False)
SNR evidence segment chain file parent Frequency Frequency Derivative Amplitude Ecliptic Longitude coslat cosinc Initial Phase Polarization Ecliptic Latitude Inclination
name
LDC0092117281 969.35700 1.000000 1130 cat15728640_v2_chains_1100s.h5 LDC0092117278 0.009212 4.894927e-15 6.260013e-22 4.942600 0.083255 0.926526 0.542133 4.059177 -0.083351 -1.185071
LDC0034068418 853.52200 1.000000 417 cat15728640_v2_chains_400s.h5 LDC0034068597 0.003407 6.593026e-17 1.561400e-21 3.996123 0.190280 0.439688 2.707252 1.368857 -0.191448 -0.455252
LDC0094454855 778.45000 1.000000 1159 cat15728640_v2_chains_1100s.h5 LDC0094454889 0.009445 7.405489e-15 5.187787e-22 5.223261 0.192633 0.864661 0.591868 0.523694 -0.193845 -1.044475
LDC0061655932 579.42300 1.000000 756 cat15728640_v2_chains_700s.h5 LDC0061655797 0.006166 2.126872e-16 3.393720e-22 1.716624 -0.362513 0.968030 0.023477 1.639906 0.370963 -1.317255
LDC0072193620 553.17800 1.000000 886 cat15728640_v2_chains_800s.h5 LDC0072193586 0.007219 2.943192e-15 5.601691e-22 3.820800 -0.576127 0.835999 0.522805 3.888065 0.613982 -0.989951
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
LDC0060387159 4.83844 0.765490 741 cat15728640_v2_chains_700s.h5 0.006039 -5.360429e-16 5.122011e-24 4.846419 -0.256020 0.710923 0.371578 3.419881 0.258903 -0.790810
LDC0002826892 4.07203 0.595023 33 cat15728640_v2_chains_0s.h5 0.000283 3.155091e-20 1.097571e-21 2.344148 -0.601423 -0.647226 1.797817 4.450372 0.645281 0.703939
LDC0002395231 3.58938 0.723118 28 cat15728640_v2_chains_0s.h5 0.000240 2.579484e-20 1.296201e-21 3.101727 0.760541 0.968981 1.525498 2.810456 -0.864145 -1.321072
LDC0032655469 3.43374 0.611254 400 cat15728640_v2_chains_400s.h5 0.003266 1.786966e-16 9.726430e-24 4.475321 -0.311603 -0.362740 2.702549 2.426660 0.316880 0.371206
LDC0002576555 1.87563 0.792332 30 cat15728640_v2_chains_0s.h5 LDC0002576429 0.000258 4.778399e-20 7.391925e-22 3.791475 0.326491 0.569698 1.860063 1.893807 -0.332588 -0.606138

6195 rows × 15 columns
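
Because the detections are returned as a DataFrame, they can be filtered as well as sorted. The sketch below uses a four-row stand-in built from rows of the table above; the SNR threshold is a hypothetical value chosen only for illustration. Note that sort_values returns a new DataFrame and leaves the original unchanged.

```python
import pandas as pd

# Stand-in for catalog.get_dataset("detections"), a few rows and columns only
detections = pd.DataFrame(
    {
        "SNR": [4.83844, 969.357, 1.87563, 853.522],
        "evidence": [0.765490, 1.0, 0.792332, 1.0],
        "Frequency": [0.006039, 0.009212, 0.000258, 0.003407],
    },
    index=pd.Index(
        ["LDC0060387159", "LDC0092117281", "LDC0002576555", "LDC0034068418"],
        name="name",
    ),
)

snr_threshold = 7.0  # hypothetical cut, not a value used by the catalog

# Boolean mask keeps the loud sources; sort_values returns a sorted copy
loud = detections[detections["SNR"] > snr_threshold].sort_values(
    by="SNR", ascending=False
)
print(loud)
```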



Individual sources

Select sources based on their observed properties

# catalog.get_median_source() returns pandas DataFrame of detection metadata
median_snr_source = catalog.get_median_source("SNR")
median_f0_source = catalog.get_median_source("Frequency")
median_A_source = catalog.get_median_source("Amplitude")

pd.concat([median_snr_source, median_A_source, median_f0_source]).style
  SNR evidence segment chain file parent Frequency Frequency Derivative Amplitude Ecliptic Longitude coslat cosinc Initial Phase Polarization Ecliptic Latitude Inclination
name                              
LDC0032278111 22.607400 1.000000 395 cat15728640_v2_chains_300s.h5 0.003228 -0.000000 0.000000 4.847894 0.273976 0.869589 2.712843 4.439019 -0.277525 -1.054370
LDC0031947065 29.374100 1.000000 391 cat15728640_v2_chains_300s.h5 0.003195 -0.000000 0.000000 1.717829 -0.271190 -0.682377 0.167233 3.289386 0.274630 0.751009
LDC0042502600 14.741000 1.000000 521 cat15728640_v2_chains_500s.h5 0.004250 0.000000 0.000000 4.674222 -0.231138 -0.308204 2.205522 0.071716 0.233247 0.313304
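
One plausible way to pick a "median" source is to take the row whose value lies closest to the column median. The helper below, pick_median_row, is a hypothetical illustration of that idea in plain pandas and is not taken from lisacattools; get_median_source may be implemented differently. The toy table reuses the three sources shown above.

```python
import pandas as pd


def pick_median_row(df, column):
    """Return the row whose value in `column` is closest to the column median."""
    distance = (df[column] - df[column].median()).abs()
    # Double brackets keep the result a one-row DataFrame rather than a Series
    return df.loc[[distance.idxmin()]]


# Toy table with values copied from the three sources above
toy = pd.DataFrame(
    {
        "SNR": [22.6074, 29.3741, 14.7410],
        "Frequency": [0.003228, 0.003195, 0.004250],
    },
    index=pd.Index(
        ["LDC0032278111", "LDC0031947065", "LDC0042502600"], name="name"
    ),
)

median_snr = pick_median_row(toy, "SNR")
print(median_snr.index[0])  # LDC0032278111
```

Returning a one-row DataFrame keeps the result compatible with pd.concat and with the .index[0] lookup used to obtain a source ID.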


Now pick a single source and investigate its parameters

# Choose median SNR source for deeper analysis
sourceID = median_snr_source.index[0]

# get list of samples attributes
sample_attr = catalog.get_attr_source_samples(sourceID)
print(
    "\nlist of posterior sample parameters for source {}:\n".format(sourceID)
)
print(*sample_attr, sep="\n")

Out:

list of posterior sample parameters for source LDC0032278111:

Frequency
Frequency Derivative
Amplitude
Ecliptic Longitude
coslat
cosinc
Initial Phase
Polarization
SNR
entry match
waveform measure
Ecliptic Latitude
Inclination

Get all posterior samples in a pandas DataFrame, dropping redundant columns

samples = catalog.get_source_samples(sourceID).drop(
    ["coslat", "cosinc"], axis=1
)
samples.describe().loc[["mean", "std", "25%", "50%", "75%"]]
         Frequency  Frequency Derivative     Amplitude  Ecliptic Longitude  Initial Phase  Polarization        SNR  entry match  waveform measure  Ecliptic Latitude  Inclination
mean  3.227813e-03          2.537236e-16  4.346904e-23            4.821552       1.331605      3.173312  22.966320     0.830781         46.475097          -0.303145    -0.511312
std   3.311320e-08          1.929476e-16  1.344294e-23            0.128246       0.903962      1.802685   2.812526     0.049952         13.995506           0.198826     0.273521
25%   3.227790e-03          8.479440e-17  3.277325e-23            4.732455       0.345616      1.659570  21.025400     0.802787         36.877950          -0.438911    -0.648070
50%   3.227811e-03          2.556485e-16  4.240005e-23            4.818054       1.616651      3.156995  22.853300     0.836049         44.887100          -0.336942    -0.434852
75%   3.227832e-03          4.175120e-16  5.249907e-23            4.899133       1.907742      4.744525  24.854100     0.866010         54.229050          -0.229548    -0.312153
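
The quartiles reported by describe() are one kind of posterior summary; other quantiles are available through DataFrame.quantile. The sketch below computes a central 90% interval on synthetic, normally distributed samples. The sample values are random stand-ins generated here, not output from the catalog, and the choice of a 90% interval is only an example.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for catalog.get_source_samples(sourceID)
rng = np.random.default_rng(seed=0)
toy_samples = pd.DataFrame(
    {
        "Frequency": rng.normal(3.2278e-3, 3.3e-8, size=5000),
        "SNR": rng.normal(23.0, 2.8, size=5000),
    }
)

# 5%, 50% and 95% quantiles of each parameter
interval = toy_samples.quantile([0.05, 0.5, 0.95])

# Width of the central 90% interval for each parameter
width = interval.loc[0.95] - interval.loc[0.05]
print(interval)
print(width)
```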


It is easy to pick out a subset of parameters (i.e. marginalize over all others)

# get subset (i.e. marginalized) of samples
parameters = [
    "Frequency",
    "Amplitude",
    "Ecliptic Longitude",
    "Ecliptic Latitude",
    "Inclination",
]
marginalized_samples = catalog.get_source_samples(sourceID, parameters)
marginalized_samples.describe().loc[["mean", "std", "25%", "50%", "75%"]]
         Frequency     Amplitude  Ecliptic Longitude  Ecliptic Latitude  Inclination
mean  3.227813e-03  4.346904e-23            4.821552          -0.303145    -0.511312
std   3.311320e-08  1.344294e-23            0.128246           0.198826     0.273521
25%   3.227790e-03  3.277325e-23            4.732455          -0.438911    -0.648070
50%   3.227811e-03  4.240005e-23            4.818054          -0.336942    -0.434852
75%   3.227832e-03  5.249907e-23            4.899133          -0.229548    -0.312153
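
For posterior samples stored as rows, marginalizing over a parameter amounts to ignoring its column, which is why the summary values above match the corresponding columns of the full table. The sketch below demonstrates this with synthetic samples in plain pandas; the numbers are random stand-ins, and the column selection is an assumed equivalent of passing a parameter list to get_source_samples.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the full set of posterior samples
rng = np.random.default_rng(seed=1)
n_samples = 2000
toy_samples = pd.DataFrame(
    {
        "Frequency": rng.normal(3.2278e-3, 3.3e-8, size=n_samples),
        "Amplitude": rng.normal(4.3e-23, 1.3e-23, size=n_samples),
        "Polarization": rng.uniform(0.0, np.pi, size=n_samples),
    }
)

# Marginalize over Polarization by keeping only the other columns
parameters = ["Frequency", "Amplitude"]
marginalized = toy_samples[parameters]

full_summary = toy_samples.describe()
marg_summary = marginalized.describe()

# The marginal summaries agree with the matching columns of the full summary
print(np.allclose(marg_summary.values, full_summary[parameters].values))
```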

