Parameter tables¶
Display pandas tables of catalog metadata, detections, point estimates for their parameters, and summaries of posterior samples. This example also serves as a brief tutorial for using the lisacattools.catalog tools.
import pandas as pd
from lisacattools.catalog import GWCatalogs
from lisacattools.catalog import GWCatalogType
Start by loading the main catalog files processed from GBMCMC outputs
catType = GWCatalogType.UCB # catalog type (UCB or MBH)
catPath = "../../tutorial/data/ucb" # path to catalog files
catName = "cat15728640_v2.h5" # name of specific catalog file
catPattern = "/cat**.h5" # pattern of main catalog file(s) cat[T]_v[i].h5
rejPattern = "/*chains*" # pattern of chain files cat[T]_v[i]_chains_[j]s.h5
# create catalogs object by searching for specifically-named files
catalogs = GWCatalogs.create(
catType, catPath, accepted_pattern=catPattern, rejected_pattern=rejPattern
)
# or create catalogs object for specific file name
catalogs_specified = GWCatalogs.create(catType, catPath, catName)
Get metadata of catalogs into DataFrame
catalogs.metadata
Compare metadata to specified catalog
catalogs_specified.metadata
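A quick pandas check that the explicitly named catalog is among the releases found by the pattern search (a sketch, assuming both metadata frames are indexed by catalog name)
# check that the explicitly named catalog appears in the pattern-searched set
# (assumes the metadata DataFrames are indexed by catalog name)
print(catalogs_specified.metadata.index.isin(catalogs.metadata.index))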
GWCatalogs object¶
The GWCatalogs object can contain multiple catalogs (e.g. updated releases after more data are analyzed)
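List the catalogs by name (a minimal sketch, assuming the get_catalogs_name() accessor on GWCatalogs)
# list the names of all catalogs held by the GWCatalogs object
# (assumes GWCatalogs exposes a get_catalogs_name() accessor)
print(catalogs.get_catalogs_name())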
Out:
['aws_3mo_v3', 'aws_6mo_v2']
Select individual catalogs…
# ...by place in list (oldest)
cat_first = catalogs.get_first_catalog()
# ...by place in list (newest)
cat_last = catalogs.get_last_catalog()
# ...by name
cat_6mo = catalogs.get_catalog_by("aws_6mo_v2")
print(cat_first, cat_6mo, cat_last, sep="\n")
Out:
UcbCatalog: aws_3mo_v3 ../../tutorial/data/ucb/cat7864320_v3.h5
UcbCatalog: aws_6mo_v2 ../../tutorial/data/ucb/cat15728640_v2.h5
UcbCatalog: aws_6mo_v2 ../../tutorial/data/ucb/cat15728640_v2.h5
GWCatalog object¶
Once an individual catalog is selected, explore some of the metadata it contains
# Select the 6-months release
catalog = catalogs.get_catalog_by("aws_6mo_v2")
Get list of detections in catalog
detections_list = catalog.get_detections()
N_detections = len(detections_list)
print("\nlist of 1st 10 detections ({} total):\n".format(N_detections))
print(*detections_list[:10], sep="\n")
Out:
list of 1st 10 detections (6195 total):
LDC0081497609
LDC0081535331
LDC0081547837
LDC0081595365
LDC0081697901
LDC0081782737
LDC0081731061
LDC0081720817
LDC0081805652
LDC0081853812
Get list of attributes for each detection in catalog object
list_of_attributes = catalog.get_attr_detections()
print("list of attributes:\n")
print(*list_of_attributes, sep="\n")
Out:
list of attributes:
SNR
evidence
segment
chain file
parent
Frequency
Frequency Derivative
Amplitude
Ecliptic Longitude
coslat
cosinc
Initial Phase
Polarization
Ecliptic Latitude
Inclination
Get DataFrame of all detections, sorted by SNR
detections_df = catalog.get_dataset("detections")
detections_df.sort_values(by="SNR", ascending=False)
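A quick look at the loudest sources using ordinary pandas slicing (a sketch; the column names are taken from the attribute list above)
# preview the 10 loudest detections, keeping a few representative columns
loudest = detections_df.sort_values(by="SNR", ascending=False)
loudest[["SNR", "Frequency", "Amplitude"]].head(10)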
Individual sources¶
Select sources based on their observed properties
# catalog.get_median_source() returns pandas DataFrame of detection metadata
median_snr_source = catalog.get_median_source("SNR")
median_f0_source = catalog.get_median_source("Frequency")
median_A_source = catalog.get_median_source("Amplitude")
pd.concat([median_snr_source, median_A_source, median_f0_source]).style
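Sources can also be selected with ordinary pandas boolean indexing on the detections DataFrame, e.g. everything above an SNR threshold (the threshold here is illustrative)
# select all detections louder than an illustrative SNR threshold
loud_sources = detections_df[detections_df["SNR"] > 100]
print("{} detections with SNR > 100".format(len(loud_sources)))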
Now pick a single source and investigate its parameters
# Choose median SNR source for deeper analysis
sourceID = median_snr_source.index[0]
# get list of samples attributes
sample_attr = catalog.get_attr_source_samples(sourceID)
print(
"\nlist of posterior sample parameters for source {}:\n".format(sourceID)
)
print(*sample_attr, sep="\n")
Out:
list of posterior sample parameters for source LDC0032278111:
Frequency
Frequency Derivative
Amplitude
Ecliptic Longitude
coslat
cosinc
Initial Phase
Polarization
SNR
entry match
waveform measure
Ecliptic Latitude
Inclination
Get all posterior samples in a pandas DataFrame, dropping redundant columns
samples = catalog.get_source_samples(sourceID).drop(
["coslat", "cosinc"], axis=1
)
samples.describe().loc[["mean", "std", "25%", "50%", "75%"]]
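Credible intervals follow directly from the samples with pandas quantiles (the 90% interval below is illustrative)
# illustrative 90% credible interval (plus median) on the source frequency
samples["Frequency"].quantile([0.05, 0.5, 0.95])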
It is easy to pick out a subset of parameters (i.e. marginalize over all others)
# get subset (i.e. marginalized) of samples
parameters = [
"Frequency",
"Amplitude",
"Ecliptic Longitude",
"Ecliptic Latitude",
"Inclination",
]
marginalized_samples = catalog.get_source_samples(sourceID, parameters)
marginalized_samples.describe().loc[["mean", "std", "25%", "50%", "75%"]]
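Correlations between the selected parameters can be read off the same samples (a sketch using the pandas corr method)
# sample correlations between the selected (marginalized) parameters
marginalized_samples.corr()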
Total running time of the script: ( 0 minutes 0.867 seconds)