-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathproject.tex
More file actions
63 lines (32 loc) · 5.86 KB
/
project.tex
File metadata and controls
63 lines (32 loc) · 5.86 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
\documentclass[a4paper]{article}
%% Language and font encodings
\usepackage[english]{babel}
\usepackage[utf8x]{inputenc}
\usepackage[T1]{fontenc}
%% Sets page size and margins
\usepackage[a4paper,top=3cm,bottom=2cm,left=3cm,right=3cm,marginparwidth=1.75cm]{geometry}
%% Useful packages
\usepackage{amsmath}
\usepackage{graphicx}
\usepackage[colorinlistoftodos]{todonotes}
\usepackage[colorlinks=true, allcolors=blue]{hyperref}
\title{Using machine learning to identify subclasses of known glitches in LIGO strain data}
\author{Melissa Kohl}
\date{May 15, 2018}
\begin{document}
\maketitle
\section{Introduction}
In the Advanced LIGO observation runs, detection of gravitational waves has occurred for multiple black hole mergers and one neutron star collision \cite{Zevin:2016}. Although the sensitivity of the detectors is enough to detect gravitational waves from these huge astronomical events, gravitational waves are also created from other astronomical events. To detect those gravitational waves, the sensitivity of the detectors needs to be better. The strain data from both the science and observation runs of the Advanced LIGO detectors contain "glitches" that can mimic gravitational waves and obscure real gravitational waves \cite{Zevin:2016}. If these glitches were eliminated, the sensitivity of the detectors would improve, making detection of gravitational waves from astronomical events other than black hole and neutron star mergers (including smaller black hole or neutron merger events) possible.
\section{Classification of Glitches}
To eliminate glitches, they must first be identified. The easiest way to identify a glitch is by looking at the same time frame at another detector that is in a different geographic location. Although the glitches can be eliminated from the data, they then must be classified so the source of the glitch can be identified and eliminated for future observing runs \cite{Mukherjee:2010}.
A group of scientists created a machine learning software package called GravitySpy to aid in classifying the monstrous amounts of glitches \cite{Zevin:2016}. Unlike previous machine learning techniques used on LIGO data that compared the waveforms of the glitches \cite{Mukherjee:2010}, GravitySpy's neural network uses spectrograms from four different time frames to create a multi-layer network that utilizes image classification techniques. Since different types of glitches have different durations, the multiple views not only provide complementary data across time frames, but also allow for identification of a broader group of glitches of different durations \cite{Bahaadini:2017}.
GravitySpy is great at classifying glitches into known classifications, but has trouble identifying new classifications \cite{Zevin:2016}. The output function in GravitySpy's neural network is \textit{softmax} \cite{Bahaadini:2017}, which essentially just classifies the input glitch into the classification with the highest correlation, regardless of how high that correlation is. In typical LIGO Collaboration-style, the task of designating new classifications is therefore largely reliant on volunteers. To date, volunteers have identified two new glitch classifications \cite{Zevin:2016}. Although human sorting is still quite inefficient, even with a large amount of people, GravitySpy is overall largely successful in classifying glitches.
\section{Identification of Sources of Glitches}
Classification of the glitches themselves is only one half of the problem with glitches in the strain data. Once glitches are classified, the source must be identified so the glitches can be eliminated from future observing runs. A fair amount of glitches have been eliminated by locating the source of the glitch, but many glitches, including the most frequent glitch, referred to as a "blip," still have unidentified sources.
Each LIGO detector has hundreds of channels of information keeping track of environmental and instrumental data that could contain the sources of glitches. LIGO collaborators have used machine learning to match the waveforms of the glitches in the strain data to waveforms in the individual channels \cite{Zevin:2016}, but there are too many channels for this technique to efficiently find glitch sources.
\section{Project Idea}
One avenue for possibly identifying the sources of frequently-occurring glitches, such as blip glitches and scratchy glitches, is to further classify each glitch into a subclass, or family, of the larger classification. For example, general blip glitches are easily identified and classified through GravitySpy, but each blip glitch could be very different from the ones before it, and the large amount of blip glitches makes it difficult to figure out the source, or possibly multiple sources. While an attribute such as bandwidth could be used to sub-classify glitches within the existing GravitySpy algorithm (possibly utilizing the existing four different time frame spectrograms), there is a considerable number of possible glitch attributes that could be used for sub-classification, and may of them do not readily appear on a spectrogram. This suggests a larger re-working of the input of a training set of a particular classification of glitches to achieve sub-classification, rather than simply adding more possible outputs to the original network.
Once a training set is built and put through a network to sub-classify a specific type of glitch, such as blips or scratchies, there are multiple steps forward. One option is to further analyze the subclass and compare with auxiliary channels to try to find a source. Another option is to incorporate the sub-classification into the main GravitySpy network so that sub-classification occurs immediately for that type of glitch. Fortunately, both directions benefit the goal of understanding, classifying, and eliminating glitches in the LIGO strain data.
\bibliography{references}
\bibliographystyle{ieeetr}
\end{document}