This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Detecting Weak Spectral Lines in Interferometric Data through Matched Filtering

Ryan A. Loomis Harvard-Smithsonian Center for Astrophysics, Cambridge, MA 02138 Karin I. Öberg Harvard-Smithsonian Center for Astrophysics, Cambridge, MA 02138 Sean M. Andrews Harvard-Smithsonian Center for Astrophysics, Cambridge, MA 02138 Catherine Walsh Leiden Observatory, Leiden University, P.O. Box 9531, 2300 RA Leiden, The Netherlands School of Physics and Astronomy, University of Leeds, Leeds LS2 9JT, UK Ian Czekala KIPAC Postdoctoral Fellow Harvard-Smithsonian Center for Astrophysics, Cambridge, MA 02138 Stanford University, 452 Lomita Mall, Stanford, CA, 94305 Jane Huang Harvard-Smithsonian Center for Astrophysics, Cambridge, MA 02138 Katherine A. Rosenfeld Harvard-Smithsonian Center for Astrophysics, Cambridge, MA 02138
Abstract

Modern radio interferometers enable observations of spectral lines with unprecedented spatial resolution and sensitivity. In spite of these technical advances, many lines of interest are still at best weakly detected and therefore necessitate detection and analysis techniques specialized for the low signal-to-noise ratio (SNR) regime. Matched filters can leverage knowledge of the source structure and kinematics to increase sensitivity of spectral line observations. Application of the filter in the native Fourier domain improves SNR while simultaneously avoiding the computational cost and ambiguities associated with imaging, making matched filtering a fast and robust method for weak spectral line detection. We demonstrate how an approximate matched filter can be constructed from a previously observed line or from a model of the source, and we show how this filter can be used to robustly infer a detection significance for weak spectral lines. When applied to ALMA Cycle 2 observations of CH3OH in the protoplanetary disk around TW Hya, the technique yields a \approx53%\% SNR boost over aperture-based spectral extraction methods, and we show that an even higher boost will be achieved for observations at higher spatial resolution. A Python-based open-source implementation of this technique is available under the MIT license at https://github.com/AstroChem/VISIBLE.

software: Astropy (Astropy Collaboration et al., 2013), CASA (McMullin et al., 2007), casa-python, DiskJockey (Czekala et al., 2015; Czekala, 2016), LIME (Brinch & Hogerheijde, 2010), Matplotlib (Hunter, 2007), NumPy (Jones et al., 2001–), SciPy (van der Walt et al., 2011), VISIBLE (Loomis et al., 2018a, b)

1 Introduction

The rich spatio-kinematic information that radio interferometric datasets can provide for molecular spectral lines is crucial for studying the astrophysical and chemical processes occurring in host sources. The broadband capabilities of modern interferometers allow many spectral lines to be observed in a single correlator setup, enabling astronomers to simultaneously trace multiple astrophysical phenomena, or undertake unbiased line surveys to work toward complete molecular inventories (e.g. Jørgensen et al., 2011; Coutens et al., 2016; Müller et al., 2016). Within these datasets, many scientifically interesting lines may be weak or not detected due to low column densities or intrinsically low line strengths. Finding these lines and robustly assessing their strength is key to achieving many science goals.

Resolved interferometric observations pose special challenges to detecting weak spectral lines. Radio interferometers measure visibilities, samples of the Fourier transform of the distribution of emission intensities from an astrophysical source at discrete spatial and spectral frequencies. These visibilities are then Fourier inverted and deconvolved with a routine such as CLEAN (Högbom, 1974) to create an image cube. As shown in Fig. 1, this image cube consists of a series of images (channel maps) of the emission intensity distribution in distinct spectral frequency bins, which correspond to projected radial velocity bins. In the simplest line detection scenario, emission is directly observed in these channel maps.

Refer to caption
Figure 1: A diagram illustrating the multiple ways of viewing an image cube. Counter-clockwise from the top-left: Velocity-integrated moment maps, made by integrating slices of the cube along the frequency axis; channel maps, where each panel corresponds to a channel of the cube; spectra, generated from top to bottom from a single pixel, integrated over an aperture, and integrated using a matched spatio-kinematic mask (dashed red contours in channel maps). The synthesized beam is shown in the lower left of the moment and channel maps.

When emission is too weak to be directly visible in the channel maps, the image cube might be manipulated in a variety of ways to increase SNR. Spectra can be extracted from the cube, and moment maps can be generated by collapsing the cube along the spectral axis, illustrated in Fig. 1. All spectral extraction approaches incorporate a spatial mask. If the source is unresolved, a single pixel-extracted spectrum will contain all available information. In cases where the emission is extended and spatially resolved, the simplest mask that contains all emission is an aperture drawn around the source. Such a mask rarely results in spectra with optimal SNR, however. In sources with complex spatio-kinematic patterns, due to e.g. bulk rotation, emission may ‘move’ across the channel maps. The aperture mask is then larger than the emitting area in any given channel, adding noise to the extracted spectrum. To combat this, a spatio-kinematic mask specifically tailored to the structure of the source may be used to reduce the amount of added noise (e.g. Dutrey et al., 2007; Öberg et al., 2015; Loomis et al., 2015; Yen et al., 2016).

The application of spatio-kinematic masks to spectral image cubes has already enabled new science, but there are both computational and interpretive challenges when attempting to extend this technique to detect weak lines. First, the observed visibilities must be imaged, a non-trivial computational cost for high resolution observations or spectral surveys with large bandwidths. Second, when the visibilities are Fourier inverted, the PSF is oversampled with pixels to reduce imaging artifacts. This introduces a spatial covariance between pixels on the scale of the beam, making statistical interpretations of extracted spectra difficult. Finally, tailored spatio-kinematic masks reduce added noise but sacrifice a meaningful spectral baseline, making robust weak line detection difficult unless more complicated bootstrapping approaches are taken to establish a false positive rate (Barenfeld et al., 2016).

These obstacles can be overcome while retaining the benefits of the spatio-kinematic approach by applying a matched filter directly to the observed visibilities. When the shape of a signal is known, the optimal linear filter for signal extraction is a matched filter, equivalent to the known signal with a normalization constant. Cross-correlating a noisy signal with this filter maximizes the output SNR. This approach is used extensively in digital signal processing; prominent examples include RADAR (e.g. Woodward, 1953; Cumming & Wong, 2005, and references therein), source detection in imaging surveys (e.g. Bertin & Arnouts, 1996; Bertin, 2001; Meillier et al., 2016; Herenz & Wisotzki, 2017; Zackay & Ofek, 2017), gravitational wave detection (e.g. Owen & Sathyaprakash, 1999; Schutz, 1999; Abbott et al., 2016), and exoplanet detection through direct imaging (Ruffio et al., 2017).

Because matched filters are simply cross-correlated with the data, they can be easily applied in the Fourier domain to provide a fast and unbiased approach to weak line detection over broad bandwidths. An image-plane spectral extraction mask reduces unnecessary noise contributions by incorporating estimated spatio-kinematic (and correspondingly, interferometric phase) information into the extracted spectrum. Similarly, matched filtering quantitatively combines both amplitude and phase information of the observed visibilities into a robust detection probability. Line detection directly in the visibilities both avoids the high computational expense of fully imaging wide-bandwidth datasets and retains a straightforward statistical interpretation of detection significance.

In this paper, we describe how to construct and apply a matched filter to interferometric spectral line data and demonstrate the method on observations from the Atacama Large Millimeter/sub-millimeter Array (ALMA). In §2, we provide an overview of matched filtering and detail the steps of the method. In §3, we apply the technique to ALMA Cycle 2 observations from Walsh et al. (2016) of CH3OH in a protoplanetary disk. In §4, we discuss how much SNR boost might be expected for a given dataset, compare the technique to other methods, and suggest applications where matched filtering may prove useful. A summary is given in §5. Formulas to approximate the expected SNR boost are derived in Appendices A and B, derivations of noise covariance matrices for correlated channels are presented in Appendix C, and details of the example filters used in the paper are given in Appendix D.

2 Method

In this section we first present a brief overview of the principles behind matched filtering, introducing the one-dimensional matched filter. The one-dimensional approach is then easily extended to higher dimensional problems such as searching for signals within an image or image cube (e.g. Bertin & Arnouts, 1996; Feng et al., 2017; Herenz & Wisotzki, 2017; White & Padmanabhan, 2017). We present here a novel method to apply matched filters in the native measurement space of interferometric data, the incompletely sampled Fourier (u,v)(u,v) plane. After defining interferometric visibilities and their noise properties, we provide detailed instruction and examples for each of the steps in the method:

  1. 1.

    Generation of a (u,v)(u,v) plane filter which approximates the true emission pattern.

  2. 2.

    Cross-correlation of this filter with the measured visibilities.

  3. 3.

    Spectrum normalization and detection inference.

  4. 4.

    Line stacking (where applicable).

2.1 Matched Filtering

The matched filter can be derived in a number of ways. Here, we introduce its derivation by maximizing the SNR of a signal, but it can equivalently be interpreted as a least squares estimator (see e.g. Schwartz & Shaw, 1975; Vio & Andreani, 2016). In general, a signal 𝒔\bm{s} may be corrupted by additive white noise 𝒗\bm{v}, yielding an observation 𝒙=𝒔+𝒗\bm{x}=\bm{s}+\bm{v}. To maximize the SNR of this signal by applying a linear filter 𝒉\bm{h}, we can first write the SNR (using the definition of signal power/noise power) as

SNR=𝒉𝒔𝒔𝒉𝒉𝑹𝒗𝒉,\mathrm{SNR}=\frac{\bm{h}^{*}\bm{s}\bm{s}^{*}\bm{h}}{\bm{h}^{*}\bm{R_{v}}\bm{h}}, (1)

where denotes the conjugate transpose and 𝑹𝒗=E[𝒗𝒗]\bm{R_{v}}=E[\bm{v}\bm{v}^{*}] is a covariance matrix of the noise 𝒗\bm{v}, where E[]E[~] is the expectation operator. Under these conditions, the filter 𝒉\bm{h} which maximizes SNR is

𝒉=[1𝒔𝑹𝒗1𝒔]𝑹𝒗1𝒔=C𝑹𝒗1𝒔,\bm{h}=[\frac{1}{\sqrt{\bm{s}^{*}\bm{R_{v}}^{-1}\bm{s}}}]\bm{R_{v}}^{-1}\bm{s}=C\bm{R_{v}}^{-1}\bm{s}, (2)

i.e., the original signal 𝒔\bm{s} multiplied by the data weights 𝑹𝒗1\bm{R_{v}}^{-1} and a normalization constant C=1/𝒔𝑹𝒗1𝒔C=1/\sqrt{\bm{s}^{*}\bm{R_{v}}^{-1}\bm{s}} (Woodward, 1953; North, 1963; Cumming & Wong, 2005, e.g.).

A simple application of such a filter is locating a signal within a one dimensional dataset such as an emission spectrum. In this case, a short signal 𝒔\bm{s} of length nsn_{s} is embedded within a longer noisy observed spectrum 𝒙\bm{x} of length nxn_{x}, with the location of 𝒔\bm{s} within 𝒙\bm{x} unknown. As long as the shape of 𝒔\bm{s} is known, a filter kernel 𝒉\bm{h} can be calculated using equation 2 and cross-correlated with 𝒙\bm{x} to locate 𝒔\bm{s}. This cross-correlation is often thought of as a sliding dot product, and yields a one dimensional impulse response spectrum 𝑻\bm{T}, of length nxns+1n_{x}-n_{s}+1. Each element of 𝑻\bm{T} at a position i0i_{0} will then be:

Ti0=i=i0i0+Ns1xihii0;i0[0,nxns].T_{i_{0}}=\sum_{i=i_{0}}^{i_{0}+N_{s}-1}x_{i}h_{i-i_{0}};~~~~i_{0}\in[0,n_{x}-n_{s}]. (3)

This impulse response spectrum loses physical significance that the original observed spectrum held (it is no longer in units of power or flux). It instead encodes the degree of similarity between the observations and the filter at any given point in the observed spectrum. By projecting the noisy observations in this way, the total noise is decreased and the SNR of the signal is increased. Because the filter is linear, the Gaussian nature of the noise is preserved. The impulse response spectrum can be easily examined for evidence of 𝒔\bm{s}, with a detection threshold set to some multiple of the standard deviation of 𝑻\bm{T} (e.g. 4σ\sigma), or a false positive rate scaled with the variance of 𝑻\bm{T}.

2.2 Interferometric Visibilities

Each interferometric visibility, ViV_{i}, is measured as the complex product of the output of the first antenna of a baseline pair in the array with the complex conjugate of the output of the second antenna (see e.g. Thompson et al., 2017). The projected baseline distance between the two antennas then defines the location of the visibility on the (u,v)(u,v) plane. Each visibility ViV_{i} is associated with a unique weight wi=1/σi2w_{i}=1/\sigma_{i}^{2}, where σi2\sigma_{i}^{2} encodes the variance of ViV_{i} (see Appendix C for more details on data weights in interferometric datasets). In addition to being measured at discrete spatial frequencies, the visibilities are also measured at a series of spectral frequencies (channels). Cross-correlation in this discretely sampled three dimensional (u,v,channel)(u,v,channel) space is computationally awkward, but the dataset can be reshaped to a two-dimensional dataset of size (nuv,nc)(n_{uv},n_{c}), where each visibility row corresponds to a unique location on the (u,v)(u,v) plane in units of distance. Both the complex visibilities and their corresponding real weights are stored this way in the UVFITS111The UVFITS format definition can be found in AIPS Memo #114 at http://www.aips.nrao.edu/aipsmemo.html and Measurement Set (MS)222The MS format definition can be found at https://casa.nrao.edu/Memos/229.html formats of the Common Astronomy Software Applications package (CASA).

Transforming between visibility space and image space requires a gridding and deconvolution routine, such as CLEAN, in one direction and a visibility sampling routine in the other direction, such as uvmodel in MIRIAD, or simobserve in CASA. As using the full simobserve task is relatively slow and uvmodel is not easily interfaced with Python, we have written a Python based visibility sampling routine, vis_\textunderscoresample, which is able to interface with CASA MS and UVFITS formats. This package builds on an implementation of the sampling algorithm in the DiskJockey package (Czekala et al., 2015; Czekala, 2016) identical to that used in uvmodel and simobserve and uses the spheroidal gridding function approximations described by Schwab (1984). As identical algorithms and gridding functions are used, output from vis_\textunderscoresample is identical to output from uvmodel and simobserve. 333In addition to its utility for filter kernel generation, we note that vis_\textunderscoresample may be useful for visibility fitting of modern interferometric datasets (e.g. MacGregor et al., 2016; Loomis et al., 2017). vis_\textunderscoresample is publicly available under the MIT license at https://github.com/AstroChem/vis$_$sample or in the Anaconda Cloud at https://anaconda.org/rloomis/vis$_$sample

2.3 Filter Kernel Generation

The principle assumption of a matched filter analysis is that the shape of the signal 𝒔\bm{s} is known, or can be reasonably approximated. In traditional applications, such as RADAR, the outbound signal is user-generated and therefore the exact form is known. In astronomical applications, however, the ideal matched filter kernel is unknown and must be approximated. As it is unknown how closely the filter approximates the true signal, any derived detection significance will be a lower limit. The method is relatively robust to choice of filter, however, as long as the filter is a reasonable approximation of the source spatio-kinematic structure.

We suggest two possible approaches: (1) calculating a kernel from a model of the source (model-driven), or (2) calculating a kernel from prior observations of strong emission lines (data-driven). In both cases the kernel is first constructed in the image plane and then Fourier transformed and visibility sampled to match the (u,v)(u,v) coverage of the observations. The approximated signal 𝒇\bm{f} and the inverted noise covariance matrix 𝑹𝒗𝟏\bm{R_{v}^{-1}} (calculated from the observational data weights, see Appendix C) are then used to compute the full filter kernel, including the normalization prefactor:

𝒉=[1𝒇𝑹𝒗1𝒇]𝑹𝒗1𝒇.\bm{h}=[\frac{1}{\sqrt{\bm{f}^{*}\bm{R_{v}}^{-1}\bm{f}}}]\bm{R_{v}}^{-1}\bm{f}. (4)

Fig. 2 presents three examples of the different filter kernel estimation approaches. First, for objects such as protoplanetary disks or galaxies, the source inclination and position angle are often well-known and a spatio-kinematic model of the gas can be approximated. In the top panels of Fig. 2 we have generated a Keplerian mask for molecular emission from the protoplanetary disk around TW Hya. Alternatively, a more detailed filter kernel can be generated from an astrochemical model of the source, with emission calculated using a radiative transfer code such as RADMC-3D (Dullemond, 2012) or LIME (Brinch & Hogerheijde, 2010). An example is shown in the middle panels of Fig. 2, generated from the parametric CH3OH abundance model in Walsh et al. (2016). Details of the model are presented in Appendix D.

Refer to caption
Figure 2: Three examples of channel maps used to generate filter kernels for emission from the protoplanetary disk around TW Hya. Top: a simple kernel based on Keplerian rotation. Middle: a kernel based on a parametric model of CH3OH from Walsh et al. (2016). Bottom: a data-driven kernel generated from H2CO observations from Öberg et al. (2017). All kernels have 0.2 km s-1 channels and are normalized by their peak intensities.

In the data-driven approach, an assumption is made that an observed molecular transition shares its spatio-kinematic pattern with the desired weak line. The filter will be most effective when the template and target lines have well-matched spatial distributions, e.g. if the two lines are a strong and a weak line, respectively, of the same molecule. In Carney et al. (2017), we used this approach to detect weak H2CO lines in HD 163296, using a stronger H2CO line as a data-driven filter (see §4.1 and §4.4). Similarly, lines of a known species can be used as a filter for an undetected but chemically related molecule that is presumed to be co-spatial. The bottom row panels of Fig. 2 present observations of H2CO around TW Hya (Öberg et al., 2017) which could be used as a filter for CH3OH emission, due to their linked formation pathways (e.g., Cuppen et al., 2009; Qi et al., 2013; Walsh et al., 2014; Loomis et al., 2015). Details of the observations and kernel generation are presented in Appendix D.

2.4 Computing the Impulse Response Function

Refer to caption
Figure 3: A diagram illustrating how the filter kernel is cross-correlated with the data to produce a spectrum of the impulse response to the filter. The kernel is shown on the left, with dimensions of nkn_{k} channels horizontally and nuvn_{uv} visibilities vertically (not shown to scale). Several representative channels of the kernel are shown imaged. The amplitudes of the complex kernel have been binned and pixelated to be visually intuitive. The complex data shown in gray-scale is also binned and pixelated, and has an identical number of visibilities, but ncnkn_{c}\gg n_{k}. The filter is applied to the data as a sliding inner product, and three illustrative regions are shown to visualize the cross-correlation at various points. Within these regions, a stronger red color signifies a stronger correlation with the corresponding kernel value, and the real portion of the response is summed over the entire region to produce the corresponding impulse response for each channel, with the line detected in the central channel.

Fig. 3 schematically diagrams how these filter kernels would be applied to the data to produce an impulse response spectrum. First, the image plane kernel is Fourier transformed and visibility sampled to produce a complex (u,v)(u,v) plane kernel 𝒇\bm{f} of size (nuv,nk)(n_{uv},n_{k}). The inverted noise covariance matrix 𝑹𝒗𝟏\bm{R_{v}^{-1}} is calculated from the data weights 𝒘\bm{w} and combined with 𝒇\bm{f} using Eq. 4 to produce the full filter kernel 𝒉\bm{h}. This kernel is then cross-correlated with the data 𝑽\bm{V}, of size (nuv,nc)(n_{uv},n_{c}). The kernel and the data both have the same number of visibilities, nuvn_{uv}, but different numbers of channels, and the kernel slides through the data along the spectral axis. At each channel, the filter impulse response spectrum 𝑻\bm{T} is calculated by taking the complex inner product of the windowed data with the kernel:

Ti0=i=i0i0+nk1j=0nuvVi,jhii0,j;i0[0,ncnk].T_{i_{0}}=\sum_{i=i_{0}}^{i_{0}+n_{k}-1}\sum_{j=0}^{n_{uv}}V_{i,j}h_{i-i_{0},j};~~~i_{0}\in[0,n_{c}-n_{k}]. (5)

As the data is complex, the filter response spectrum is also complex with a normalized total noise power. Signal power will leak from the real to the imaginary portion of the response, however, if there is a phase misalignment between the sky locations of the filter and the source. Thus if the filter has been properly phase shifted to be aligned with the source, the resultant impulse response spectrum 𝑻\bm{T} can be written as:

Ti0=Re[2i=i0i0+nk1j=0nuvVi,jhii0,j];i0[0,ncnk],T_{i_{0}}=\mathrm{Re}\bigg{[}\sqrt{2}\sum_{i=i_{0}}^{i_{0}+n_{k}-1}\sum_{j=0}^{n_{uv}}V_{i,j}h_{i-i_{0},j}\bigg{]};~~~i_{0}\in[0,n_{c}-n_{k}], (6)

with the factor of 2\sqrt{2} introduced to normalize the noise power in the real portion of the spectrum.

This method of calculating the cross-correlation is conceptually simple, but computationally inefficient. Computing inner products of the windowed data requires either manipulation of the (very large) dataset in memory or non-sequential memory access, preventing speed increases through vectorization.444Using FFT cross-correlation is even slower for typical interferometric dataset sizes. There is no restriction, however, on the order of operations in which the inner products are internally calculated. We use this to our advantage and treat the partial two-dimensional cross-correlation as a series of nuvn_{uv} one dimensional cross-correlations along the spectral axis, yielding nuvn_{uv} individual impulse response curves. The UVFITS and MS data formats store visibilities in a row-major order such that these one-dimensional cross-correlations quickly access data sequentially in memory. The resulting impulse response curves are then summed along the spatial frequency dimension, identical to Eq. 6, but with the order of the summations switched,

Ti0=Re[2j=0nuvi=i0i0+nk1Vi,jhii0,j];i0[0,ncnk],T_{i_{0}}=\mathrm{Re}\bigg{[}\sqrt{2}\sum_{j=0}^{n_{uv}}\sum_{i=i_{0}}^{i_{0}+n_{k}-1}V_{i,j}h_{i-i_{0},j}\bigg{]};~~~i_{0}\in[0,n_{c}-n_{k}], (7)

yielding a final impulse response spectrum identical to that from the sliding window method shown in Fig. 3. The nuvn_{uv} one-dimensional cross-correlations are independent, making parallelization trivial. Using this approach, the full bandwidth of a typical ALMA spectral window (e.g. 3840 channels over a 1 hr integration with 43 antennas) can be filtered very quickly on a desktop (e.g. a few seconds on a quad-core 3.3GHz processor).

2.5 Normalization and Detection Inference

Assessing the probability of a line detection from the filter impulse response requires understanding the noise properties of the response spectrum, which no longer holds the same physical significance as an emission spectrum. The response spectrum at a given frequency now represents how closely the data correspond to the filter, rather than the flux at that frequency. As the filter is linear, uncorrelated thermal noise in the visibilities manifests as Gaussian noise in the filter response. If the data weights are properly calibrated (see Appendix C), the prefactor in Eq. 4 will normalize the filter response such that the spectrum has units of standard deviations (σ\sigma) with a RMS noise level of unity.

In practice, however, we have found that the calculated data weights are often only accurate to \sim20% compared to the actual variance of the visibilities. Thus we strongly recommend comparing the data weights and visibility scatter and recalculating the weights using a task such as statwt in CASA if there is a discrepancy. Alternatively, the filter response itself can be manually normalized by dividing by the standard deviation of the spectrum (excluding any channels with obvious signal).

Additionally, the linear nature of the filter means that any unsubtracted continuum emission will result in a constant offset in the response spectrum. A model of the continuum can either be subtracted in the (u,v)(u,v) plane as a pre-processing step prior to filtering using a task such as uvcontsub in CASA, or be subtracted after filtering by subtracting the mean of signal-free channels in the impulse response spectrum. Subtraction after filtering removes the possibility of small inaccuracies in the (u,v)(u,v) subtracted model, to which the filter will be sensitive. Conversely, subtraction prior to filtering may be more convenient when a large bandwidth is covered for a source with a non-zero spectral index.

Once the response spectrum is normalized and any offsets are removed, peaks can then be evaluated against a detection threshold, set at some number of standard deviations corresponding to an acceptable false alarm rate. We stress, however, that the detection significance is a lower limit, as it is unknown how closely the filter approximates the ideal matched filter.

2.6 Comparison to Image-Plane Spectral Extraction

As a proof of concept, we apply the method to synthetic observations of CH3OH emission in TW Hya and compare to an aperture-based spectral extraction in the image plane. The modeled emission from the middle panels of Fig. 2 is used to generate both the observations (with noise added) and the filter kernel. As this is a true matched filter, it provides a useful benchmark for comparison with the approximate filter results as presented in §3.

Refer to caption
Figure 4: Comparison of the ideal matched filter with conventional spectral extraction through aperture masking. Panel a: Moment-0 map of simulated, noiseless CH3OH 312-303 emission. The synthesized beam is shown in the lower left. Contours are [-3, -1.5, 1.5, 3]×\times3.3 mJy bm-1 km s-1, corresponding to 1σ\sigma in panel b. Panel b: Moment-0 map of simulated and noise-corrupted CH3OH emission. Panel c: Spectrum of the noise-corrupted emission, extracted using an aperture 3\arcsec in diameter. Panel d: Ideal matched filter response to the noisy emission, with units of σ\sigma.

A synthetic measurement set of observations was created from the CH3OH emission cube described in §2.3 by visibility sampling at baselines corresponding to the observations from Walsh et al. (2016) using vis_\textunderscoresample. The complex visibilities were then noise corrupted such that the rms noise was 5 mJy bm-1 across each 0.15 km/s channel, equivalent to the Walsh et al. (2016) observations. The noiseless and noisy measurement sets were imaged in CASA using the CLEAN task, with a CLEAN mask generated from the LIME output emission profile and a circular 1\arcsec FWHM Gaussian taper applied in the Fourier plane to increase the SNR of the images. Only the noiseless measurement set was CLEANed; the noisy measurement set was dirty imaged to prevent bias from over-CLEANing, as the emission is practically at the noise limit in any given channel. Integrated intensity (moment-0) maps of the noiseless and noisy 312-303 transitions are shown in Fig. 4, panels a and b, respectively, and were generated by integrating all channels with emission. No clipping threshold was used. In the noisy case, the moment-0 map rms is \sim3.3 mJy bm-1 km s-1 and the peak integrated flux is \sim13.2 mJy bm-1 km s-1, yielding a SNR of \sim4. A spectrum was extracted from the noisy image cube using an aperture 3\arcsec in diameter, equivalent to the extent of the CH3OH emission (Fig. 4, panel c). The spectrum has a peak flux of \sim11.4 mJy and a rms noise of \sim3.2 mJy, yielding a SNR of \sim3.5σ\sigma. The rms noise level of the noisy spectrum was estimated from all channels without significant emission (i.e., excluding a velocity range of ±\pm 1.5 km s-1 around the systemic velocity of 2.8 km s-1).

We cross-correlate the CH3OH filter kernel with the synthetic observations, as described in §2.4, generating the filter response shown in Fig. 4 panel d. The peak value of the filter response, 5.7σ\sigma, is the maximum SNR extractable from the data and represents a \sim40%\% and \sim60%\% improvement over the moment-0 and spectral detections, respectively. This already corresponds to a factor of 2–3 increase in effective observing time, but as discussed in both §4.1 and Appendices A & B, the level of possible SNR improvements will be higher for data sets that are better-resolved.

2.7 Stacking

Spectral stacking is a common method of SNR improvement for observations of multiple transitions of the same molecule (e.g. Langston & Turner, 2007; Kalenskii & Johansson, 2010; Loomis et al., 2016; Walsh et al., 2016). If the excitation conditions of two or more transitions are similar and their rest frequencies are well known, then the signals can be combined through weighted averaging,

𝑻𝒔=i=0ns𝑻𝒊wi,\bm{T_{s}}=\sum_{i=0}^{n_{s}}\bm{T_{i}}w_{i}, (8)

where the stacked spectrum 𝑻𝒔\bm{T_{s}} is generated by summing nsn_{s} individual spectra 𝑻𝒊\bm{T_{i}} multiplied by weights wiw_{i}, proportional to the SNR of each 𝑻𝒊\bm{T_{i}}. Knowledge of the relative strengths of each transition is therefore important to gain the most signal improvement. Application of a matched filter results in an estimated SNR for each transition, which can be used as a proxy for their relative strengths. The resultant impulse response spectra are then easily stacked to generate an appropriately weighted stacked spectrum.

Refer to caption
Figure 5: Demonstration of line stacking on synthetic CH3OH emission. Panel a-c: Moment-0 maps of the three simulated and noise-corrupted transitions, 211-202, 312-303, and 413-404. Contours are [-3, -1.5, 1.5, 3]×σ\times\sigma, σ\sigma=3.3 mJy bm-1 km s-1. The synthesized beam is shown in the lower left. Panel d-f: Ideal matched filter response spectra. Peak SNRs are 8.4, 5.7, and 4.9σ\sigma. Panel g: Filter response spectrum created by stacking the individual spectra from panels d-f. Each spectrum was weighted by its SNR, and the resultant spectrum has a SNR of 11.6.

To illustrate this process, we have repeated the simulations and filtering described in §2.6 for three CH3OH transitions: 211-202, 312-303, and 413-404, with relative strengths of 1.8:1.3:1.0. Moment-0 maps of the emission from each of these transitions are shown in Fig. 5, panels a, b, and c, with peak integrated fluxes of 11.7, 13.2, and 8.5 mJy km s-1 and corresponding SNRs of 3.5, 4, and 2.6σ\sigma, respectively. The individual filter responses are shown in Fig. 5, panels d, e, and f, with peak SNRs of 8.4, 5.7, and 4.9σ\sigma, respectively. The filter responses were stacked using a weighted average, yielding the spectrum shown in Fig. 5, panel g, with a peak SNR of 11.6σ\sigma. The ratio of the filter responses \sim(1.7:1.2:1) recovers the flux ratio of the input models (1.8:1.3:1.0) fairly well, even though the 211-202 transition appears weaker than would be expected in the imaged data (likely due to random noise fluctuations in the inherently more noisy moment-0 maps). This highlights one of the advantages of applying the matched filter in the Fourier plane.

3 Application to real ALMA data

Refer to caption
Figure 6: CH3OH observations toward TW Hya. Top: CH3OH emission from the 211-202, 312-303, and 413-404 transitions, and all three stacked. Contours are [-3, -1.5, 1.5, 3, 4.5]×σ\times\sigma, σ\sigma=\sim3.6 mJy bm-1 km s-1 for the individual transitions and \sim2.3 mJy bm-1 km s-1 for the stacked image. Bottom: same as top, but for 1 km s-1 velocity bins around the source velocity, showing the disk rotation.

Matched filtering provides clear benefits when the ideal filter kernel is known. To explore its utility when the filter is approximated, we have applied the method to real ALMA Band 7 observations of CH3OH toward TW Hya (project 2013.1.00902.S, P.I. C. Walsh), using all three kernels shown in Fig. 2. Details of the CH3OH observations are presented by Walsh et al. (2016). They reported that the three observed CH3OH transitions (211-202, 312-303, and 413-404) were not conclusively detected in any of the individual data cubes, and therefore only presented the stacked imaging data with a 5.1σ\sigma detection in an aperture extracted spectrum. Moment-0 maps of the three observed CH3OH transitions are presented in Fig. 6, with peak SNRs of 4.3, 4.3, and 2.9σ\sigma, respectively. A stacked moment-0 map is shown on the far right with a peak SNR of 4.8σ\sigma. The lower set of panels in Fig. 6 show binned red and blue-shifted emission, highlighting the disk rotation. Rotation about the principal axis is seen for the stacked emission, and hinted at for two of the individual transitions.

Refer to caption
Figure 7: Filter response spectra for each CH3OH transition. The impulse responses to the Keplerian, CH3OH model, and H2CO filters are shown in green, red, and blue, respectively.

Each of the three filter kernels from Fig. 2 were cross-correlated with the visibilities of each of the observed CH3OH transitions, producing the filter responses shown in Fig. 7. All three filters detect the individual lines and show similar SNR boosts, demonstrating that the method is robust to the choice of filter. The H2CO filter yields the strongest responses for the individual lines (4.3, 6.0, and 3.4σ\sigma, respectively). Stacking these spectra together, the H2CO filter yields a robust detection of 7.8σ\sigma, a 53%\% improvement over the 5.1σ\sigma detection reported in Walsh et al. (2016). The Keplerian and CH3OH model filters produce stacked responses of 7.4σ\sigma and 7.7σ\sigma, respectively.

Refer to caption
Figure 8: Comparison of matched filtering with traditional methods, as in Fig. 4 but at higher spatial resolution. Panel a: Moment-0 map of simulated noiseless CH3OH 312-303 emission. The synthesized beam is shown in the lower left. Contours are [-3, -1.5, 1.5, 3]×\times5.6 mJy bm-1 km s-1, corresponding to 1σ\sigma in panel b. Panel b: Moment-0 map of simulated and noise-corrupted CH3OH emission. Panel c: Spectrum of the noise-corrupted emission, extracted using an aperture 3\arcsec in diameter. Panel d: Ideal matched filter response to the noisy emission. Units are σ\sigma, defined as the rms filter response in channels with no emission.

4 Discussion

We have presented a formulation of matched filtering for interferometric spectral line data and shown that this technique can improve SNR and therefore line detectability in both synthetic and real test cases. We now discuss how much of a SNR boost one might expect for a given dataset, compare to alternative techniques, and suggest potential further applications of this method.

4.1 Factors Affecting SNR Boost

Compared with traditional line detection methods, the matched filter approach offers an improved SNR. The degree of SNR boost depends on both the accuracy of the approximated kernel as well as the specific properties of the data (particularly the spatial resolution). In the synthetic and real data test cases presented in §3, we found that application of a matched filter could increase SNR by \sim60%. The method was found to be similarly effective when applied to real data (\sim53% vs \sim60% boost), demonstrating that it is robust to the choice of approximated filter.

Intuitively, the SNR boost and spatial resolution of the emission should be coupled. By definition, a spatially unresolved signal encodes no spatio-kinematic information, and in this limit the matched filter technique will provide no increase in SNR other than the boost from spectral averaging. As emission is spatially resolved, SNR will decrease roughly with the square of the degree of spatial resolution (source width / beam size), with additional losses due to spatial filtering (see e.g. Crane & Napier, 1986). With appropriate knowledge of the velocity structure, a matched filter essentially negates this effect, and thus the SNR boost scales directly with the spatial resolution of the signal (see Yen et al., 2016, for a detailed image-plane derivation of this SNR boost). Fig. 8 illustrates this effect, with an identical simulation to that shown in Fig. 4, but with a higher spatial resolution of \sim0.\farcs3. The data were noise corrupted to reach a similar \sim4σ\sigma detection in the moment map, although the SNR in the extracted spectrum is now \sim6σ\sigma, highlighting how ineffective moment maps are at high spatial resolutions. The filter response is also now much larger (SNR=23.6σ\sigma), yielding a SNR boost over the aperture extracted spectra of \sim400%, compared to \sim60% in Fig. 4. Similarly, application of matched filtering to higher resolution observations of H2CO (Carney et al., 2017) produced a SNR gain of over 500%, confirming in practice the relationship between SNR boost and spatial resolution.

4.2 Comparison to Other Methods

Recently Matrà et al. (2015) and Marino et al. (2016) introduced an image-plane line detection technique (also independently introduced and formalized by Yen et al., 2016) that provides some similar benefits to the matched filter technique. In their approaches, pixels from a dirty image are adjusted for an assumed velocity offset (from a source model), and the velocity corrected spectra are then stacked. In many ways, this can be seen as an image-plane analog to a Fourier plane matched filter, and it should yield comparable increases in SNR (see Appendix B for more details). This is confirmed by comparing the results of the matched filtering technique to detect H2CO in HD 163296 (Carney et al., 2017) with those obtained on the same dataset by Yen et al. (2016). In both cases, a SNR boost of \sim500% is achieved.

Several subtle differences between matched filtering and pixel stacking, however, may motivate their use in a synergistic fashion. First, application of a matched filter in the uv-plane requires no imaging of the data, and is therefore much faster and more robust than image-plane spectral stacking. Second, the matched filter technique allows for a more accurate emission model than simple Keplerian rotation to be applied to the data (i.e., the spatial distribution of molecular signal can be properly used for weighting). On the other hand, stacking pixels in the image plane makes extracting a flux measurement for the line much simpler and allows for a radial profile to be estimated (Yen et al., 2016). The two techniques could therefore be used sequentially to exploit the benefits of each, with a matched filter first used to quickly identify and confirm line detections in a dataset and pixel stacking then used to better characterize the lines.

4.3 Line Flux Estimation

The main utility of matched filtering when applied to interferometric spectral line data is in the rapid detection of weak lines, rather than their detailed characterization. Once a line is identified, it might be further characterized through careful imaging, spectral stacking, or model-fitting to the visibilities. For the weakest lines, however, detailed characterization will likely require additional observations. Matched filtering provides useful predictive utility when planning these observations, robustly confirming weak lines which might be desirable targets.

In particular, after a weak line is identified, the matched filter method can be used to estimate a line flux if the emission is too weak to be seen directly in the image cube. When a data-driven approach is taken, the responses of the target and template lines to the filter can be compared. If the two lines have a similar emission morphology, the ratio of their responses will be similar to the ratio of their fluxes, with the flux of the strong line being easy to measure.

This can be proven by considering a modified version of equation 1, writing down the SNR using the signal/rms definition:

SNRs=𝒉𝒔𝒔𝒉𝒉𝑹𝒗𝒉\mathrm{SNR_{s}}=\sqrt{\frac{\bm{h}^{*}\bm{s}\bm{s}^{*}\bm{h}}{\bm{h}^{*}\bm{R_{v}}\bm{h}}} (9)

We can treat the two lines as signals 𝒚\bm{y} and 𝒔\bm{s}, where 𝒚\bm{y} differs only from 𝒔\bm{s} by an arbitrary constant α\alpha, i.e. 𝒚=α𝒔\bm{y}=\alpha\bm{s}. The SNR of 𝒚\bm{y} after filter application is:

SNRy=𝒉α𝒔α𝒔𝒉𝒉𝑹𝒗𝒉\mathrm{SNR_{y}}=\sqrt{\frac{\bm{h}^{*}\alpha\bm{s}\alpha\bm{s}^{*}\bm{h}}{\bm{h}^{*}\bm{R_{v}}\bm{h}}} (10)

which reduces to:

SNRy=α𝒉𝒔𝒔𝒉𝒉𝑹𝒗𝒉\mathrm{SNR_{y}}=\alpha\sqrt{\frac{\bm{h}^{*}\bm{s}\bm{s}^{*}\bm{h}}{\bm{h}^{*}\bm{R_{v}}\bm{h}}} (11)
SNRy=αSNRs\mathrm{SNR_{y}}=\alpha~\mathrm{SNR_{s}} (12)

Therefore we can use the filter impulse response ratio to roughly estimate the flux, with the accuracy dependent on the closeness of the filter kernel to the true emission distribution. It is important to note that this estimate will always be a lower limit. An upper limit can additionally be derived from the imaged weak line, bounding the flux measurement. This approach was used by Carney et al. (2017) to determine the flux ratio of multiple detected H2CO lines, enabling them to constrain the H2CO excitation temperature.

4.4 Application to Line Surveys

In addition to aiding the detection of specific known weak lines, interferometric matched filtering provides substantial benefits for the processing of spectral line surveys where source locations and approximate spatio-kinematic structures are known. Imaging the full bandwidth of these large datasets at their native spectral resolution is a time consuming process, often taking many hours or even days. Because much of the information in these datasets is contained in spatio-kinematic patterns of the spectral lines, decreasing spectral resolution through channel averaging is typically not a viable option and can result in signal loss. A choice must therefore be made between imaging only small targeted windows of the broadband dataset, or spending time and computing resources on imaging the full bandwidth. For sparsely populated line surveys (e.g., of protoplanetary disks), imaging the entire data set is inherently inefficient, since most of the channels do not contain signal. Conversely, selective imaging reduces the likelihood of serendipitously detecting weak species, and conflicts with the motivations of an unbiased survey.

Numerous tools have been developed to aid in identifying spectral line emission in broadband datasets from current and future instruments such as ALMA, ASKAP, VLA, SKA, and the ngVLA (e.g., Koribalski, 2012; Whiting, 2012; Whiting & Humphreys, 2012; Friedel et al., 2015; Serra et al., 2015), but these methods often rely on a fully imaged datacube as input. In instances where the locations of the sources being targeted are known a-priori, our presently described method of matched filtering can help streamline this process by quickly and robustly identifying lines in the native visibilities. Then only these lines need be imaged and analyzed. In sources with a single dominant velocity pattern, a strong line could be imaged, converted to a filter kernel, and cross-correlated through the entire dataset in a small fraction of the time it would take to image that same dataset. The resulting full-band impulse response spectrum then provides a convenient first look at the dataset, guiding the observer as to which sections of the data are worth windowing out for further imaging and analysis. In particular, matched filtering will highlight weak lines that the observer would have missed even in a careful CLEAN of the data.

We note that our (u,v)(u,v)-plane method could likely be extended to full 3D searches in blind surveys, where source locations are not known a-priori, but such an implementation is beyond the scope of this paper. It is not immediately clear whether the large speed benefits of (u,v)(u,v)-plane analysis over full survey imaging would be maintained in such a 3D search space, and we encourage further research in this area.

5 Conclusion

We have shown that the technique of matched filtering can be easily implemented for analyzing interferometric observations of spectral lines, significantly improving sensitivity when searching for weak lines. An open-source Python-based implementation is freely available under the MIT license at https://github.com/AstroChem/VISIBLE. As a case study, we have focused on observations of protoplanetary disks with ALMA, but our approach is applicable to spectral line data of any astronomical source with a spatio-kinematic pattern that can be used to generate a filter kernel, and will likely be beneficial for spectral line observations from a wide range of current and future instruments (e.g., the SKA and the ngVLA).

We find that when applied to real data, the method results in large sensitivity increases, ranging from 53% for CH3OH in Walsh et al. (2016), to \sim500% for H2CO in Carney et al. (2017). The degree of sensitivity boost is proportional to the spatial resolution of the observations. These sensitivity increases are equivalent to factors of 2-25 in effective observing time, allowing observers to better leverage limited telescope resources. Additionally, the speed of the technique is beneficial when analyzing large bandwidth line surveys, robustly identifying all lines in a spectrum in a small fraction of the time it would take to image the same dataset. Finally, the method works synergistically with the methods presented in Matrà et al. (2015) and Yen et al. (2016) and tools such as ADMIT, forming a comprehensive suite of analysis techniques for spectral lines in large interferometric datasets.

We would like to thank Andrew Vanderburg and Peter Williams for productive discussions and assistance with software package management and distribution. We also thank the anonymous referee for providing comments that greatly improved the quality of the manuscript. RAL and JH gratefully acknowledge funding from National Science Foundation Graduate Research Fellowships (Grant No. DGE-1144152). RAL also acknowledges funding from the NRAO Student Observing Support Program. KIO acknowledges funding from the Alfred P. Sloan Foundation and the David and Lucile Packard Foundation. CW acknowledges financial support from the Netherlands Organisation for Scientific Research (NWO, grant 639.041.335) and start-up funds from the University of Leeds, UK. The National Radio Astronomy Observatory is a facility of the National Science Foundation operated under cooperative agreement by Associated Universities, Inc. This paper makes use of the following ALMA data: ADS/JAO.ALMA#2013.1.00902.S and ADS/JAO.ALMA#2013.1.00114.S. ALMA is a partnership of ESO (representing its member states), NSF (USA) and NINS (Japan), together with NRC (Canada) and NSC and ASIAA (Taiwan), in cooperation with the Republic of Chile. The Joint ALMA Observatory is operated by ESO, AUI/NRAO and NAOJ.

References

  • Abbott et al. (2016) Abbott, B. P., Abbott, R., Abbott, T. D., et al. 2016, Physical Review Letters, 116, 061102
  • Andrews et al. (2012) Andrews, S. M., Wilner, D. J., Hughes, A. M., et al. 2012, ApJ, 744, 162
  • Astropy Collaboration et al. (2013) Astropy Collaboration, Robitaille, T. P., Tollerud, E. J., et al. 2013, A&A, 558, A33
  • Barenfeld et al. (2016) Barenfeld, S. A., Carpenter, J. M., Ricci, L., & Isella, A. 2016, ApJ, 827, 142
  • Bertin (2001) Bertin, E. 2001, in Mining the Sky, ed. A. J. Banday, S. Zaroubi, & M. Bartelmann, 353
  • Bertin & Arnouts (1996) Bertin, E., & Arnouts, S. 1996, A&AS, 117, 393
  • Briggs (1995) Briggs, D. S. 1995, in Bulletin of the American Astronomical Society, Vol. 27, American Astronomical Society Meeting Abstracts, 1444
  • Brinch & Hogerheijde (2010) Brinch, C., & Hogerheijde, M. R. 2010, A&A, 523, A25
  • Carney et al. (2017) Carney, M. T., Hogerheijde, M. R., Loomis, R. A., et al. 2017, ArXiv e-prints, arXiv:1705.10188
  • Coutens et al. (2016) Coutens, A., Jørgensen, J. K., van der Wiel, M. H. D., et al. 2016, A&A, 590, L6
  • Crane & Napier (1986) Crane, P. C., & Napier, P. J. 1986, in Synthesis Imaging, ed. R. A. Perley, F. R. Schwab, & A. H. Bridle, 87–108
  • Cumming & Wong (2005) Cumming, I. G., & Wong, F. H. 2005, Artech house, 1, 3
  • Cuppen et al. (2009) Cuppen, H. M., van Dishoeck, E. F., Herbst, E., & Tielens, A. G. G. M. 2009, A&A, 508, 275
  • Czekala (2016) Czekala, I. 2016, DiskJockey: Protoplanetary disk modeling for dynamical mass derivation, Astrophysics Source Code Library, , , ascl:1603.011
  • Czekala et al. (2015) Czekala, I., Andrews, S. M., Jensen, E. L. N., et al. 2015, ApJ, 806, 154
  • Dullemond (2012) Dullemond, C. P. 2012, RADMC-3D: A multi-purpose radiative transfer tool, Astrophysics Source Code Library, , , ascl:1202.015
  • Dutrey et al. (2007) Dutrey, A., Henning, T., Guilloteau, S., et al. 2007, A&A, 464, 615
  • Feng et al. (2017) Feng, L., Vaulin, R., Hewitt, J. N., et al. 2017, AJ, 153, 98
  • Friedel et al. (2015) Friedel, D., Looney, L., Xu, L., et al. 2015, in 70th International Symposium on Molecular Spectroscopy
  • Herenz & Wisotzki (2017) Herenz, E. C., & Wisotzki, L. 2017, A&A, 602, A111
  • Högbom (1974) Högbom, J. A. 1974, A&AS, 15, 417
  • Hughes et al. (2011) Hughes, A. M., Wilner, D. J., Andrews, S. M., Qi, C., & Hogerheijde, M. R. 2011, ApJ, 727, 85
  • Hunter (2007) Hunter, J. D. 2007, Computing in Science Engineering, 9, 90
  • Jones et al. (2001–) Jones, E., Oliphant, T., Peterson, P., et al. 2001–, SciPy: Open source scientific tools for Python, , , [Online; accessed ¡today¿]. http://www.scipy.org/
  • Jørgensen et al. (2011) Jørgensen, J. K., Bourke, T. L., Nguyen Luong, Q., & Takakuwa, S. 2011, A&A, 534, A100
  • Kalenskii & Johansson (2010) Kalenskii, S. V., & Johansson, L. E. B. 2010, Astronomy Reports, 54, 295
  • Koribalski (2012) Koribalski, B. S. 2012, PASA, 29, 359
  • Langston & Turner (2007) Langston, G., & Turner, B. 2007, ApJ, 658, 455
  • Loomis et al. (2018a) Loomis, R., Öberg, K., Andrews, S., et al. 2018a, VISIBLE: VISIbility Based Line Extraction, v0.1.0, Zenodo, doi:10.5281/zenodo.1174549. https://doi.org/10.5281/zenodo.1174549
  • Loomis et al. (2018b) —. 2018b, VISIBLE: VISIbility Based Line Extraction, , , ascl:1802.006
  • Loomis et al. (2015) Loomis, R. A., Cleeves, L. I., Öberg, K. I., Guzman, V. V., & Andrews, S. M. 2015, ApJ, 809, L25
  • Loomis et al. (2017) Loomis, R. A., Öberg, K. I., Andrews, S. M., & MacGregor, M. A. 2017, ApJ, 840, 23
  • Loomis et al. (2016) Loomis, R. A., Shingledecker, C. N., Langston, G., et al. 2016, MNRAS, 463, 4175
  • MacGregor et al. (2016) MacGregor, M. A., Lawler, S. M., Wilner, D. J., et al. 2016, ApJ, 828, 113
  • Marino et al. (2016) Marino, S., Matrà, L., Stark, C., et al. 2016, MNRAS, 460, 2933
  • Matrà et al. (2015) Matrà, L., Panić, O., Wyatt, M. C., & Dent, W. R. F. 2015, MNRAS, 447, 3936
  • McMullin et al. (2007) McMullin, J. P., Waters, B., Schiebel, D., Young, W., & Golap, K. 2007, in Astronomical Society of the Pacific Conference Series, Vol. 376, Astronomical Data Analysis Software and Systems XVI, ed. R. A. Shaw, F. Hill, & D. J. Bell, 127
  • Meillier et al. (2016) Meillier, C., Chatelain, F., Michel, O., et al. 2016, A&A, 588, A140
  • Müller et al. (2016) Müller, H. S. P., Belloche, A., Xu, L.-H., et al. 2016, A&A, 587, A92
  • North (1963) North, D. O. 1963, Proceedings of the IEEE, 51, 1016
  • Öberg et al. (2015) Öberg, K. I., Guzmán, V. V., Furuya, K., et al. 2015, Nature, 520, 198
  • Öberg et al. (2017) Öberg, K. I., Guzmán, V. V., Merchantz, C. J., et al. 2017, ApJ, 839, 43
  • Owen & Sathyaprakash (1999) Owen, B. J., & Sathyaprakash, B. S. 1999, Phys. Rev. D, 60, 022002
  • Qi et al. (2013) Qi, C., Öberg, K. I., & Wilner, D. J. 2013, ApJ, 765, 34
  • Qi et al. (2004) Qi, C., Ho, P. T. P., Wilner, D. J., et al. 2004, ApJ, 616, L11
  • Ruffio et al. (2017) Ruffio, J.-B., Macintosh, B., Wang, J. J., et al. 2017, ApJ, 842, 14
  • Schutz (1999) Schutz, B. F. 1999, Classical and Quantum Gravity, 16, A131
  • Schwab (1984) Schwab, F. R. 1984, in Indirect Imaging. Measurement and Processing for Indirect Imaging, ed. J. A. Roberts, 333–346
  • Schwartz & Shaw (1975) Schwartz, M., & Shaw, L. 1975, Signal processing: discrete spectral analysis, detection, and estimation, McGraw-Hill Classic Textbook Reissue (McGraw-Hill). https://books.google.com/books?id=A-lSAAAAMAAJ
  • Serra et al. (2015) Serra, P., Westmeier, T., Giese, N., et al. 2015, MNRAS, 448, 1922
  • Thompson et al. (2017) Thompson, A. R., Moran, J. M., & Swenson, Jr., G. W. 2017, Interferometry and Synthesis in Radio Astronomy, 3rd Edition (Springer), doi:10.1007/978-3-319-44431-4
  • van der Walt et al. (2011) van der Walt, S., Colbert, S. C., & Varoquaux, G. 2011, Computing in Science Engineering, 13, 22
  • Vio & Andreani (2016) Vio, R., & Andreani, P. 2016, A&A, 589, A20
  • Walsh et al. (2014) Walsh, C., Millar, T. J., Nomura, H., et al. 2014, A&A, 563, A33
  • Walsh et al. (2016) Walsh, C., Loomis, R. A., Öberg, K. I., et al. 2016, ApJ, 823, L10
  • White & Padmanabhan (2017) White, M., & Padmanabhan, N. 2017, ArXiv e-prints, arXiv:1705.09669
  • Whiting & Humphreys (2012) Whiting, M., & Humphreys, B. 2012, PASA, 29, 371
  • Whiting (2012) Whiting, M. T. 2012, MNRAS, 421, 3242
  • Woodward (1953) Woodward, P. 1953, Probability and Information Theory: With Applications to Radar, Electronics and Waves No. v. 3 (Elsevier Science & Technology). https://books.google.com/books?id=RA0nAAAAMAAJ
  • Yen et al. (2016) Yen, H.-W., Koch, P. M., Liu, H. B., et al. 2016, ApJ, 832, 204
  • Zackay & Ofek (2017) Zackay, B., & Ofek, E. O. 2017, ApJ, 836, 187

Appendix A Calculating SNR boost for a matched filter

SNR (using the definition of signal-power/noise-power) can be written for an arbitrary signal 𝒔\bm{s} and filter 𝒉\bm{h} as:

SNR=𝒉𝒔𝒔𝒉𝒉𝑹𝒗𝒉.\mathrm{SNR}=\frac{\bm{h}^{*}\bm{s}\bm{s}^{*}\bm{h}}{\bm{h}^{*}\bm{R_{v}}\bm{h}}. (A1)

As discussed in §2.1, North (1963) showed that a linear matched filter of form:

𝒉=[1𝒔𝑹𝒗1𝒔]𝑹𝒗1𝒔=C𝑹𝒗1𝒔,\bm{h}=[\frac{1}{\sqrt{\bm{s}^{*}\bm{R_{v}}^{-1}\bm{s}}}]\bm{R_{v}}^{-1}\bm{s}=C\bm{R_{v}}^{-1}\bm{s}, (A2)

maximizes the output SNR. The natural question is then how much is the SNR boosted by applying such a filter? This can be analytically calculated for a given signal by comparing the SNR after applying a matched filter with the SNR from applying a flat filter 𝟏\bm{1} (e.g. a unity matrix of all ones). We start by calculating the SNR after applying the matched filter:

SNRmf=𝒉𝒔𝒔𝒉𝒉𝑹𝒗𝒉.\mathrm{SNR_{mf}}=\frac{\bm{h}^{*}\bm{s}\bm{s}^{*}\bm{h}}{\bm{h}^{*}\bm{R_{v}}\bm{h}}. (A3)

We then substitute for 𝒉\bm{h}, noting that that 𝑹𝒗1\bm{R_{v}}^{-1} is Hermitian and therefore 𝑹𝒗1=𝑹𝒗1\bm{R_{v}}^{{-1}^{*}}=\bm{R_{v}}^{-1}:

SNRmf=𝑹𝒗1𝒔𝒔𝒔𝑹𝒗1𝒔𝑹𝒗1𝒔𝑹𝒗𝑹𝒗1𝒔.\mathrm{SNR_{mf}}=\frac{\bm{R_{v}}^{-1}\bm{s}^{*}\bm{s}\bm{s}^{*}\bm{R_{v}}^{-1}\bm{s}}{\bm{R_{v}}^{-1}\bm{s}^{*}\bm{R_{v}}\bm{R_{v}}^{-1}\bm{s}}. (A4)

Under the assumption of uncorrelated noise (which is reasonable for the case of independent interferometric visibilities), there are no off-diagonal terms in 𝑹𝒗\bm{R_{v}} and we can reduce this equation to:

SNRmf=iNsi2Rii1,\mathrm{SNR_{mf}}=\sum_{i}^{N}\|s_{i}\|^{2}R_{ii}^{-1}, (A5)

where there are N elements of the signal 𝒔\bm{s} (which can be summed in multiple dimensions or flattened as shown here). Similarly, if we write the SNR of the flat filter as:

SNRflat=𝟏𝒔𝒔𝟏𝟏𝑹𝒗𝟏,\mathrm{SNR_{flat}}=\frac{\bm{1}^{*}\bm{s}\bm{s}^{*}\bm{1}}{\bm{1}^{*}\bm{R_{v}}\bm{1}}, (A6)

then we find it reduces to:

SNRflat=(iNsi)2tr[𝑹𝒗].\mathrm{SNR_{flat}}=\frac{(\sum_{i}^{N}\|s_{i}\|)^{2}}{\mathrm{tr}[\bm{R_{v}}]}. (A7)

So the ratio of these two SNRs, or the total SNR boost from a matched filter, is:

boost=SNRmfSNRflat=(iNsi2Rii1)tr[𝑹𝒗](iNsi)2.\mathrm{boost}=\frac{\mathrm{SNR_{mf}}}{\mathrm{SNR_{flat}}}=\frac{(\sum_{i}^{N}\|s_{i}\|^{2}R_{ii}^{-1})\mathrm{tr}[\bm{R_{v}}]}{(\sum_{i}^{N}\|s_{i}\|)^{2}}. (A8)

Appendix B Calculating SNR boost in comparison to image-plane measurements

The boost value in equation A8 can be analytically calculated for a given filter kernel, but is not particularly useful at this point as it has not been related to the image-plane SNRs discussed throughout the paper. Thus the fundamental problem is how to relate the visibilities to an image-plane SNR. We start by writing down the definition of SNR in the dirty image 𝑰𝑫\bm{I^{D}}, or the raw discrete Fourier transform of the visibilites (i.e. not deconvolved):

SNR=𝑰𝑫𝚫𝑰𝑫=kWkVk(kWk2σk2)1/2,\mathrm{SNR}=\frac{\bm{I^{D}}}{\bm{\Delta I^{D}}}=\frac{\displaystyle\sum_{k}W_{k}V_{k}}{(\displaystyle\sum_{k}W_{k}^{2}\sigma_{k}^{2})^{1/2}}, (B1)

where there are kk visibilities in the dataset, each with a source visibility contribution VkV_{k}, total weight WkW_{k} (including any taper weights, density weights, and the variance weights wkw_{k} discussed in the main text), and thermal noise σk\sigma_{k}. Notation is borrowed from Briggs (1995), which contains a detailed discussion of image-plane SNRs and their relation to the measured visibilities. If the data is gridded into cells {p,q}\{p,q\} and the noise properties of all visibilities within each cell are similar, an approximate form can be written:

SNR=p,qWpqVpq(p,qWpq2σpq2)1/2.\mathrm{SNR}=\frac{\displaystyle\sum_{p,q}W_{pq}V_{pq}}{(\displaystyle\sum_{p,q}W_{pq}^{2}\sigma_{pq}^{2})^{1/2}}. (B2)

In particular, we are interested in the SNR at a particular location in the dirty map, e.g. the peak pixel in a given channel. If this pixel is phase shifted to the map center, the SNR can be written as:

SNR(0,0)=p,qWpq|Vpq|(p,qWpq2σpq2)1/2.\mathrm{SNR}^{\prime}(0,0)=\frac{\displaystyle\sum_{p,q}W_{pq}|V_{pq}|}{(\displaystyle\sum_{p,q}W_{pq}^{2}\sigma_{pq}^{2})^{1/2}}. (B3)

If we consider a moment-0 map of a resolved source, however, the SNR at the center of the moment map is:

SNRmom0=c,p,qWcpqRe(Vcpq)(c,p,qWcpq2σcpq2)1/2,\mathrm{SNR_{mom0}}=\frac{\displaystyle\sum_{c,p,q}W_{cpq}\mathrm{Re}(V_{cpq})}{(\displaystyle\sum_{c,p,q}W_{cpq}^{2}\sigma_{cpq}^{2})^{1/2}}, (B4)

and only the projected real component of each visibility will contribute signal. The SNR will then decrease as the ratio of the emission size to the resolution element increases, as discussed in Crane & Napier (1986) and Yen et al. (2016). Compounding this issue, if the source has a strong spatio-kinematic signature and peak emission moves throughout the dirty map, then the projected real component will vary as a function of channel cc. Applying these properties to Equation A8, we can estimate the SNR boost of the matched filter over a peak moment-0 value as:

boost=SNRmfSNRmom0=(iNVi2Rii1)tr[𝑹𝒗](iNRe(Vi))2.\mathrm{boost}=\frac{\mathrm{SNR_{mf}}}{\mathrm{SNR_{mom0}}}=\frac{(\sum_{i}^{N}\|V_{i}\|^{2}R_{ii}^{-1})\mathrm{tr}[\bm{R_{v}}]}{(\sum_{i}^{N}\mathrm{Re}(V_{i}))^{2}}. (B5)

Aligning the signal in the image plane through pixel shifting and stacking (e.g. as in Matrà et al. (2015) or Yen et al. (2016)) is analogous to phase shifting the individual visibilities to the map center. Re(VpqV_{pq}) can then be replaced by |Vpq||V_{pq}|, and the SNR after applying a pixel shifting method is roughly:

SNRps=SNRmom0=c,p,qWcpq|Vcpq|(c,p,qWcpq2σcpq2)1/2.\mathrm{SNR_{ps}}=\mathrm{SNR^{\prime}_{mom0}}=\frac{\displaystyle\sum_{c,p,q}W_{cpq}|V_{cpq}|}{(\displaystyle\sum_{c,p,q}W_{cpq}^{2}\sigma_{cpq}^{2})^{1/2}}. (B6)

Returning to equation A8 and applying this logic, we can write the boost as:

boost=SNRmfSNRps=(iNVi2Rii1)tr[𝑹𝒗](iN|Vi|)2,\mathrm{boost}=\frac{\mathrm{SNR_{mf}}}{\mathrm{SNR_{ps}}}=\frac{(\sum_{i}^{N}\|V_{i}\|^{2}R_{ii}^{-1})\mathrm{tr}[\bm{R_{v}}]}{(\sum_{i}^{N}|V_{i}|)^{2}}, (B7)

which defines the additional benefit matched filtering provides over a pixel stacking approach.

Equations B4 and B6 can be applied to any visibility sampled filter kernel to calculate these boosts. We note, however, that due to the line-broadening of most astronomical signals, true phase alignment from a pixel shifting approach is not possible and therefore the boost formulas are only approximations.

Appendix C Data weights and noise covariance matrices

C.1 Interferometric data weights

In general, each visibility ViV_{i} in an interferometric data set corresponding to an antenna pair (m,n)(m,n) will have a characteristic variance σmn2\sigma_{mn}^{2}, which is often assumed to be dominated by the system noise (see e.g. Chap. 6 of Thompson et al., 2017). When the system noise dominates, σmn\sigma_{mn} can be written in units of Jy as:

σmn(Jy)=2knqncAeffTsys,mTsys,n2ΔνΔt×1026,\sigma_{mn}(Jy)=\frac{2k}{n_{q}n_{c}A_{eff}}\sqrt{\frac{T_{sys,m}T_{sys,n}}{2\Delta\nu\Delta t}}\times 10^{26}, (C1)

where kk is Boltzmann’s constant, nqn_{q} and ncn_{c} are the quantization and correlator efficiencies, AeffA_{eff} is the effective antenna area, Tsys,mT_{sys,m} and Tsys,nT_{sys,n} are system temperatures for antennas mm and nn, respectively, Δν\Delta\nu is the effective channel bandwidth, and Δt\Delta t is the integration time. For identical antennas and system temperatures, this can be simplified as:

σmn(Jy)=2kTsysnqncAeff12ΔνΔt×1026.\sigma_{mn}(Jy)=\frac{2kT_{sys}}{n_{q}n_{c}A_{eff}}\sqrt{\frac{1}{2\Delta\nu\Delta t}}\times 10^{26}. (C2)

The data weight for each visibility is then calculated as wi=1/σi2w_{i}=1/\sigma_{i}^{2}. For instruments which record channelized system temperatures (e.g. ALMA), the weights will also be channelized, and are recorded in CASA as a ‘weight spectrum’ for each visibility.555For more details on how weights are handled in different versions of CASA, see https://casa.nrao.edu/casadocs/casa-5.1.0/reference-material/data-weights.

C.2 Noise covariance matrices

As discussed in Section 2.1, a matched filter can be written as:

𝒉=[1𝒔𝑹𝒗1𝒔]𝑹𝒗1𝒔=C𝑹𝒗1𝒔,\bm{h}=[\frac{1}{\sqrt{\bm{s}^{*}\bm{R_{v}}^{-1}\bm{s}}}]\bm{R_{v}}^{-1}\bm{s}=C\bm{R_{v}}^{-1}\bm{s}, (C3)

where 𝑹𝒗\bm{R_{v}} is the noise covariance matrix. When the ncn_{c} channels in an interferometric dataset are fully independent, 𝑹𝒗\bm{R_{v}} can be written for an individual visibility ViV_{i} as a nc×ncn_{c}\times n_{c} diagonal matrix:

𝑹𝒗=[σ12σ22σnc2]\bm{R_{v}}=\begin{bmatrix}\sigma_{1}^{2}&&&\\ &\sigma_{2}^{2}&&\\ &&\ddots&\\ &&&\sigma_{n_{c}}^{2}\\ \end{bmatrix} (C4)

and 𝑹𝒗𝟏\bm{R_{v}^{-1}} is then:

𝑹𝒗𝟏=[1σ121σ221σnc2]\bm{R_{v}^{-1}}=\begin{bmatrix}\frac{1}{\sigma_{1}^{2}}&&&\\ &\frac{1}{\sigma_{2}^{2}}&&\\ &&\ddots&\\ &&&\frac{1}{\sigma_{n_{c}}^{2}}\\ \end{bmatrix} (C5)

i.e. a diagonal matrix initialized to the channelized data weights. When the weights are not channelized, or can be approximated as equal across channels (wjwijw_{j}\approx w_{i}~\forall~j), Rv1R_{v}^{-1} can be written as:

𝑹𝒗𝟏=wi𝑰𝒏𝒄,\bm{R_{v}^{-1}}=w_{i}\bm{I_{n_{c}}}, (C6)

where 𝑰𝒏𝒄\bm{I_{n_{c}}} is an nc×ncn_{c}\times n_{c} identity matrix.

C.2.1 Correlated channels

In the case of fully independent channels, the filter and normalization prefactor are simple to calculate due to the lack of off-diagonal elements in the noise covariance matrix. In practice, however, channels are often correlated. In particular, astronomical interferometric datasets are generally correlated due to the use of a window function (e.g. Hann, Hamming, etc.) to reduce the ringing effect introduced by the finite maximum lag time of the correlator hardware.666A full description of these effects and the choice of window function for ALMA can be found at https://safe.nrao.edu/wiki/pub/Main/ALMAWindowFunctions/Note_on_Spectral_Response_V2.pdf. As the Hann function is a popular choice of window function (and applied directly in the time domain for the ALMA data described in this paper), we derive here the appropriate noise covariances matrices for Hanning smoothed data. Similar results could be calculated for other choices of window function.

In the frequency/channel domain, the Hann window applied to an observation 𝒙\bm{x} can be written as:

xi=14xi1+12xi+14xi+1.x^{\prime}_{i}=\frac{1}{4}x_{i-1}+\frac{1}{2}x_{i}+\frac{1}{4}x_{i+1}. (C7)

For an observation which can be linearly decomposed into signal and additive white gaussian noise (𝒙=𝒔+𝒗\bm{x}=\bm{s}+\bm{v}), we can calculate the noise covariance matrix of the smoothed data, 𝑹𝒗\bm{R^{\prime}_{v}}, which now contains off-diagonal elements:

Rv[j,k]=E[vjvk],R^{\prime}_{v}[j,k]=E[v^{\prime}_{j}v^{\prime*}_{k}], (C8)

where

vj=14vj1+12vj+14vj+1.v^{\prime}_{j}=\frac{1}{4}v_{j-1}+\frac{1}{2}v_{j}+\frac{1}{4}v_{j+1}. (C9)

Noting that in the uncorrelated case

E[vjvk]={σj2,if j=k0,otherwiseE[v_{j}v^{*}_{k}]=\begin{cases}\sigma_{j}^{2},&\text{if }j=k\\ 0,&\text{otherwise}\end{cases} (C10)

we find that for diagonal elements of RvR^{\prime}_{v}, Eq. C23 reduces to

Rv[j,j]=σj1216+σj24+σj+1216,R^{\prime}_{v}[j,j]=\frac{\sigma_{j-1}^{2}}{16}+\frac{\sigma_{j}^{2}}{4}+\frac{\sigma_{j+1}^{2}}{16}, (C11)

and for the populated off-diagonal elements it reduces to

Rv[j,j±1]=σj28+σj±128,R^{\prime}_{v}[j,j\pm 1]=\frac{\sigma_{j}^{2}}{8}+\frac{\sigma_{j\pm 1}^{2}}{8}, (C12)

and

Rv[j,j±2]=σj±1216.,R^{\prime}_{v}[j,j\pm 2]=\frac{\sigma_{j\pm 1}^{2}}{16}., (C13)

Under the assumption that the channelized weights are all approximately equal for a given visibility (wjwijw_{j}\approx w_{i}~\forall~j), 𝑹𝒗\bm{R^{\prime}_{v}} can be written as

𝑹𝒗σi2[3814116143811611638141161438]=σ2i[12316231161612316231]=σ2i[𝑴]\bm{R^{\prime}_{v}}\approx\sigma_{i}^{2}\begin{bmatrix}\frac{3}{8}&\frac{1}{4}&\frac{1}{16}&&\\ \frac{1}{4}&\frac{3}{8}&\ddots&\ddots&\\ \frac{1}{16}&\ddots&\ddots&\ddots&\frac{1}{16}\\ &\ddots&\ddots&\frac{3}{8}&\frac{1}{4}\\ &&\frac{1}{16}&\frac{1}{4}&\frac{3}{8}\\ \end{bmatrix}=\sigma{{}^{\prime}_{i}}^{2}\begin{bmatrix}1&\frac{2}{3}&\frac{1}{6}&&\\ \frac{2}{3}&1&\ddots&\ddots&\\ \frac{1}{6}&\ddots&\ddots&\ddots&\frac{1}{6}\\ &\ddots&\ddots&1&\frac{2}{3}\\ &&\frac{1}{6}&\frac{2}{3}&1\\ \end{bmatrix}=\sigma{{}^{\prime}_{i}}^{2}~[\bm{M}] (C14)

where σ=i3/8σi\sigma{{}^{\prime}_{i}}=\sqrt{3/8}\sigma_{i}. As tasks such as statwt in CASA do not consider effective channel bandwidth, wi=1/σ2i{w^{\prime}}_{i}=1/\sigma{{}^{\prime}_{i}}^{2} is likely what will be reported as the data weights of the observations.777See https://casaguides.nrao.edu/index.php/DataWeightsAndCombination for details about the absolute accuracy of data weights in CASA. The assumption of equal weights across channels is reasonable when TsysT_{sys} is stable across channels, which will likely be true unless there were issues with data calibration or there are strong water lines in the data. Under this assumption, the matrix inversion only needs to be computed once:

𝑹𝒗𝟏1σ2i[𝑴]1=wi[𝑴]1.\bm{{R^{\prime}_{v}}^{-1}}\approx\frac{1}{\sigma{{}^{\prime}_{i}}^{2}}~[\bm{M}]^{-1}={w^{\prime}}_{i}~[\bm{M}]^{-1}. (C15)

When the data is initially Hanning smoothed, the edge channels are clipped. Thus for the purposes of preventing boundary effects during inversion, the covariance matrix can be treated as describing a infinitely wide dataset. In practice, we extend the matrix to be several times larger than ncn_{c}, and then window a nkn_{k} sized portion from the center of 𝑹𝒗𝟏\bm{{R^{\prime}_{v}}^{-1}}. In the unbinned case, however, 𝑹𝒗\bm{R^{\prime}_{v}} is an ill-conditioned matrix (the condition number is  >> 101010^{10} for a typical ALMA spectral window), making the matrix inversion numerically unstable. The issue can be avoided by channel binning the data by a factor of two, as discussed in the following section.

C.2.2 Averaged correlated channels

As Hanning smoothing inflates effective channel width by a factor of 8/3, it is very common to bin data across channels by a factor of 2; this is the default for ALMA observations. Here we calculate the appropriate covariance matrices for binning factors of 2, 3, and 4. A similar method can be followed for other binning factors, but we note that past a binning factor of 4, the covariance matrix and its inverse rapidly approach the uncorrelated case and the channel correlation can likely be neglected without significant adverse effect.

With a binning factor of 2 applied to data that has been Hanning smoothed, the data (indexed by channel in the averaged dataset) can be described as:

xi,bin×2=(14xi0.5+12xi+14xi+0.5)+(14xi+12xi+0.5+14xi+1)2=18xi0.5+38xi+38xi+0.5+18xi+1,x^{\prime}_{i,bin\times 2}=\frac{(\frac{1}{4}x_{i-0.5}+\frac{1}{2}x_{i}+\frac{1}{4}x_{i+0.5})+(\frac{1}{4}x_{i}+\frac{1}{2}x_{i+0.5}+\frac{1}{4}x_{i+1})}{2}=\frac{1}{8}x_{i-0.5}+\frac{3}{8}x_{i}+\frac{3}{8}x_{i+0.5}+\frac{1}{8}x_{i+1}, (C16)

and the noise component is therefore:

vi,bin×2=18vi0.5+38vi+38vi+0.5+18vi+1.v^{\prime}_{i,bin\times 2}=\frac{1}{8}v_{i-0.5}+\frac{3}{8}v_{i}+\frac{3}{8}v_{i+0.5}+\frac{1}{8}v_{i+1}. (C17)

Substituting this into Eq. C23 and applying Eq. C25, we find that for diagonal elements of 𝑹𝒗\bm{R^{\prime}_{v}}

Rv,bin×2[j,j]=σj0.5264+9σj264+9σj+0.5264+σj+1264,R^{\prime}_{v,bin\times 2}[j,j]=\frac{\sigma_{j-0.5}^{2}}{64}+\frac{9\sigma_{j}^{2}}{64}+\frac{9\sigma_{j+0.5}^{2}}{64}+\frac{\sigma_{j+1}^{2}}{64}, (C18)

and the only populated off-diagonal elements are

Rv,bin×2[j,j±1]=3σj±0.5264+3σj±1264.R^{\prime}_{v,bin\times 2}[j,j\pm 1]=\frac{3\sigma_{j\pm 0.5}^{2}}{64}+\frac{3\sigma_{j\pm 1}^{2}}{64}. (C19)

Assuming the weights are roughly equivalent across channels for a given visibility, 𝑹𝒗\bm{R^{\prime}_{v}} can be approximated as

𝑹𝒗,𝒃𝒊𝒏×𝟐σi2[516332332516332332516]=σ2i[131031013103101]=σ2i[𝑴𝒃𝒊𝒏×𝟐]\bm{R^{\prime}_{v,bin\times 2}}\approx\sigma_{i}^{2}\begin{bmatrix}\frac{5}{16}&\frac{3}{32}&&\\ \frac{3}{32}&\frac{5}{16}&\ddots&\\ &\ddots&\ddots&\frac{3}{32}\\ &&\frac{3}{32}&\frac{5}{16}\\ \end{bmatrix}=\sigma{{}^{\prime}_{i}}^{2}\begin{bmatrix}1&\frac{3}{10}&&\\ \frac{3}{10}&1&\ddots&\\ &\ddots&\ddots&\frac{3}{10}\\ &&\frac{3}{10}&1\\ \end{bmatrix}=\sigma{{}^{\prime}_{i}}^{2}~[\bm{M_{bin\times 2}}] (C20)

where σ=i5/16σi\sigma{{}^{\prime}_{i}}=\sqrt{5/16}\sigma_{i}. Correspondingly,

𝑹𝒗,𝒃𝒊𝒏×𝟐𝟏1σ2i[𝑴𝒃𝒊𝒏×𝟐]1=wi[𝑴𝒃𝒊𝒏×𝟐]1.\bm{{R^{\prime}_{v,bin\times 2}}^{-1}}\approx\frac{1}{\sigma{{}^{\prime}_{i}}^{2}}~[\bm{M_{bin\times 2}}]^{-1}={w^{\prime}}_{i}~[\bm{M_{bin\times 2}}]^{-1}. (C21)

where the weights wi{w^{\prime}}_{i} were calculated from the binned data using a task such as statwt. In contrast to the unbinned case, 𝑹𝒗\bm{R^{\prime}_{v}} is now tridiagonal and well-conditioned for inversion.

Repeating these calculations for binning factors of 3 and 4, we find that:

𝑹𝒗,𝒃𝒊𝒏×𝟑σi2[141241241412412414]=σ2i[11616116161]=σ2i[𝑴𝒃𝒊𝒏×𝟑]\bm{R^{\prime}_{v,bin\times 3}}\approx\sigma_{i}^{2}\begin{bmatrix}\frac{1}{4}&\frac{1}{24}&&\\ \frac{1}{24}&\frac{1}{4}&\ddots&\\ &\ddots&\ddots&\frac{1}{24}\\ &&\frac{1}{24}&\frac{1}{4}\\ \end{bmatrix}=\sigma{{}^{\prime}_{i}}^{2}\begin{bmatrix}1&\frac{1}{6}&&\\ \frac{1}{6}&1&\ddots&\\ &\ddots&\ddots&\frac{1}{6}\\ &&\frac{1}{6}&1\\ \end{bmatrix}=\sigma{{}^{\prime}_{i}}^{2}~[\bm{M_{bin\times 3}}] (C22)

where σ=i1/4σi\sigma{{}^{\prime}_{i}}=\sqrt{1/4}\sigma_{i},

𝑹𝒗,𝒃𝒊𝒏×𝟑𝟏1σ2i[𝑴𝒃𝒊𝒏×𝟑]1=wi[𝑴𝒃𝒊𝒏×𝟑]1,\bm{{R^{\prime}_{v,bin\times 3}}^{-1}}\approx\frac{1}{\sigma{{}^{\prime}_{i}}^{2}}~[\bm{M_{bin\times 3}}]^{-1}={w^{\prime}}_{i}~[\bm{M_{bin\times 3}}]^{-1}, (C23)

and

𝑹𝒗,𝒃𝒊𝒏×𝟒σi2[1364312831281364312831281364]=σ2i[132632613263261]=σ2i[𝑴𝒃𝒊𝒏×𝟒]\bm{R^{\prime}_{v,bin\times 4}}\approx\sigma_{i}^{2}\begin{bmatrix}\frac{13}{64}&\frac{3}{128}&&\\ \frac{3}{128}&\frac{13}{64}&\ddots&\\ &\ddots&\ddots&\frac{3}{128}\\ &&\frac{3}{128}&\frac{13}{64}\\ \end{bmatrix}=\sigma{{}^{\prime}_{i}}^{2}\begin{bmatrix}1&\frac{3}{26}&&\\ \frac{3}{26}&1&\ddots&\\ &\ddots&\ddots&\frac{3}{26}\\ &&\frac{3}{26}&1\\ \end{bmatrix}=\sigma{{}^{\prime}_{i}}^{2}~[\bm{M_{bin\times 4}}] (C24)

where σ=i13/64σi\sigma{{}^{\prime}_{i}}=\sqrt{13/64}\sigma_{i},

𝑹𝒗,𝒃𝒊𝒏×𝟒𝟏1σ2i[𝑴𝒃𝒊𝒏×𝟒]1=wi[𝑴𝒃𝒊𝒏×𝟒]1.\bm{{R^{\prime}_{v,bin\times 4}}^{-1}}\approx\frac{1}{\sigma{{}^{\prime}_{i}}^{2}}~[\bm{M_{bin\times 4}}]^{-1}={w^{\prime}}_{i}~[\bm{M_{bin\times 4}}]^{-1}. (C25)

Appendix D Filter kernel generation

The Keplerian filter kernel presented in Fig. 2 was calculated for the viewing geometry of the protoplanetary disk around T Tauri star TW Hya (with an inclination of 7°\degree and PA of 155°\degree; e.g., Qi et al., 2004; Andrews et al., 2012). The Keplerian velocity field is calculated as

v=GMr,v=\sqrt{\frac{GM_{*}}{r}}, (D1)

with an assumed stellar mass of 0.8 M (Hughes et al., 2011). Using this field, we computed the emitting region of the disk for channels with 0.2 km s-1 spacing.

The parametric model filter kernel was calculated from the ‘fiducial’ model in Walsh et al. (2016), where CH3OH around TW Hya is constrained to a vertical layer z/r <\textless 0.1 between radii of 30-100 AU. From this abundance structure, an emission profile was calculated for the 312-303 transition using LIME. As seen in Fig. 2, the emission from this model tapers radially due to decreasing column density and temperature, in contrast to the Keplerian mask which simply has a hard outer radius cutoff. Additionally, the CH3OH model has an inner disk depletion (as CH3OH is mainly formed through hydrogenation of CO on grain surfaces outside the CO snowline), which is not present in the simple Keplerian model.

Finally, the data-driven filter kernel was generated from observations of H2CO around TW Hya (Öberg et al., 2017). The data were imaged in CASA using CLEAN with natural weighting, yielding a high SNR image cube. After imaging, all noise below 3σ\sigma and any emission outside of a 3\arcsec radius were masked out, creating a mostly noiseless approximation of the true H2CO emission distribution. (u,v)(u,v) plane kernels were generated from each of these image cubes using vis_\textunderscoresample, as described in §2.2, §2.3, and §2.4.