Time Series Classification of Supraglacial Lakes Evolution over Greenland Ice Sheet

Emam Hossain1, Md Osman Gani1, Devon Dunmire2, Aneesh Subramanian3, Hammad Younas4
1Department of Information Systems, University of Maryland Baltimore County, USA
2Department of Earth and Environmental Sciences, KU Leuven, Belgium
3Department of Atmospheric and Oceanic Sciences, University of Colorado Boulder, USA
4St. John’s School, Houston, USA
Email: {emamh1, mogani}@umbc.edu, devon.dunmire@kuleuven.be, aneeshcs@colorado.edu, hammadyounas27@icloud.com
Abstract

The Greenland Ice Sheet (GrIS) has emerged as a significant contributor to global sea level rise, primarily due to increased meltwater runoff. Supraglacial lakes, which form on the ice sheet surface during the summer months, can impact ice sheet dynamics and mass loss; thus, better understanding these lakes’ seasonal evolution and dynamics is an important task. This study presents a computationally efficient time series classification approach that uses Gaussian Mixture Models (GMMs) of the Reconstructed Phase Spaces (RPSs) to identify supraglacial lakes based on their seasonal evolution: 1) those that refreeze at the end of the melt season, 2) those that drain during the melt season, and 3) those that become buried, remaining liquid and insulated a few meters beneath the surface. Our approach uses time series data from the Sentinel-1 and Sentinel-2 satellites, which utilize microwave and visible radiation, respectively. Evaluated on a GrIS-wide dataset, the RPS-GMM model, trained on a single representative sample per class, achieves 85.46% accuracy with Sentinel-1 data alone and 89.70% with combined Sentinel-1 and Sentinel-2 data. This performance surpasses existing machine learning and deep learning models, which require large training datasets. The results demonstrate the robustness of the RPS-GMM model in capturing the complex temporal dynamics of supraglacial lakes with minimal training data.

I Introduction

The Greenland Ice Sheet (GrIS) contributes substantially to rising sea levels and can raise the sea level by more than 7 meters if melted entirely [1]. The GrIS has been losing mass annually, and since 1992, it has been estimated to contribute nearly 14 mm to global sea level rise [2]. The ice sheet loses mass through dynamic (speed-up of ice flow) and surface (meltwater runoff) processes. Recent studies have indicated that surface and meltwater processes have become the predominant contributor to GrIS mass loss [3], highlighting the increasingly important role of GrIS surface melt. Figure 1 illustrates the growing number of cumulative melting days over the GrIS in the years 1981, 2001, and 2021, indicating a substantial increase in melting in recent decades.

Supraglacial lakes are meltwater features that form on the ice sheet surface during the summer months. Understanding how these features evolve throughout the melt season is important because of their potential impact on ice sheet dynamics and mass loss [4]. Several things can happen to supraglacial lakes throughout a melt season. Many lakes refreeze at the end of the melt season as temperatures drop below $0^{\circ}$C. This refreezing creates impermeable ice layers, alters firn (the partially compacted snow layer) density, reduces firn air content, and affects future meltwater percolation and storage [5, 6]. Some lakes, however, do not refreeze entirely and remain liquid buried a few meters underneath the ice sheet surface [7, 8]. These features, called buried lakes, may temporarily store meltwater and reduce immediate runoff and mass loss, but their long-term impact on the ice sheet is still relatively unknown. Numerous supraglacial lakes also drain throughout the melt season, sometimes slowly via overflow drainage and sometimes rapidly via hydrofracture. Hydrofracture occurs when meltwater activates or extends fractures in the ice [9], and creates hydrologic pathways from the ice sheet surface to the bedrock, creating a means for surface meltwater to impact basal friction and ice velocity [10, 11].

Figure 1: Cumulative melting days over the GrIS in 1981, 2001, and 2021. Source: National Snow and Ice Data Center (NSIDC) [12]

Numerous studies have used optical and near-infrared imagery from satellites such as Landsat and Sentinel-2 to identify and monitor supraglacial lakes and channels [13, 14, 15, 16]. Other works have taken advantage of microwave imagery, which overcomes some of the limitations of optical imagery during the polar night and cloudy conditions. Further, microwave imagery can be used to detect and monitor buried lakes [7, 17, 18], as microwaves can penetrate several meters beneath the ice surface [19]. In this study, we combine both optical and microwave imagery to introduce a computationally efficient time series classification method in which we fit Gaussian Mixture Models (GMMs) to the Reconstructed Phase Space (RPS) of each lake class using Maximum Likelihood Estimation (MLE). We classify supraglacial lakes according to their seasonal changes: 1) lakes that refreeze completely in winter, 2) those that drain either slowly or quickly during summer, and 3) those that get buried under ice or snow during winter. These classifications are crucial for understanding how meltwater is stored and released, influencing ice flow and sea level rise (Section III-D). Analyzing data from 777 lakes across all six sub-regions of the GrIS for the years 2018 and 2019, our approach classifies lakes using only one representative sample per class. This method is computationally more efficient and provides more accurate classifications than existing machine learning and deep learning models for time series classification (Section II-B).

II Related Works

II-A Supraglacial Lakes

Understanding the hydrologic processes of the GrIS and their implications for its mass balance has been a subject of significant research interest in recent years. This section provides a comprehensive overview of the related works in this domain, encompassing studies on the formation and behavior of supraglacial lakes, their impact on ice dynamics, and the methods used for their detection and monitoring.

Previous studies, such as those by [20] and [21], investigated the evolution of supraglacial lakes and their dependence on topographic features, highlighting the influence of bed topography on their formation and persistence. [22] further explored the seasonal variability of supraglacial lakes, emphasizing their significance in understanding ice sheet dynamics. The formation and drainage of supraglacial lakes have profound implications for ice dynamics and mass loss from the GrIS. [3] demonstrated a shift in the predominant cause of mass loss from ice discharge to meltwater runoff, underscoring the increasing significance of surface melt in the ice sheet’s mass balance. [6] investigated the role of refrozen meltwater in altering firn properties and influencing future meltwater percolation, highlighting the complex feedback mechanisms between surface melt and ice dynamics. Furthermore, studies by [4] and [17] examined supraglacial lakes’ seasonal variability and spatial distribution, providing insights into the factors influencing their formation and evolution. [23] and [24] investigated the role of supraglacial lake drainage in modulating ice sheet dynamics, highlighting the potential for abrupt changes in ice flow and mass loss.

Satellite-based and remote sensing techniques have emerged as valuable tools for detecting and monitoring supraglacial lakes on the GrIS. [25] and [26] utilized multispectral satellite imagery, including data from MODIS (Moderate Resolution Imaging Spectroradiometer) and Landsat satellites, to identify supraglacial lakes and channels, enhancing the spatial coverage of surface meltwater features. [27] explored the use of Sentinel-2 imagery for automated detection of supraglacial lakes, highlighting the potential for improving temporal resolution and monitoring capabilities. In addition to optical imagery, SAR imagery has been increasingly employed to detect supraglacial lakes and buried lakes on the GrIS. [26] demonstrated the utility of SAR for detecting supraglacial lakes, while [18] utilized SAR imagery to identify perennial supraglacial lakes. [17] and [7] further investigated the use of SAR for detecting and monitoring buried lakes, highlighting its capability to penetrate snow and ice.

II-B Machine Learning and Deep Learning

There are several established machine learning (ML) and deep learning (DL) models widely used for time series classification tasks. Long Short-Term Memory (LSTM) networks and Fully Convolutional Networks (FCN) are among the prominent models used in this domain. Residual Networks (ResNet) and Recurrent Neural Networks (RNNs) are also integral to this field, offering unique time series analysis capabilities. The following defines some popular classification techniques that are selected based on their proven effectiveness and unique methodologies for handling time series data.

LSTMFCNClassifier: Combines LSTM and FCN to leverage both sequential dependencies and local feature extraction.

FCNClassifier: Utilizes FCN to capture local patterns in time series data without fully connected layers, allowing it to handle input sequences of varying lengths.

ResNetClassifier: Employs ResNet with residual connections to facilitate the training of deep networks, effectively capturing complex patterns in the data.

SimpleRNNClassifier: Uses RNNs with recurrent connections to capture dependencies over time, making it suitable for time series classification.

KNeighborsTimeSeriesClassifier: A non-parametric, instance-based learning algorithm that classifies data points based on the majority class among their $k$-nearest neighbors in the feature space.

III Background

This section outlines the essential components of the RPS-GMM model. We provide an overview of reconstructed phase space and Gaussian mixture models and discuss the maximum likelihood classifier and the evolution of various supraglacial lakes over time.

III-A Reconstructed Phase Space

The seasonal evolution of supraglacial lakes can be viewed as a dynamic system where their state changes over time due to various environmental factors. The Reconstructed Phase Space (RPS) is particularly suitable for this research as it enables the capture of underlying dynamics from time series data, such as remote satellite observations. A dynamical system describes the temporal evolution of a system to capture its dynamics. A phase space represents all possible states of the system that evolve, and the dynamics map describes how the system evolves. The RPS captures the underlying dynamics of a system from time series observations, allowing for the analysis of complex and non-linear behaviors in supraglacial lakes.

According to Takens’ embedding theorem, a time series $x=\{x_{n}\}$, $n=1,\ldots,N$, can be converted into state vectors using time-delay embedding:

$$X_{n}=[x_{n},x_{n-\tau},\ldots,x_{n-(d-1)\tau}], \tag{1}$$

where $\tau$ is the time delay and $d$ is the embedding dimension [28]. This embedding reconstructs the state and dynamics of the unknown system from observed measurements. The dimension $d$ should be greater than twice the box-counting dimension of the original system [29]. If $d$ is unknown, it can be estimated using the false nearest-neighbor technique, and $\tau$ can be determined by finding the first minimum of the automutual information [30].
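As an illustration of Equation 1, the following is a minimal sketch of time-delay embedding with NumPy. The function name `delay_embed` and the toy input are assumptions made for this example and are not taken from the authors' code.

```python
import numpy as np

def delay_embed(x, d, tau):
    """Build the state vectors X_n = [x_n, x_{n-tau}, ..., x_{n-(d-1)tau}].

    Returns an array of shape (N - (d-1)*tau, d); row k corresponds to
    n = (d-1)*tau + k (0-based indexing).
    """
    x = np.asarray(x, dtype=float)
    offset = (d - 1) * tau
    if len(x) <= offset:
        raise ValueError("time series too short for this (d, tau)")
    # Column j holds the series delayed by j*tau samples.
    cols = [x[offset - j * tau: len(x) - j * tau] for j in range(d)]
    return np.column_stack(cols)

# Toy series 0, 1, ..., 9 embedded with d = 3 and tau = 2.
X = delay_embed(np.arange(10), d=3, tau=2)
print(X.shape)  # (6, 3)
print(X[0])     # [4. 2. 0.]
```

Each row is one point in the reconstructed phase space, so a series of length $N$ yields $N-(d-1)\tau$ points.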

III-B Gaussian Mixture Models

Gaussian Mixture Models (GMMs) are employed to model the distribution of dynamics represented by the RPS. The seasonal evolution of supraglacial lakes involves complex, non-linear behaviors that are well-captured by GMMs, which can model multiple underlying distributions within the data. A GMM is a weighted sum of $M$ Gaussian distributions:

$$p(X\mid\lambda)=\sum_{i=1}^{M}w_{i}\,\mathcal{N}(X\mid\mu_{i},\Sigma_{i}), \tag{2}$$

where $\lambda=\{w_{i},\mu_{i},\Sigma_{i}\}$ are the weights, means, and covariance matrices of the mixture components [31]. These parameters of the GMM are estimated using the Expectation-Maximization (EM) algorithm, which iteratively maximizes the likelihood of the data [32]. The EM algorithm includes two main steps:

1. Expectation (E-step): Calculate the responsibilities for each data point $X_{i}$:

$$\gamma_{ij}=\frac{w_{j}\,\mathcal{N}(X_{i}\mid\mu_{j},\Sigma_{j})}{\sum_{k=1}^{M}w_{k}\,\mathcal{N}(X_{i}\mid\mu_{k},\Sigma_{k})}. \tag{3}$$

2. Maximization (M-step): Update the parameters using the responsibilities:

$$w_{j}=\frac{1}{N}\sum_{i=1}^{N}\gamma_{ij}, \tag{4}$$
$$\mu_{j}=\frac{\sum_{i=1}^{N}\gamma_{ij}X_{i}}{\sum_{i=1}^{N}\gamma_{ij}}, \tag{5}$$
$$\Sigma_{j}=\frac{\sum_{i=1}^{N}\gamma_{ij}(X_{i}-\mu_{j})(X_{i}-\mu_{j})^{T}}{\sum_{i=1}^{N}\gamma_{ij}}. \tag{6}$$

The E-step and M-step are repeated until the parameters converge, ensuring the GMM optimally fits the data.
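The E-step and M-step can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions, not the authors' code: the helper names (`log_gaussian`, `em_step`), the small ridge term `reg` added to each covariance for numerical stability, the fixed iteration count, and the synthetic two-cluster data are all choices made for this example.

```python
import numpy as np

def log_gaussian(X, mean, cov):
    """Log-density of N(mean, cov) at each row of X, via a Cholesky factor."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (X - mean).T)           # shape (d, N)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2 * np.pi) + log_det + np.sum(z ** 2, axis=0))

def em_step(X, w, mu, sigma, reg=1e-6):
    """One EM iteration for a GMM, following the E-step and M-step updates."""
    N, d = X.shape
    M = len(w)
    # E-step: responsibilities gamma_ij, computed in log space.
    log_p = np.stack([np.log(w[j]) + log_gaussian(X, mu[j], sigma[j])
                      for j in range(M)], axis=1)  # shape (N, M)
    log_norm = np.logaddexp.reduce(log_p, axis=1)  # log p(X_i | lambda)
    gamma = np.exp(log_p - log_norm[:, None])
    # M-step: update weights, means, and covariances.
    Nj = gamma.sum(axis=0)
    w_new = Nj / N
    mu_new = (gamma.T @ X) / Nj[:, None]
    sigma_new = np.empty((M, d, d))
    for j in range(M):
        diff = X - mu_new[j]
        # reg * I is an added stabiliser (assumption), not part of Eq. 6.
        sigma_new[j] = (gamma[:, j, None] * diff).T @ diff / Nj[j] + reg * np.eye(d)
    return w_new, mu_new, sigma_new, float(log_norm.sum())

# Synthetic data: two well-separated 2-D clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3.0, 1.0, size=(200, 2)),
               rng.normal(3.0, 1.0, size=(200, 2))])
w = np.array([0.5, 0.5])
mu = X[[0, -1]].copy()                 # one starting mean from each cluster
sigma = np.stack([np.eye(2), np.eye(2)])
history = []
for _ in range(30):
    w, mu, sigma, log_lik = em_step(X, w, mu, sigma)
    history.append(log_lik)
```

In practice the loop would stop once the change in log-likelihood falls below a tolerance rather than after a fixed number of iterations.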

III-C Maximum Likelihood Classifier

A Bayesian maximum likelihood classifier is used to classify the test data. For each test point $X_{k}$, the likelihoods are computed for each model $a_{i}$:

$$p(X\mid a_{i})=\prod_{k=1}^{T}p(x_{k}\mid a_{i}), \tag{7}$$

where $X=\{x_{1},x_{2},\ldots,x_{T}\}$ is the sequence of observations. By leveraging the likelihoods derived from the GMM, the classifier can effectively distinguish between the different types of lakes. After computing all the likelihoods, the class with the maximum likelihood, $\hat{a}$ (i.e., predicted class), is determined using the following equation [33]:

$$\hat{a}=\arg\max_{i}p(X\mid a_{i}) \tag{8}$$
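The product in Equation 7 is usually evaluated as a sum of log-likelihoods, since multiplying many small densities underflows quickly; the arg max in Equation 8 is unchanged by the logarithm. The sketch below illustrates this with simple isotropic Gaussian densities standing in for the per-class GMMs. The names `make_log_pdf` and `classify` and the toy numbers are assumptions for this example only.

```python
import numpy as np

def make_log_pdf(mean, std):
    """Stand-in class model: log-density of an isotropic Gaussian."""
    mean = np.asarray(mean, dtype=float)
    def log_pdf(X):
        X = np.atleast_2d(X)
        d = X.shape[1]
        sq = np.sum((X - mean) ** 2, axis=1) / std ** 2
        return -0.5 * (sq + d * np.log(2 * np.pi * std ** 2))
    return log_pdf

def classify(X, class_log_pdfs):
    """Return the class maximising sum_k log p(x_k | a_i), plus all scores."""
    scores = {label: float(np.sum(f(X))) for label, f in class_log_pdfs.items()}
    return max(scores, key=scores.get), scores

models = {
    "refreeze": make_log_pdf([0.0, 0.0], 1.0),
    "buried": make_log_pdf([-5.0, -5.0], 1.0),
}
X_test = np.array([[-4.8, -5.1], [-5.2, -4.9], [-4.0, -4.5]])
label, scores = classify(X_test, models)
print(label)  # buried
```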

III-D Evolution of Supraglacial Lakes

We observe distinct evolutionary changes in supraglacial lakes from satellite data, specifically using the Sentinel-1 (S1, microwave) and Sentinel-2 (S2, optical) satellites. We classify lakes into three distinct categories: refreezing lakes, draining lakes, and buried lakes. From the S1 microwave imagery, we use the horizontally-transmitted, vertically-received (HV) band, previously used for buried lake detection [7, 17]. For each S1 image, we calculate the average HV value both within the lake outline and in the immediate vicinity outside the lake bounds, within a buffer of approximately 750 m around the lake. Figure 2 illustrates an average backscatter timeseries within the lake bounds (blue line, $HV_{lake}$) and from the area surrounding the lake (purple line, $HV_{background}$). By differencing these two backscatter signals using Equation 9, we can identify the backscatter anomaly for a given lake, denoted as $HV_{anom}$.

Figure 2: Comparison of backscatter signal received from within a lake (blue) and its vicinity (purple).

Figure 3: Evolution of different types of supraglacial lakes over time: (a) refreeze lake, (b) buried lake, (c) drained lake. The grey line shows the time series of backscatter differences (left y-axis) and the red dots represent the percentage of water coverage from each S2 image observation (right y-axis). Black-and-white and color images are from Sentinel-1 and Sentinel-2 satellites, respectively.

We also obtain a time series with information from optical imagery for each lake. From each S2 optical image, we determine the water percentage ($p_{water}$) inside the lake by calculating the ratio of water-identified pixels to the total number of pixels within the lake. This calculation is detailed in Equation 10. $N_{water}$ represents the number of pixels identified as water in the S2 image, and $N_{total}$ is the total number of pixels within the lake. Figure 3 illustrates an example time series for all three lake types.

$$HV_{anom}=HV_{lake}-HV_{background} \tag{9}$$
$$p_{water}=\frac{N_{water}}{N_{total}}\times 100\% \tag{10}$$
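Equations 9 and 10 reduce to masked averages and pixel counts. The sketch below shows one way to compute them for a single image with NumPy, assuming the lake outline, the surrounding buffer, and the water classification are already available as boolean masks. The function names, the toy 4x4 arrays, and the choice to average backscatter directly in decibels are assumptions for this example.

```python
import numpy as np

def hv_anomaly(hv_db, lake_mask, buffer_mask):
    """Eq. 9: mean HV inside the lake minus mean HV in the surrounding buffer."""
    return float(hv_db[lake_mask].mean() - hv_db[buffer_mask].mean())

def water_percentage(water_mask, lake_mask):
    """Eq. 10: share of lake pixels classified as water, in percent."""
    n_total = np.count_nonzero(lake_mask)
    n_water = np.count_nonzero(water_mask & lake_mask)
    return 100.0 * n_water / n_total

# Toy 4x4 scene: a 2x2 lake (dark in HV) surrounded by brighter ice.
hv_db = np.full((4, 4), -20.0)
hv_db[1:3, 1:3] = -28.0
lake_mask = np.zeros((4, 4), dtype=bool)
lake_mask[1:3, 1:3] = True
buffer_mask = ~lake_mask              # stand-in for the buffer ring
water_mask = np.zeros((4, 4), dtype=bool)
water_mask[1, 1] = water_mask[1, 2] = water_mask[2, 1] = True

anom = hv_anomaly(hv_db, lake_mask, buffer_mask)
p_water = water_percentage(water_mask, lake_mask)
print(anom, p_water)  # -8.0 75.0
```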

Refreeze lakes form when temperatures decrease towards the end of the melt season, causing the lakes to transition from liquid to frozen. Figure 3(a) illustrates that both $HV_{anom}$ and $p_{water}$ decline at the end of the summer and approach zero in the fall, indicating the lake water has fully refrozen.

The optical timeseries for a buried lake appears similar to that of a refreeze lake, as $p_{water}$ decreases to zero at the end of the melt season. However, microwave images and the backscatter $HV_{anom}$ timeseries in Figure 3(b) indicate that some liquid water remains buried beneath the surface, even at the end of the year. Water strongly absorbs microwave radiation, and thus areas where liquid water is present appear relatively dark in the microwave imagery.

Drained lakes undergo a significant reduction in water volume, typically marked by a sharp decline in $p_{water}$. We classify lakes as draining if the lake water is drained to the ice bed and $p_{water}$ approaches zero during the melt season. Figure 3(c) depicts the optical and microwave timeseries of a lake experiencing rapid drainage, with imagery indicating that the lake drained over the four days between June 1 and June 5.

IV Experiments

IV-A Data Collection and Preprocessing

For this study, we utilize a comprehensive pan-Greenland dataset [7, 34], which includes detailed outlines of supraglacial lakes with a resolution of 30 meters. This dataset consists of 3,846 supraglacial lakes during the 2018 melt season and 6,146 supraglacial lakes during 2019, each with a surface area $> 0.05$ km$^{2}$. The years 2018 and 2019 are chosen to capture a range of climatic conditions, with 2018 representing a cooler melt season and 2019 representing a warmer melt season [35, 36], providing a comprehensive understanding of supraglacial lake dynamics under different temperature regimes. For this work, we manually label the time series of 777 lakes into three classes: refreeze (189 lakes), drained (392 lakes), and buried (196 lakes). The dataset spans all six subregions of the Greenland Ice Sheet: Northeast (NE), Northwest (NW), North (NO), Central West (CW), Southeast (SE), and Southwest (SW) [37]. This dataset is selected for its ice-sheet-wide coverage and high spatial resolution.

We utilize satellite imagery from two sources: microwave imagery (S1 satellite) and optical imagery (S2 satellite). Data is acquired using the Google Earth Engine (GEE) [38]. S1 imagery is already preprocessed within GEE by applying thermal noise removal, radiometric calibration, terrain correction, and conversion to decibels via log scaling. For each supraglacial lake identified in 2018 and 2019, the dataset includes S1 imagery from January 1 to December 31 of the respective year. For optical imagery, we use S2 Level-1C orthorectified top-of-atmosphere reflectance, specifically Bands 2 (Blue, 20 m), 3 (Green, 20 m), 4 (Red, 20 m), 10 (Cirrus, 60 m), and 11 (SWIR 1, 20 m). To generate a complete annual time series of $HV_{anom}$ and $p_{water}$ for each lake, we linearly interpolate between S1 and S2 observations and apply a 12-day smoothing filter to $HV_{anom}$ to reduce variability across different S1 orbits. Our analysis focuses on time series data from May 1 to December 31 to capture the seasonal dynamics of supraglacial lakes during the melt season.
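The interpolation and smoothing step can be sketched as follows. The text does not specify the type of smoothing filter, so this example assumes a centered 12-day moving average with edge padding; the function names and the observation values are likewise hypothetical.

```python
import numpy as np

def daily_series(obs_days, obs_values, n_days=365):
    """Linearly interpolate irregular observations onto a daily grid.

    Days before the first and after the last observation take the nearest
    observed value (the behaviour of np.interp).
    """
    days = np.arange(1, n_days + 1)
    return np.interp(days, obs_days, obs_values)

def moving_average(y, window=12):
    """Moving average that keeps the series length (assumed filter type)."""
    left = window // 2
    right = window - 1 - left
    padded = np.pad(y, (left, right), mode="edge")
    return np.convolve(padded, np.ones(window) / window, mode="valid")

# Hypothetical S1 acquisitions (day of year, HV anomaly in dB).
obs_days = np.array([3, 15, 27, 39, 51])
obs_hv = np.array([-0.5, -1.0, -4.0, -6.5, -6.0])
hv_daily = daily_series(obs_days, obs_hv)
hv_smooth = moving_average(hv_daily, window=12)
print(hv_daily.shape, hv_smooth.shape)  # (365,) (365,)
```

Restricting the result to May 1 through December 31 is then a simple slice of the daily array.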

IV-B Methodology

In this study, we classify supraglacial lakes into three categories (refreeze, drain, and buried) using Reconstructed Phase Space (RPS) and Gaussian Mixture Models (GMM). The methodology is divided into two main phases: training and testing. The training phase begins by selecting one representative sample for each class. We then construct the RPS for these samples using time lag ($\tau$) and embedding dimension ($d$). Instead of using false nearest neighbor or automutual information methods, we employ a grid search technique to select $\tau$ and $d$ from a range of [2, 30]. For each combination of $\tau$ and $d$, we construct the RPS for the representative samples of each class. We then initialize GMMs for each class with 10 mixtures. For each GMM, we use $k$-means clustering to generate 10 different sets of initial parameters and select the best-performing set. Subsequently, we train the GMM models $M=\{M_{1},M_{2},M_{3}\}$, one for each class (refreeze, drain, and buried), using the EM algorithm. The trained GMMs serve as probabilistic models for each class of supraglacial lakes. This approach is computationally efficient, requiring only one representative sample per class, which reduces the computational burden compared to other machine learning models that need larger training datasets.

Algorithm 1 Classifying Supraglacial Lakes using RPS-GMM
1:  Set of supraglacial lake classes, $C=\{C_{1},C_{2},C_{3}\}=\{refreeze, drain, buried\}$
2:  Input: Time series data $D$ of backscatter difference $HV_{anom}(t)$ and water percentage $p_{water}(t)$
3:  Output: Classification of supraglacial lakes into $\{C_{1},C_{2},C_{3}\}$
4:  Define time series of $HV_{anom}(t)$ and $p_{water}(t)$
5:  Select one representative sample $s_{C_{i}}$ where $C_{i}\in C$
6:  Initialize grid search for time delay $\tau$ and embedding dimension $d$
7:  for each combination of $\tau$ and $d$ do
8:     Construct RPS for $s_{C_{i}}$ where $C_{i}\in C$ using $\tau$ and $d$
9:     Initialize GMM with 10 mixtures
10:     Train a set of GMM models $M=\{M_{1},M_{2},M_{3}\}$ using the EM algorithm on $s_{C_{i}}$ where $C_{i}\in C$
11:     for each instance $i\in D$ do
12:        Construct RPS for $i$ using $\tau$ and $d$
13:        Compute likelihood score $S_{i}$ of $i$ for each $M_{j}$
14:        Assign predicted class based on the maximum likelihood score in $S_{i}$
15:     end for
16:     Calculate and store accuracy for $\tau$ and $d$
17:  end for
18:  Select optimal $\tau^{*}$ and $d^{*}$ with the maximum accuracy
19:  Return: Class labels for $D$ with optimal $\tau^{*}$ and $d^{*}$

During the testing phase, we create the RPS for each instance in the dataset using the current combination of $\tau$ and $d$ in the grid search. Each instance’s RPS is evaluated against the trained GMMs, and the likelihood score is computed for each GMM. Using the maximum likelihood classifier, the class with the highest likelihood score is assigned to the test instance, classifying each supraglacial lake as either refreeze, drain, or buried.

To evaluate the performance of our model, we calculate accuracy for each combination of $\tau$ and $d$ during the grid search. The optimal values $\tau^{*}$ and $d^{*}$ are chosen based on the highest accuracy achieved. A step-by-step outline of the proposed methodology is provided in Algorithm 1. The dataset and the code are available on GitHub (https://github.com/ehfahad/TSC-of-Supraglacial-Lakes-Evolution-over-GrIS).
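The training and testing phases can be sketched end to end with scikit-learn's `GaussianMixture`. This is a hedged illustration, not the authors' released code: it handles a single feature only, uses synthetic stand-in curves rather than real $HV_{anom}$ series, approximates "10 sets of k-means initial parameters" with `n_init` and `init_params="kmeans"`, adds a larger `reg_covar` for stability, and runs a much smaller grid and mixture count than the paper so that it finishes quickly. All function names are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def delay_embed(x, d, tau):
    """Rows are [x_n, x_{n-tau}, ..., x_{n-(d-1)tau}]."""
    x = np.asarray(x, dtype=float)
    offset = (d - 1) * tau
    return np.column_stack([x[offset - j * tau: len(x) - j * tau] for j in range(d)])

def fit_class_models(representatives, d, tau, n_mix=10, n_init=10, seed=0):
    """Fit one GMM per class on the RPS of its single representative series."""
    models = {}
    for label, series in representatives.items():
        gmm = GaussianMixture(n_components=n_mix, n_init=n_init,
                              init_params="kmeans", reg_covar=1e-3,
                              random_state=seed)
        models[label] = gmm.fit(delay_embed(series, d, tau))
    return models

def predict(series, models, d, tau):
    """Assign the class whose GMM gives the highest total log-likelihood."""
    rps = delay_embed(series, d, tau)
    scores = {label: gmm.score_samples(rps).sum() for label, gmm in models.items()}
    return max(scores, key=scores.get)

def grid_search(representatives, dataset, labels, taus, dims, **gmm_kwargs):
    """Return (accuracy, tau, d) for the best-scoring grid point."""
    best = (-1.0, None, None)
    for tau in taus:
        for d in dims:
            models = fit_class_models(representatives, d, tau, **gmm_kwargs)
            preds = [predict(s, models, d, tau) for s in dataset]
            acc = float(np.mean([p == y for p, y in zip(preds, labels)]))
            if acc > best[0]:
                best = (acc, tau, d)
    return best

# Synthetic stand-ins for the three seasonal shapes (245 days, May 1 to Dec 31).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 245)
shapes = {
    "refreeze": -5.0 * np.exp(-((t - 0.35) / 0.12) ** 2),
    "drain": np.where((t > 0.2) & (t < 0.35), -5.0, 0.0),
    "buried": -5.0 / (1.0 + np.exp(-(t - 0.3) / 0.03)),
}

def noisy(shape):
    return shape + rng.normal(0.0, 0.3, size=shape.shape)

representatives = {k: noisy(v) for k, v in shapes.items()}
dataset = [noisy(shapes[k]) for k in shapes for _ in range(5)]
labels = [k for k in shapes for _ in range(5)]

best_acc, best_tau, best_d = grid_search(
    representatives, dataset, labels, taus=[2, 4], dims=[2, 3], n_mix=4, n_init=2)
print(best_acc, best_tau, best_d)
```

As in Algorithm 1, accuracy on the labeled dataset drives the choice of $\tau$ and $d$; grid points where $(d-1)\tau$ exceeds the series length would need to be skipped when searching the full [2, 30] range.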

V Results and Discussion

This section presents the evaluation results of the RPS-GMM model for classifying supraglacial lakes and compares its performance with several established ML and DL models for time series classification.

V-A RPS-GMM Performance Evaluation

The RPS-GMM model is trained with two sets of features: $HV_{anom}$ alone and a combination of $HV_{anom}$ and $p_{water}$. The reason for having two sets of features is that we aim to evaluate the impact of including the water percentage feature on the model’s performance. Our dataset includes time series data from 777 lakes, categorized into 189 refreeze, 392 drained, and 196 buried lakes, across all six subregions of the GrIS for the years 2018 and 2019.

[Figure: grouped bar chart of performance (%). RPS-GMM ($HV_{anom}$ only): accuracy 85.46, precision 85.54, recall 85.03, F1 score 85.13. RPS-GMM ($HV_{anom}$ + $p_{water}$): accuracy 89.70, precision 89.59, recall 89.62, F1 score 89.53.]
Figure 4: Performance of the RPS-GMM models

As shown in Figure 4, the model trained with only the backscatter difference achieves an accuracy of 85.46%. Incorporating the water percentage improves the accuracy to 89.70%. This improvement underscores the value of combining water percentage with the backscatter signal. Despite an imbalanced dataset, the weighted averages of precision, recall, and F1 score also reflect better performance across all classes when $p_{water}$ is incorporated with $HV_{anom}$.
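For reference, support-weighted precision, recall, and F1 can be computed as in the sketch below, where each class's score is weighted by its number of true samples. The function name and the four-label toy example are assumptions for illustration; with this weighting, weighted recall equals overall accuracy.

```python
import numpy as np

def weighted_scores(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    totals = {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    for c in np.unique(y_true):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        support = tp + fn                      # number of true samples in class c
        totals["precision"] += support * precision
        totals["recall"] += support * recall
        totals["f1"] += support * f1
    scores = {k: float(v) / len(y_true) for k, v in totals.items()}
    scores["accuracy"] = float(np.mean(y_true == y_pred))
    return scores

# Toy example: three "drained" lakes and one "refreeze" lake.
scores = weighted_scores(["d", "d", "d", "r"], ["d", "d", "r", "r"])
print(scores["precision"], scores["recall"])  # 0.875 0.75
```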

V-B Comparison against Existing ML/DL Models

To further validate the RPS-GMM model, we compare its performance with several established ML and DL models using the sktime time series package [39]. Each model is evaluated through 5-fold cross-validation, with 80% of the data (approximately 622 samples) used for training and 20% for testing in each fold.
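The fold sizes follow directly from splitting 777 lakes into five parts. The sketch below uses a plain shuffled split written with NumPy; whether the original evaluation used stratified or plain folds is not stated, so the splitting scheme, the seed, and the function name are assumptions.

```python
import numpy as np

def kfold_indices(n_samples, n_splits=5, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_splits)
    for k in range(n_splits):
        test_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(n_splits) if j != k])
        yield train_idx, test_idx

splits = list(kfold_indices(777))
train_sizes = [len(train) for train, _ in splits]
test_sizes = [len(test) for _, test in splits]
print(train_sizes)  # [621, 621, 622, 622, 622]
print(test_sizes)   # [156, 156, 155, 155, 155]
```

Each baseline model would be fit on `train_idx` and scored on `test_idx`, with the five accuracies averaged.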

TABLE I: Accuracy comparison against ML/DL models
Model                             $HV_{anom}$    $HV_{anom}$ + $p_{water}$
LSTMFCNClassifier                 84%            87%
FCNClassifier                     51%            43%
ResNetClassifier                  57%            45%
SimpleRNNClassifier               50%            49%
KNeighborsTimeSeriesClassifier    80%            75%
RPS-GMM                           85.46%         89.70%

The results, detailed in Table I, show that the RPS-GMM model outperforms all compared ML and DL models in terms of accuracy. For example, the highest accuracy among existing models using only the backscatter difference is 84%, by the LSTMFCNClassifier. When incorporating both features, most models experience decreased performance due to the added complexity and limited training sample size. The LSTMFCNClassifier shows a slight improvement, possibly due to its inherent ability to capture long-term temporal dependencies, achieving 87% accuracy compared to the RPS-GMM model’s 89.70%.

V-C Discussion

The superior performance of the RPS-GMM model, especially when including the water percentage feature, highlights the effectiveness of a comprehensive feature set for accurately classifying supraglacial lakes. Notably, the RPS-GMM model, trained on only one representative sample per class, outperforms models trained on 80% of the data in 5-fold cross-validation. This efficiency demonstrates the model’s robustness in capturing supraglacial lake dynamics with minimal training data. In contrast, existing ML and DL models showed varying effectiveness, with deep learning models generally underperforming, particularly when both features were included. This underperformance may be due to the complexity of the time series data and the relatively small dataset size, which might not be sufficient for effective training of deep learning models.

VI Conclusion

Identifying which supraglacial lakes refreeze, drain, or get buried is crucial for understanding their role in ice sheet dynamics and assessing their impact on global sea level rise. Accurate monitoring of these lakes helps evaluate meltwater dynamics and their implications for the Greenland Ice Sheet (GrIS) mass balance. This study presents a computationally efficient time series classification approach for supraglacial lakes using Reconstructed Phase Space (RPS) and Gaussian Mixture Models (GMM). By integrating time series data of backscatter difference ($HV_{anom}$) from Sentinel-1 and water percentage ($p_{water}$) from Sentinel-2, we demonstrated that incorporating multiple features significantly enhances classification accuracy.

Our results show that the RPS-GMM model, when incorporating both $HV_{anom}$ and $p_{water}$, achieved an accuracy of 89.70%, outperforming the model trained with only $HV_{anom}$, which attained 85.46%. This improvement emphasizes the value of combining diverse features to capture the complex dynamics of supraglacial lake evolution. Comparative analysis with established machine learning and deep learning models further underscores the robustness of the RPS-GMM model, which consistently exceeded the performance of these methods. Specifically, while the inclusion of $p_{water}$ improved the RPS-GMM model’s accuracy, the performance of other models declined due to increased complexity and insufficient training data.

The RPS-GMM model demonstrates significant computational efficiency by training on only a single representative sample per class, substantially reducing the computational burden compared to traditional machine learning and deep learning models, which require extensive datasets. This efficiency underscores the model’s effectiveness as a tool for monitoring and analyzing supraglacial lakes, which is crucial for understanding meltwater dynamics and their impact on the GrIS’s mass balance. To further enhance this methodology, future research could integrate additional features, such as temperature and surface elevation changes, and explore how these features may causally impact the evolution of the supraglacial lakes [40]. This could provide deeper insights into polar hydrology and help build more robust and interpretable machine learning models.

Acknowledgement

This work is supported by iHARP: NSF HDR Institute for Harnessing Data and Model Revolution in the Polar Regions (Award# 2118285). The views expressed in this work do not necessarily reflect the policies of the NSF, and endorsement by the Federal Government should not be inferred.

References

  • [1] B. Smith, H. A. Fricker, A. S. Gardner, B. Medley, J. Nilsson, F. S. Paolo, N. Holschuh, S. Adusumilli, K. Brunt, B. Csatho, K. Harbeck, T. Markus, T. Neumann, M. R. Siegfried, and H. J. Zwally, “Pervasive ice sheet mass loss reflects competing ocean and atmosphere processes,” Science, vol. 368, no. 6496, pp. 1239–1242, 2020. [Online]. Available: https://www.science.org/doi/abs/10.1126/science.aaz5845
  • [2] I. N. Otosaka, M. Horwath, R. Mottram, and S. Nowicki, “Mass balances of the antarctic and greenland ice sheets monitored from space,” Surveys in Geophysics, vol. 44, no. 5, pp. 1615–1652, 2023.
  • [3] M. R. van den Broeke, E. M. Enderlin, I. M. Howat, P. Kuipers Munneke, B. P. Y. Noël, W. J. van de Berg, E. van Meijgaard, and B. Wouters, “On the recent contribution of the greenland ice sheet to sea level change,” The Cryosphere, vol. 10, no. 5, pp. 1933–1946, 2016. [Online]. Available: https://tc.copernicus.org/articles/10/1933/2016/
  • [4] V. W. Chu, “Greenland ice sheet hydrology: A review,” Progress in Physical Geography, vol. 38, no. 1, pp. 19–54, 2014.
  • [5] H. Machguth, M. MacFerrin, D. van As, J. E. Box, C. Charalampidis, W. Colgan, R. S. Fausto, H. A. Meijer, E. Mosley-Thompson, and R. S. van de Wal, “Greenland meltwater storage in firn limited by near-surface ice formation,” Nature Climate Change, vol. 6, no. 4, pp. 390–393, 2016.
  • [6] M. MacFerrin, H. Machguth, D. v. As, C. Charalampidis, C. M. Stevens, A. Heilig, B. Vandecrux, P. L. Langen, R. Mottram, X. Fettweis et al., “Rapid expansion of greenland’s low-permeability ice slabs,” Nature, vol. 573, no. 7774, pp. 403–407, 2019.
  • [7] D. Dunmire, A. F. Banwell, N. Wever, J. Lenaerts, and R. T. Datta, “Contrasting regional variability of buried meltwater extent over 2 years across the greenland ice sheet,” The Cryosphere, vol. 15, no. 6, pp. 2983–3005, 2021.
  • [8] L. S. Koenig, D. Lampkin, L. Montgomery, S. Hamilton, J. Turrin, C. Joseph, S. Moutsafa, B. Panzer, K. Casey, J. D. Paden et al., “Wintertime storage of water in buried supraglacial lakes across the greenland ice sheet,” The Cryosphere, vol. 9, no. 4, pp. 1333–1342, 2015.
  • [9] S. B. Das, I. Joughin, M. D. Behn, I. M. Howat, M. A. King, D. Lizarralde, and M. P. Bhatia, “Fracture propagation to the base of the greenland ice sheet during supraglacial lake drainage,” Science, vol. 320, no. 5877, pp. 778–781, 2008.
  • [10] H. J. Zwally, W. Abdalati, T. Herring, K. Larson, J. Saba, and K. Steffen, “Surface melt-induced acceleration of greenland ice-sheet flow,” Science, vol. 297, no. 5579, pp. 218–222, 2002.
  • [11] M. Hoffman, G. Catania, T. Neumann, L. Andrews, and J. Rumrill, “Links between acceleration, melting, and supraglacial lake drainage of the western Greenland ice sheet,” Journal of Geophysical Research: Earth Surface, vol. 116, no. F4, 2011.
  • [12] National Snow and Ice Data Center (NSIDC), “Greenland surface melting in 2021,” https://nsidc.org/ice-sheets-today/analyses/greenland-surface-melting-2021, 2021, accessed: 2024-07-23.
  • [13] K. E. Miles, I. C. Willis, C. L. Benedek, A. G. Williamson, and M. Tedesco, “Toward monitoring surface and subsurface lakes on the Greenland ice sheet using Sentinel-1 SAR and Landsat-8 OLI imagery,” Frontiers in Earth Science, vol. 5, p. 251152, 2017.
  • [14] A. G. Williamson, A. F. Banwell, I. C. Willis, and N. S. Arnold, “Dual-satellite (Sentinel-2 and Landsat 8) remote sensing of supraglacial lakes in Greenland,” The Cryosphere, vol. 12, no. 9, pp. 3045–3065, 2018.
  • [15] J. Hu, H. Huang, Z. Chi, X. Cheng, Z. Wei, P. Chen, X. Xu, S. Qi, Y. Xu, and Y. Zheng, “Distribution and evolution of supraglacial lakes in Greenland during the 2016–2018 melt seasons,” Remote Sensing, vol. 14, no. 1, p. 55, 2021.
  • [16] M. Dømgaard, K. Kjeldsen, P. How, and A. Bjørk, “Altimetry-based ice-marginal lake water level changes in Greenland,” Communications Earth & Environment, vol. 5, no. 1, p. 365, 2024.
  • [17] C. L. Benedek and I. C. Willis, “Winter drainage of surface lakes on the Greenland ice sheet from Sentinel-1 SAR imagery,” The Cryosphere, vol. 15, no. 3, pp. 1587–1606, 2021.
  • [18] L. Schröder, N. Neckel, R. Zindler, and A. Humbert, “Perennial supraglacial lakes in northeast Greenland observed by polarimetric SAR,” Remote Sensing, vol. 12, no. 17, p. 2798, 2020.
  • [19] E. Rignot, K. Echelmeyer, and W. Krabill, “Penetration depth of interferometric synthetic-aperture radar signals in snow and ice,” Geophysical Research Letters, vol. 28, no. 18, pp. 3501–3504, 2001.
  • [20] L. C. Smith, V. W. Chu, K. Yang, C. J. Gleason, L. H. Pitcher, A. K. Rennermalm, C. J. Legleiter, A. E. Behar, B. T. Overstreet, S. E. Moustafa et al., “Efficient meltwater drainage through supraglacial streams and rivers on the southwest Greenland ice sheet,” Proceedings of the National Academy of Sciences, vol. 112, no. 4, pp. 1001–1006, 2015.
  • [21] A. Leeson, A. Shepherd, K. Briggs, I. Howat, X. Fettweis, M. Morlighem, and E. Rignot, “Supraglacial lakes on the Greenland ice sheet advance inland under warming climate,” Nature Climate Change, vol. 5, no. 1, pp. 51–55, 2015.
  • [22] A. J. Tedstone, P. W. Nienow, N. Gourmelen, A. Dehecq, D. Goldberg, and E. Hanna, “Decadal slowdown of a land-terminating sector of the Greenland ice sheet despite warming,” Nature, vol. 526, no. 7575, pp. 692–695, 2015.
  • [23] L. S. Koenig, A. Ivanoff, P. M. Alexander, J. A. MacGregor, X. Fettweis, B. Panzer, J. D. Paden, R. R. Forster, I. Das, J. R. McConnell et al., “Annual Greenland accumulation rates (2009–2012) from airborne snow radar,” The Cryosphere, vol. 10, no. 4, pp. 1739–1752, 2016.
  • [24] E. Rignot and P. Kanagaratnam, “Changes in the velocity structure of the Greenland ice sheet,” Science, vol. 311, no. 5763, pp. 986–990, 2006.
  • [25] A. F. Banwell, N. S. Arnold, I. C. Willis, M. Tedesco, and A. P. Ahlstrøm, “Modeling supraglacial water routing and lake filling on the Greenland ice sheet,” Journal of Geophysical Research: Earth Surface, vol. 117, no. F4, 2012.
  • [26] M. McMillan, A. Muir, A. Shepherd, R. Escolà, M. Roca, J. Aublanc, P. Thibaut, M. Restano, A. Ambrozio, and J. Benveniste, “Sentinel-3 delay-Doppler altimetry over Antarctica,” The Cryosphere, vol. 13, no. 2, pp. 709–722, 2019.
  • [27] P. Hochreuther, N. Neckel, N. Reimann, A. Humbert, and M. Braun, “Fully automated detection of supraglacial lake area for northeast Greenland using Sentinel-2 time-series,” Remote Sensing, vol. 13, no. 2, p. 205, 2021.
  • [28] F. Takens, “Detecting strange attractors in turbulence,” in Dynamical Systems and Turbulence, Warwick 1980: Proceedings of a Symposium Held at the University of Warwick 1979/80. Springer, 2006, pp. 366–381.
  • [29] R. J. Povinelli, M. T. Johnson, A. C. Lindgren, and J. Ye, “Time series classification using Gaussian mixture models of reconstructed phase spaces,” IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 6, pp. 779–783, 2004.
  • [30] H. Kantz and T. Schreiber, Nonlinear time series analysis. Cambridge University Press, 2003.
  • [31] D. A. Reynolds et al., “Gaussian mixture models,” Encyclopedia of Biometrics, pp. 659–663, 2009.
  • [32] T. K. Moon, “The expectation-maximization algorithm,” IEEE Signal Processing Magazine, vol. 13, no. 6, pp. 47–60, 1996.
  • [33] M. O. Gani, T. Fayezeen, R. J. Povinelli, R. O. Smith, M. Arif, A. J. Kattan, and S. I. Ahamed, “A light weight smartphone based human activity recognition system with high accuracy,” Journal of Network and Computer Applications, vol. 141, pp. 59–72, 2019.
  • [34] D. Dunmire, A. F. Banwell, N. Wever, J. T. Lenaerts, and R. T. Datta, “Contrasting regional variability of buried meltwater extent over two years across the Greenland Ice Sheet - data,” May 2021. [Online]. Available: https://doi.org/10.5281/zenodo.4813833
  • [35] D. Dunmire, A. Subramanian, E. Hossain, M. O. Gani, A. Banwell, H. Younas, and B. M. Myers, “Greenland ice sheet wide supraglacial lake evolution and dynamics: insights from the 2018 and 2019 melt seasons,” EarthArXiv preprint, 2024.
  • [36] A. Subramanian, D. Dunmire, E. Hossain, M. O. Gani, A. Banwell, and B. Myers, “The fate of Greenland ice sheet supraglacial lakes in a warm and cool year,” Copernicus Meetings, Tech. Rep., 2024.
  • [37] E. Rignot and J. Mouginot, “Ice flow in Greenland for the International Polar Year 2008–2009,” Geophysical Research Letters, vol. 39, no. 11, 2012.
  • [38] N. Gorelick, M. Hancher, M. Dixon, S. Ilyushchenko, D. Thau, and R. Moore, “Google Earth Engine: Planetary-scale geospatial analysis for everyone,” Remote Sensing of Environment, 2017.
  • [39] sktime, “sktime: Time series classification, regression, clustering & more,” 2024, accessed: 2024-07-30. [Online]. Available: https://www.sktime.net/en/v0.31.0/examples/02_classification.html
  • [40] E. Hossain, S. Ali, Y. Huang, N.-J. Schlegel, J. Wang, A. C. Subramanian, and M. O. Gani, “Incorporating causality with deep learning in predicting short-term and seasonal sea ice,” in 104th AMS Annual Meeting. AMS, 2024.