ROSES ID: NNH21ZDA001N-LWSTM Selection Year: 2021
Program Element: Data, Tools, & Methods
Principal Investigator: James Paul Mason
Affiliation(s): Johns Hopkins University
Project Member(s):
Jin, Meng Co-I SETI Institute
Cheung, Chun Ming Mark Collaborator Lockheed Martin Inc.
Summary:
Since the publication of "A Machine-learning Data Set Prepared from the NASA Solar Dynamics Observatory Mission" (herein called SDOML) in 2019, a new publication using this dataset has appeared every other month. The utility of having the data from all three instruments onboard SDO wrapped into one uniform, pre-cleaned package is clear. However, as key members of the team that originated this dataset, we have identified improvements that would further increase its utility, ease of access, and therefore science return. We will focus primarily on four improvements, resulting in the release of SDOMLv2. First, we will repackage all of the data into the .zarr format. This is a new, well-supported format specifically designed for handling large, multi-dimensional arrays, especially in cloud computing, and it is the preferred format for platforms such as Pangeo. This will enable faster manipulation of the data for users. Second, we will generate the synthetic SDO/EVE emission lines data product for the entire timespan of the dataset (2010-2021) at full cadence (6 minutes, the SDO/AIA cadence). The method for generating this product accepts SDO/AIA data as input and has already been demonstrated in case studies. SDO/EVE's 60-360 … channel ceased functioning in 2014. This synthetic data restoration will re-enable all of the scientific analyses that this channel previously afforded, for example the irradiance coronal dimming studies in which this proposal's PI is heavily invested. Third, we will include the full, cleaned SDO/EVE spectra in the dataset; SDOMLv1 includes only the extracted emission lines product. This will enable more detailed scientific analyses that require the full spectrum, such as the study of Doppler shifts during eruptions. Finally, we will build open-source tools and an example gallery to demonstrate access to, manipulation of, and some use cases for SDOMLv2, with emphasis on cloud computing.
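The speed benefit claimed for a chunked array format can be illustrated with a small sketch. The code below is not the Zarr library and not part of SDOML; it is a standard-library Python toy that assumes a hypothetical 16x16 "image", a hypothetical chunk size of 4, and hypothetical helper names (write_chunks, read_region). It only shows the underlying idea: when an array is stored as independent tiles under keys such as "0.2", reading a small region touches only the tiles that overlap it rather than the whole array.

```python
CHUNK = 4  # hypothetical chunk edge length, chosen only for illustration


def write_chunks(image, chunk=CHUNK):
    """Split a 2-D list-of-lists into square tiles keyed as 'row.col'."""
    store = {}
    n_rows, n_cols = len(image), len(image[0])
    for r0 in range(0, n_rows, chunk):
        for c0 in range(0, n_cols, chunk):
            tile = [row[c0:c0 + chunk] for row in image[r0:r0 + chunk]]
            store[f"{r0 // chunk}.{c0 // chunk}"] = tile
    return store


def read_region(store, rows, cols, chunk=CHUNK):
    """Return the requested region and the set of chunk keys that were read."""
    touched = set()
    region = []
    for r in rows:
        out_row = []
        for c in cols:
            key = f"{r // chunk}.{c // chunk}"  # tile holding pixel (r, c)
            touched.add(key)
            out_row.append(store[key][r % chunk][c % chunk])
        region.append(out_row)
    return region, touched


# A stand-in 16x16 array; the dict plays the role of a key-value chunk store.
image = [[r * 16 + c for c in range(16)] for r in range(16)]
store = write_chunks(image)

# Read rows 2-5 and columns 9-11: only 2 of the 16 tiles are needed.
region, touched = read_region(store, range(2, 6), range(9, 12))
print(sorted(touched), "of", len(store), "chunks read")
```

In a cloud setting each tile would typically be a separate object in remote storage, so a reader fetches only the tiles it needs, and separate workers can fetch different tiles in parallel.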