Thank you for your interest in SuperSynthIA! You can find the project website here. (https://rw3544.github.io/SuperSynthIA/) This file will guide you through the repo and explain how to use our code.
If you want us to produce data for you, please fill out this form.
| File | Description |
|---|---|
| demo.py | Produce predictions according to the config provided. Instructions for the config are here. |
| get_iquv.py | Download input FITS files from JSOC. Details here. |
| verify_data.py | Check whether any FITS files in the input data are missing or incorrectly named. Details here. |
| visualize.py | Visualize predictions saved as FITS files. |
| postprocessing.py | For Br, generate a mask of unreliable pixels and fix holes in predictions. Details here. |
| analytic_disambig.py | Analytically calculate |
| fuse.py | Fuse polar-region analytic predictions with direct ones. Details here. |
By default, SuperSynthIA outputs everything into fits files. To load them, use:
import astropy.io.fits as fits
# FITS_LOCATION is the path to an output file; the array is read from HDU 1
arr = (fits.open(FITS_LOCATION))[1].data
This repo contains an environment.yml file that contains all necessary packages. Please run the following code to build a compatible conda environment.
conda env create -f environment.yml
conda activate SYIA
If you run into conflicts, we also provide environment_compatible.yml, which lists versions for the essential packages only.
- The trained model weights can be downloaded here. (https://www.dropbox.com/scl/fi/0efn6xx8cwupyeejgtg7l/SYIA_ORIG_MODEL.tar.gz?rlkey=xfv7xu9ep4hbzz2zcxzzcvxb4&st=opew50xc&dl=0)
In our default setup, we recommend going to the root folder of this repository and running the following commands to download the model to ./Orig_Trained_Models
wget -O SYIA_ORIG_MODEL.tar.gz "https://www.dropbox.com/scl/fi/0efn6xx8cwupyeejgtg7l/SYIA_ORIG_MODEL.tar.gz?rlkey=xfv7xu9ep4hbzz2zcxzzcvxb4&st=2pu3yzpw&dl=1"
tar -xzf SYIA_ORIG_MODEL.tar.gz
- We have also released a model for Fill Factor here. (https://www.dropbox.com/scl/fo/w0ipykwubvkvbrcimmewr/AEqAIFC4eo4GGfqhnSJ3l-0?rlkey=ysatviwlsjm95hjpa00tzd3c9&e=1&st=l7wsyz8x&dl=0)
To use it, place the 'epoch=100.checkpoint.pth' file in the folder './Orig_Trained_Models/spInv_Stray_Light_Fill_Factor' and pull the latest version of our code to get the necessary normalization parameters.
- SuperSynthIA runs on the hmi.S_720s data series (level 1p IQUV data averaged at a cadence of 12 minutes) as inputs. One can download such data from JSOC. Here is an example request:
hmi.S_720s[2024.05.07_00:48:00-2024.05.20_23:48:00@1h]
We also provide a small dataset so you can test the code. After running the following commands, you can find the dataset in ./SMALL_DATASET
wget -O SYIA_SMALL_DATASET.tar.gz "https://www.dropbox.com/scl/fi/6fivy89cq8f5xcdrrf7ug/SYIA_SMALL_DATASET.tar.gz?rlkey=erj4gth5rru85eamiqr2aunnq&st=y8sw3uu7&dl=1"
tar -xzf SYIA_SMALL_DATASET.tar.gz
- We also provide a Python script, get_iquv.py, to make downloading data from JSOC easier. The steps are:
  1. Go to JSOC and fill in your request in `RecordSet` (ex: `hmi.S_720s[2024.05.07_00:48:00-2024.05.20_23:48:00@1h]`).
  2. After the request has been submitted, the website will provide a `RequestID` for your reference (ex: `JSOC_20240803_001234`). Wait a while for the request to be completed.
  3. When the data is ready, a link will appear in `Data Location` (ex: https://jsoc1.stanford.edu/SUM14/D1776011447/S00000/).
  4. Run `python get_iquv.py save_location link`, where `save_location` is the directory to store IQUV files and `link` is the URL obtained from JSOC in step 3 (ex: `python get_iquv.py ./Input_Data https://jsoc1.stanford.edu/SUM14/D1776011447/S00000/`).
- To ensure the completeness and correctness of your downloaded data, you can use verify_data.py. Simply modify the `directory` variable, and for each timestamp the script will verify the presence of all 24 I, Q, U, V FITS files and list any that are missing. It also checks that all FITS files are named correctly, like `hmi.S_720s.20100924_110000_TAI.1.U1.fits`. FITS files provided by JSOC are sometimes incorrectly named, for example `hmi.S_720s.202192.V5.fits`. If this occurs, verify_data.py offers an option to delete all the incorrectly named files; you can then request that data again through JSOC and download the correctly named ones.
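The checks described above can be sketched in a few lines. This is only an illustration, not the code in verify_data.py: the function name `check_filenames`, the regular expression, and the assumption that the 24 files per timestamp are I0-I5, Q0-Q5, U0-U5 and V0-V5 are all inferred from the example file names rather than taken from the script.

```python
import re
from collections import defaultdict

# Assumed naming convention, inferred from the example file name above:
# hmi.S_720s.<YYYYMMDD>_<HHMMSS>_TAI.<n>.<Stokes><index>.fits
NAME_RE = re.compile(r"^hmi\.S_720s\.(\d{8}_\d{6})_TAI\.\d+\.([IQUV][0-5])\.fits$")

# Assumption: the 24 files per timestamp are I0-I5, Q0-Q5, U0-U5, V0-V5.
EXPECTED = {f"{stokes}{index}" for stokes in "IQUV" for index in range(6)}


def check_filenames(filenames):
    """Return (missing, misnamed) for a list of FITS file names.

    missing maps each timestamp to the sorted list of absent segments;
    misnamed lists the names that do not match the expected pattern.
    """
    seen = defaultdict(set)
    misnamed = []
    for name in filenames:
        match = NAME_RE.match(name)
        if match is None:
            misnamed.append(name)
        else:
            timestamp, segment = match.groups()
            seen[timestamp].add(segment)
    missing = {}
    for timestamp, segments in seen.items():
        absent = sorted(EXPECTED - segments)
        if absent:
            missing[timestamp] = absent
    return missing, misnamed
```

In practice the list of names could come from `os.listdir` on the download directory. Note that a timestamp with no correctly named files at all cannot be detected this way.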
Please note that JSOC advises against multi-threaded downloads, as this may cause your connection to drop and result in blank files. It is recommended to run one download process at a time.
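If you prepare many requests, it can help to generate the RecordSet string programmatically. The helper below is a hypothetical convenience (it is not part of this repo) that formats a request in the same shape as the example request shown earlier.

```python
from datetime import datetime


def build_recordset(start, end, cadence="1h", series="hmi.S_720s"):
    """Format a JSOC RecordSet string in the same shape as the example request."""
    if end < start:
        raise ValueError("end must not precede start")
    fmt = "%Y.%m.%d_%H:%M:%S"
    return f"{series}[{start.strftime(fmt)}-{end.strftime(fmt)}@{cadence}]"


print(build_recordset(datetime(2024, 5, 7, 0, 48), datetime(2024, 5, 20, 23, 48)))
# hmi.S_720s[2024.05.07_00:48:00-2024.05.20_23:48:00@1h]
```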
After you have set up the conda environment and downloaded the model weights:
- Check demo.json in the ./configs folder as an example. Change the corresponding parameters to build your own JSON config file.
- In demo.py, change `config_location` to the location of your config file. Then run `python demo.py`.
For a detailed explanation of each parameter in the config file, click here.
Note that all predictions are saved as FITS files. To load them, use:
import astropy.io.fits as fits
# FITS_LOCATION is the path to a prediction file; the array is read from HDU 1
arr = (fits.open(FITS_LOCATION))[1].data
SuperSynthIA can occasionally produce small holes in
1. Generate a mask marking the pixels in Br that the model is uncertain about.
2. Use thin-plate spline interpolation to fix the holes detected in (1), saving the outputs to a separate directory.
3. Same as (2), but overwrite the predictions with the corrected ones.
SuperSynthIA supports analytic method to produce disambiguated components (
As discussed in Section 4.4 of the paper, the direct and analytic methods each have their own trade-offs. The analytic method's predictions more closely resemble the Hinode pipeline results in regions of weak polarization, particularly at the poles. Therefore, we provide fuse.py, which merges the polar region predictions of the analytic method with predictions of the direct method for those who are interested.
You can find the SuperSynthIA training code in the helpers/training_code directory. We will also be releasing our training dataset (~1TB) soon. Please note that the training code is not actively maintained, as we are currently developing an improved version that we aim to release in the near future.
- This issue is typically caused by the `CHUNK_SIZE` parameter in full_disk_utils.py being set too high. For an A40 GPU with 48GB of memory, we default this value to 1024 to optimize processing speed. For GPUs with less memory, a `CHUNK_SIZE` of 256 is recommended. Please ensure that `CHUNK_SIZE` is a multiple of 64 and that 4096 is divisible by `CHUNK_SIZE`.
- BINS/ for the bins to be used for classification for each output
- Norm_params/ for the normalization parameters for each output
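The two constraints on `CHUNK_SIZE` stated above (a multiple of 64 that divides 4096) leave only a handful of valid values. The check below is a small illustrative helper, not code from full_disk_utils.py; the function name and default arguments are assumptions.

```python
def is_valid_chunk_size(chunk_size, full_size=4096, step=64):
    """Check that chunk_size is a positive multiple of step that divides full_size."""
    return chunk_size > 0 and chunk_size % step == 0 and full_size % chunk_size == 0


valid = [c for c in range(64, 4097, 64) if is_valid_chunk_size(c)]
print(valid)  # [64, 128, 256, 512, 1024, 2048, 4096]
```

A value such as 192 is a multiple of 64 but does not divide 4096, so it fails the check.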
We employ logit dithering to introduce noise into the log-probabilities predicted by the model, which can cause slight variations in predictions for the same input. To ensure reproducibility, we use the timestamp and patch location as keys to calculate a seed for generating deterministic noise.
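To illustrate the idea of keyed, deterministic noise, here is a minimal sketch using only the Python standard library. It is not the implementation used in this repo: the hashing scheme, the function names `patch_seed` and `triangular_noise`, and the noise range of [-1, 1] are assumptions made for the example.

```python
import hashlib
import random


def patch_seed(timestamp, row, col):
    """Derive a stable integer seed from a timestamp and a patch location."""
    key = f"{timestamp}:{row}:{col}".encode("utf-8")
    # SHA-256 is stable across runs and machines, unlike Python's built-in hash().
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")


def triangular_noise(timestamp, row, col, size=64):
    """Return a size x size grid of triangular noise in [-1, 1] with mode 0."""
    rng = random.Random(patch_seed(timestamp, row, col))
    return [[rng.triangular(-1.0, 1.0, 0.0) for _ in range(size)] for _ in range(size)]


first = triangular_noise("20100924_110000", 0, 0)
second = triangular_noise("20100924_110000", 0, 0)
print(first == second)  # True: the same key always yields the same noise
```

Because the seed depends only on the key, rerunning the same input reproduces the same noise, while a different timestamp or patch location gives a different pattern.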
4. How is the mask in postprocessing calculated?
Given the SuperSynthIA predictions for continuum_process_file in full_disk_utils.py.
5. Why do some FITS files downloaded from JSOC have abnormal file names?
Occasionally, IQUV FITS files downloaded from JSOC may have incorrect filenames, such as hmi.S_720s.202192.V5.fits, instead of the correct format like hmi.S_720s.20100924_110000_TAI.1.U1.fits. If this happens, you can use verify_data.py to identify and remove all improperly named files. Afterward, request the data again to download the correctly named files.
-
MODEL_LOCATION
- Description: Path to the directory containing the trained models.
- Example:
"MODEL_LOCATION": "./Orig_Trained_Models"
-
GLOBAL_DEVICE
- Description: Device to be used for computation (e.g., "cuda" or "cpu").
- Example:
"GLOBAL_DEVICE": "cuda"
-
output_name_list
- Description: List of output names to be generated. Choose from:
  - "spDisambig_Bp"
  - "spDisambig_Br"
  - "spDisambig_Bt"
  - "spDisambig_Field_Azimuth_Disamb"
  - "spInv_aB"
  - "spInv_Field_Azimuth"
  - "spInv_Field_Inclination"
- Example:
"output_name_list": [ "spDisambig_Bp", "spDisambig_Br", "spDisambig_Bt", "spInv_aB", "spInv_Field_Inclination", "spDisambig_Field_Azimuth_Disamb" ]
-
IQUV_DATA_DIR
- Description: Directory containing the input data (IQUV fits files from hmi720s, can be downloaded from JSOC).
- Example:
"IQUV_DATA_DIR": "./SMALL_DATASET"
-
OUTPUT_DIR
- Description: Directory where the output will be saved.
- Example:
"OUTPUT_DIR": "./Predictions"
-
reproducible
- Description: Whether to make the process reproducible using a fixed 64x64 triangular noise for dithering. This guarantees that the model produces the same results as long as the environment and `CHUNK_SIZE` in full_disk_utils.py are fixed. The noise also depends on the timestamp extracted from the input FITS file name. For a more detailed explanation, check here.
- Example:
"reproducible": true
-
save_std
- Description: Whether to save the per-pixel standard deviation.
- Example:
"save_std": false
-
save_CI
- Description: Whether to save the per-pixel confidence intervals (90%).
- Example:
"save_CI": false
-
save_orig_logit
- Description: Whether to save the original logits produced by the model, which can be used to calculate the probability of each bin (about 5 GB per input).
- Example:
"save_orig_logit": false
-
use_parallel
- Description: Whether to use parallel processing.
- Example:
"use_parallel": true
-
max_parallel_workers
- Description: Maximum number of parallel workers. It is recommended to set this to the number of CPUs on your machine and to reduce it if you encounter memory issues.
- Example:
"max_parallel_workers": 8
-
postprocessing_mode
- Description: Postprocessing mode. Choose from:
  - "none"
  - "generate_mask"
  - "fix_hole_overwrite"
  - "fix_hole_save_to_new"
- Note: For producing aB/Inclination/Azimuth only, set to "none" to avoid any issues.
- Example:
"postprocessing_mode": "none"
Here is an example configuration file:
{
"MODEL_LOCATION": "./Orig_Trained_Models",
"GLOBAL_DEVICE": "cuda",
"output_name_list": ["spDisambig_Bp", "spDisambig_Br", "spDisambig_Bt", "spInv_aB", "spInv_Field_Inclination", "spDisambig_Field_Azimuth_Disamb"],
"IQUV_DATA_DIR": "./SMALL_DATASET",
"OUTPUT_DIR": "./Predictions",
"reproducible": true,
"save_std": false,
"save_CI": false,
"save_orig_logit": false,
"use_parallel": true,
"max_parallel_workers": 8,
"postprocessing_mode": "none"
}
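Before launching a long run, it can be useful to sanity-check a config file against the options listed above. The following is a hypothetical helper, not part of demo.py: the names `validate_config` and `load_config` are illustrative, and treating every documented key as required is an assumption.

```python
import json

# Allowed values copied from the parameter descriptions above.
OUTPUT_NAMES = {
    "spDisambig_Bp", "spDisambig_Br", "spDisambig_Bt",
    "spDisambig_Field_Azimuth_Disamb",
    "spInv_aB", "spInv_Field_Azimuth", "spInv_Field_Inclination",
}
POSTPROCESSING_MODES = {"none", "generate_mask", "fix_hole_overwrite", "fix_hole_save_to_new"}
# Assumption: every documented key is treated as required.
REQUIRED_KEYS = {
    "MODEL_LOCATION", "GLOBAL_DEVICE", "output_name_list", "IQUV_DATA_DIR",
    "OUTPUT_DIR", "reproducible", "save_std", "save_CI", "save_orig_logit",
    "use_parallel", "max_parallel_workers", "postprocessing_mode",
}


def validate_config(config):
    """Return a list of problems found in a config dict (empty if none)."""
    problems = []
    for key in sorted(REQUIRED_KEYS - set(config)):
        problems.append(f"missing key: {key}")
    for name in config.get("output_name_list", []):
        if name not in OUTPUT_NAMES:
            problems.append(f"unknown output name: {name}")
    mode = config.get("postprocessing_mode")
    if mode is not None and mode not in POSTPROCESSING_MODES:
        problems.append(f"unknown postprocessing_mode: {mode}")
    return problems


def load_config(path):
    """Read a JSON config file into a dict."""
    with open(path) as handle:
        return json.load(handle)
```

For example, `validate_config(load_config("./configs/demo.json"))` would return an empty list for a config that uses only the documented keys and values.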