A versatile framework for analyzing galaxy image data by incorporating Human-in-the-loop in a large vision model

Ming-Xiang Fu; Yu Song; Jia-Meng Lv; Liang Cao; Peng Jia; Nan Li; Xiang-Ru Li; Ji-Feng Liu; A-Li Luo; Bo Qiu; Shi-Yin Shen; Liang-Ping Tu; Li-Li Wang; Shou-Lin Wei; Hai-Feng Yang; Zhen-Ping Yi; Zhi-Qiang Zou

doi:10.1088/1674-1137/ad50ab

Chinese Physics C> 2024, Vol. 48> Issue(9) : 095001 DOI: 10.1088/1674-1137/ad50ab CSTR: 32044.14.1674-1137/ad50ab

A versatile framework for analyzing galaxy image data by incorporating Human-in-the-loop in a large vision model

Ming-Xiang Fu ^1,3,4,1, ,
Yu Song ^2,1 ,
Jia-Meng Lv ^2,1 ,
Liang Cao ² ,
Peng Jia ^2,, ,
Nan Li ^{1,3,4,,

,} ,
Xiang-Ru Li ⁵ ,
Ji-Feng Liu ^1,4 ,
A-Li Luo ^3,6,7 ,
Bo Qiu ⁸ ,
Shi-Yin Shen ⁹ ,
Liang-Ping Tu ¹⁵ ,
Li-Li Wang ¹⁰ ,
Shou-Lin Wei ¹¹ ,
Hai-Feng Yang ¹² ,
Zhen-Ping Yi ¹³ ,
Zhi-Qiang Zou ^7,14

1.
National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China
2.
College of Electronic Information and Optical Engineering, Taiyuan University of Technology, Taiyuan 030024, China
3.
School of Astronomy and Space Science, University of Chinese Academy of Sciences, Beijing 101408, China
4.
Key lab of Space Astronomy and Technology, National Astronomical Observatories, Beijing 100101, China
5.
School of Computer Science, South China Normal University, Guangzhou 510631, China
6.
CAS Key Laboratory of Optical Astronomy, National Astronomical Observatories, Beijing 100101, China
7.
University of Chinese Academy of Sciences, Nanjing, Nanjing 211135, China
8.
University of Science and Technology Beijing, Beijing 100083, China
9.
Shanghai Astronomical Observatory, Chinese Academy of Sciences, Shanghai 200030, China
10.
School of Computer and Information, Dezhou University, Dezhou 253023, China
11.
Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
12.
School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 30024, China
13.
School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai 264209, China
14.
Nanjing University of Posts & Telecommunications, Nanjing 210023, China
15.
School of Science, University of Science and Technology Liaoning, Anshan 114051, China

Abstract
HTML
Reference
Related

PDF

Abstract：
The exponential growth of astronomical datasets provides an unprecedented opportunity for humans to gain insight into the Universe. However, effectively analyzing this vast amount of data poses a significant challenge. In response, astronomers are turning to deep learning techniques, but these methods are limited by their specific training sets, leading to considerable duplicate workloads. To overcome this issue, we built a framework for the general analysis of galaxy images based on a large vision model (LVM) plus downstream tasks (DST), including galaxy morphological classification, image restoration, object detection, parameter extraction, and more. Considering the low signal-to-noise ratios of galaxy images and the imbalanced distribution of galaxy categories, we designed our LVM to incorporate a Human-in-the-loop (HITL) module, which leverages human knowledge to enhance the reliability and interpretability of processing galaxy images interactively. The proposed framework exhibits notable few-shot learning capabilities and versatile adaptability for all the abovementioned tasks on galaxy images in the DESI Legacy Imaging Surveys. In particular, for the object detection task, which was trained using 1000 data points, our DST in the LVM achieved an accuracy of 96.7%, while ResNet50 plus Mask R-CNN reached an accuracy of 93.1%. For morphological classification, to obtain an area under the curve (AUC) of ~0.9, LVM plus DST and HITL only requested 1/50 of the training sets that ResNet18 requested. In addition, multimodal data can be integrated, which creates possibilities for conducting joint analyses with datasets spanning diverse domains in the era of multi-messenger astronomy.
- artificial intelligence ,
- large vision model ,
- human-in-the-loop ,
- astronomy ,
- galaxies

References

[1]	M. Agarwal, J. Alameda, J. Audenaert et al., (2023), arXiv: 2306.08106
[2]	S. Li and A. Adelmann, Phys. Rev. Accel. Beams 26, 024801 (2023) doi: 10.1103/PhysRevAccelBeams.26.024801
[3]	A. Boehnlein, M. Diefenthaler, N. Sato et al., Rev. Mod. Phys. 94, 031003 (2022) doi: 10.1103/RevModPhys.94.031003
[4]	R. Suresh, H. Bishnoi, A. V. Kuklin et al., Frontiers in Physics 12, 1322162 (2024) doi: 10.3389/fphy.2024.1322162
[5]	Y. Zhang and Y. Zhao, Data Science Journal 14, 11 (2015) doi: 10.5334/dsj-2015-011
[6]	M. Huertas-Company and F. Lanusse, Publications of the Astronomical Society of Australia 40, (2023), arXiv:2210.01813 doi: 10.1017/pasa.2022.55
[7]	B. Lao, T. An, A. Wang et al., Science bulletin 66, 2145 (2021) doi: 10.1016/j.scib.2021.07.015
[8]	M. Banerji, O. Lahav, C. J. Lintott et al., Monthly Notices of the Royal Astronomical Society 406, 342 (2010) doi: 10.1111/j.1365-2966.2010.16713.x
[9]	C. Wu, O. I. Wong, L. Rudnick et al., Monthly Notices of the Royal Astronomical Society 482, 1211 (2019) doi: 10.1093/mnras/sty2646
[10]	Y. B. Li, A. L. Luo, C. D. Du et al., The Astrophysical Journal Supplement Series 234, 31 (2018) doi: 10.3847/1538-4365/aaa415
[11]	J. Xu, Q. Yin, P. Guo et al., Monthly Notices of the Royal Astronomical Society 499, 1972 (2020) doi: 10.1093/mnras/staa2883
[12]	G. Martin, S. Kaviraj, A. Hocking et al., Monthly Notices of the Royal Astronomical Society 491, 1408 (2020) doi: 10.1093/mnras/stz3006
[13]	C. Logan and S. Fotopoulou, Astronomy & Astrophysics 633, A154 (2020), arXiv:1911.05107 doi: 10.1051/0004-6361/201936648
[14]	Q. Xu, S. Shen, R. S. de Souza et al., Monthly Notices of the Royal Astronomical Society 526, 6391 (2023) doi: 10.1093/mnras/stad3181
[15]	M. A. Hayat, G. Stein, P. Harrington et al., The Astrophysical Journal Letters 911, L33 (2021), arXiv:2012.13083 doi: 10.3847/2041-8213/abf2c7
[16]	P. Jia, R. Sun, W. Wang et al., Monthly Notices of the Royal Astronomical Society 470, 1950 (2017) doi: 10.1093/mnras/stx1336
[17]	W. Wang, P. Jia, D. Cai et al., Monthly Notices of the Royal Astronomical Society 478, 5671 (2018) doi: 10.1093/mnras/sty1504
[18]	S. Ni, Y. Li, L. Y. Gao et al., The Astrophysical Journal 934, 83 (2022), arXiv:2204.02780 doi: 10.3847/1538-4357/ac7a34
[19]	L. Y. Gao, Y. Li, S. Ni et al., Monthly Notices of the Royal Astronomical Society 525, 5278 (2023), arXiv:2212.08773 doi: 10.1093/mnras/stad2646
[20]	P. Jia, Q. Jia, T. Jiang et al., The Astronomical Journal 165, 233 (2023) doi: 10.3847/1538-3881/accceb
[21]	P. Jia, Q. Jia, T. Jiang et al., Astronomy and Computing 100732 (2023) doi: 10.1016/j.ascom.2023.100732
[22]	Z. Liu, Y. Lin, Y. Cao et al., in Proceedings of the IEEE/CVF international conference on computer vision, 10012–10022 (2021)
[23]	G. Stein, P. Harrington, J. Blaum et al., (2021), arXiv: 2110.13151
[24]	A. Dey, D. J. Schlegel, D. Lang et al., The Astronomical Journal 157, 168 (2019) doi: 10.3847/1538-3881/ab089d
[25]	M. Grinberg, Flask web development: developing web applications with python, (O’Reilly Media, Inc. 2018)
[26]	Y. LeCun, Y. Bengio, and G. Hinton, Nature 521, 436 (2015) doi: 10.1038/nature14539
[27]	J. Devlin, M. W. Chang, K. Lee et al., (2018), arXiv: 1810.04805
[28]	Z. Dai, Z. Yang, Y. Yang et al., (2019), arXiv: 1901.02860
[29]	T. Brown, B. Mann, N. Ryder et al., Advances in neural information processing systems 33, 1877 (2020) doi: 10.5555/3495724.3495883
[30]	H. Touvron, T. Lavril, G. Izacard et al., (2023), arXiv: 2302.13971
[31]	A. Kirillov, E. Mintun, N. Ravi et al., (2023), arXiv: 2304.02643
[32]	F. Lanusse, L. Parker, S. Golkar et al., (2023), arXiv: 2310.03024
[33]	G. Stein, J. Blaum, P. Harrington et al., The Astrophysical Journal 932, 107 (2022), arXiv:2110.00023 doi: 10.3847/1538-4357/ac6d63
[34]	C. M. Fan, T. J. Liu, and K. H. Liu, in 2022 IEEE International Symposium on Circuits and Systems (ISCAS), 2333–2337, (IEEE2022)
[35]	K. H. R. Chan, Y. Yu, C. You et al., J. Mach. Learn. Res. 23, 1 (2022)
[36]	K. He, X. Chen, S. Xie et al., Masked autoencoders are scalable vision learners (2021), arXiv: 2111.06377
[37]	Z. Xie, Z. Zhang, Y. Cao et al. (2022), arXiv: 2111.09886
[38]	G. Bradski, Dr. Dobb’s Journal of Software Tools (2000).
[39]	X. P. Zhu, J. M. Dai, C. J. Bian et al., Astrophysics and Space Science 364, 1 (2019) doi: 10.1007/s10509-018-3489-5
[40]	K. Schawinski, C. Zhang, H. Zhang et al., Monthly Notices of the Royal Astronomical Society: Letters 467, L110 (2017) doi: 10.1093/mnrasl/slx008
[41]	M. Walmsley, C. Lintott, T. Géron et al., Monthly Notices of the Royal Astronomical Society 509, 3966 (2022), arXiv:2102.08414 doi: 10.1093/mnras/stab2093
[42]	A. Krizhevsky, I. Sutskever, and G. E. Hinton, Communications of the ACM 60, 84 (2017) doi: 10.1145/3065386
[43]	K. Simonyan and A. Zisserman, (2014), arXiv: 1409.1556
[44]	K. He, X. Zhang, S. Ren, and J. Sun, in Proceedings of the IEEE conference on computer vision and pattern recognition, 770–778 (2016)
[45]	K. He, G. Gkioxari, P. Dollár et al., in Proceedings of the IEEE international conference on computer vision, 2961–2969 (2017)
[46]	X. Wu, L. Xiao, Y. Sun et al., Future Generation Computer Systems 135, 364 (2022) doi: 10.1016/j.future.2022.05.014
[47]	D. E. Rumelhart and J. L. McClelland, Learning Internal Representations by Error Propagation, 318–362 (1987)
[48]	Z. Zhang, Z. Zou, N. Li et al., Research in Astronomy and Astrophysics 22, 055002 (2022), arXiv:2202.08172 doi: 10.1088/1674-4527/ac5732
[49]	C. Hahn, M. J. Wilson, O. Ruiz-Macias et al., The Astronomical Journal 165 , 253 (2023), arXiv: 2208.08512
[50]	Y. Du, L. Cui, X. Guan et al., Physics 53 , 147 (2024)
[51]	P. Martinez-Azcona, A. Kundu, A. del Campo et al., Phys. Rev. Lett. 131, 160202 (2023) doi: 10.1103/PhysRevLett.131.160202

Access

Figures(10) / Tables(4)

Get Citation

Ming-Xiang Fu, Yu Song, Jia-Meng Lv, Liang Cao, Peng Jia, Nan Li, Xiang-Ru Li, Ji-Feng Liu, A-Li Luo, Bo Qiu, Shi-Yin Shen, Liang-Ping Tu, Li-Li Wang, Shou-Lin Wei, Hai-Feng Yang, Zhen-Ping Yi and Zhi-Qiang Zou. A Versatile Framework for Analyzing Galaxy Image Data by Implanting Human-in-the-loop on a Large Vision Model[J]. Chinese Physics C. doi: 10.1088/1674-1137/ad50ab

RIS(for EndNote,Reference Manager,ProCite)

BibTex

Txt

Milestone

Received: 2024-03-13

Article Metric

Article Views(4575)
PDF Downloads(44)
Cited by(0)

Policy on re-use

To reuse of subscription content published by CPC, the users need to request permission from CPC, unless the content was published under an Open Access license which automatically permits that type of reuse.

通讯作者: 陈斌, bchen63@163.com

1.
沈阳化工大学材料科学与工程学院沈阳 110142

HTML

I. INTRODUCTION

A vast expansion of available data is an invaluable resource across various scientific disciplines, notably physics and astronomy, because it offers many opportunities and challenges for human beings to understand the universe. Artificial intelligence (AI) techniques have emerged as a leading approach for comprehending the complexities intrinsic to big data challenges in physics, such as interpreting data collected from large-scale sky surveys, gravitational wave detectors, and colliders. These datasets are more than an order of magnitude larger in size than previous datasets, but require a shorter processing time to promptly respond to transient events [1]. They have also supported significant successes, such as the prediction of multivariate time series data drawn from particle accelerators [2], the execution of many-body variational calculations in nuclear physics [3], and many other accomplishments in experimental and theoretical physics (see Ref. [4] and references therein).

The big data challenge in astronomy and astrophysics is especially important [5] because large-scale sky surveys such as LSST¹, Euclid², CSST³, and SKA⁴ continue to gather data, leading astronomy and astrophysics into an exciting new era. However, the vast and intricate nature of astronomical datasets poses a significant challenge to astronomers who want to extract meaningful scientific information. Deep learning techniques have been used to address this difficulty (see [6] and references therein). For example, astronomers have leveraged specific data in supervised learning to teach computers how to solve problems, which has been successful in detecting celestial objects [7], classifying their morphology [8, 9] and identifying their spectra [10, 11]. In addition, unsupervised learning algorithms can explore unlabeled data and have demonstrated their effectiveness in classifying galaxy types [12−15] and in characterizing (or improving) the performance of telescopes [16−19]. Furthermore, reinforcement learning algorithms have succeeded in various applications, such as efficiently managing instruments via developing simulators and enabling interactions with observations [20, 21].

However, for the machine learning-based applications discussed above, certain issues still need to be addressed, including interpretability, data labeling, and universality. Persistent issues that hinder their advancement and utility require preparing separate training sets and constructing distinct models for different tasks. Despite this, various tasks may share a common foundation of prior information about celestial objects. For example, tasks such as detecting strong gravitational lensing systems, identifying different types of nebulae or galaxies, and segmenting galaxies share the same need for multi-color structural features. Therefore, creating a foundational model that provides general information and attaches subprocesses for multiple purposes is sensible. Moreover, effectively training a machine learning algorithm typically requires thousands of data units, further exacerbating matters, as obtaining specific data and labels (e.g., the positions of rare astronomical targets or segmentation labels for galaxies) is complex. Therefore, an interactive technique is ideal for building training sets from scratch and maintaining their development.

To overcome the abovementioned shortcomings of existing applications of deep learning to astronomical vision tasks, especially galaxy image processing tasks, we have developed a comprehensive framework containing a foundational model, multiple machine learning models for downstream tasks, and a human-in-the-loop (HITL) interface. The foundational model is based on the Swin-Transformer model [22], and the galaxy images from the ssl-legacysurvey project [23], which contains 76 million galaxy images extracted from the Dark Energy Spectroscopic Instrument (DESI) Legacy Survey [24] Data Release 9, are selected as pre-training data. Covering 14000 square degrees of extragalactic sky in three optical bands (g, r, z), these data constitute a relatively complete description of galaxies in the nearby universe. Different neural networks are then attached to the trained model for downstream tasks, including classification, image restoration, and outlier detection. The model requires far fewer training samples than the current supervised learning algorithms and is suitable for various purposes. To further enhance the performance of the model, a HITL module based on the FLASK web framework [25] is connected to our framework. This module takes advantage of human knowledge to further decrease the workload of data labeling and to improve the reliability, universality, and interpretability of the framework for different image processing tasks.

II. A FOUNDATIONAL VISION MODEL FOR ASTRONOMY

III. TRAINING THE LARGE VISION MODEL FOR MULTIPLE DOWNSTREAM TASKS

IV. DEPLOYMENT OF TWO SAMPLE APPLICATIONS

V. LARGE VISION MODEL WITH THE HUMAN-IN-THE-LOOP MODULE

VI. SUMMARY

In this study, we created a framework that utilized a HITL module on top of an LVM for various astronomical vision tasks. The downstream neural networks, combined with the LVM, allowed for versatility without the need for expensive re-training. Furthermore, the HITL module incorporated human knowledge to guide the AI model toward specific objectives, which reduced the workload for composing training sets and enhanced the framework's universality and interpretability. The experiments showed that our framework outperformed traditional supervised machine learning models in classical vision tasks in astronomy, such as object detection, galaxy morphological classification, and observational image reconstruction. Considering that the reliability of AI models in handling scientific data is crucial for valid discoveries [6, 50, 51], we evaluated our framework's reliability through different experiments using labels in the Galaxy Zoo 2 datasets. However, for data in the bands other than g, r, and z and those provided by space-borne telescopes, Galaxy Zoo 2 was insufficient. Therefore, to assess the framework's reliability in a broader context in the future, we are planning on constructing a standard dataset of galaxy images covering a larger feature space from various observations. Using the transfer learning strategy, we plan on extending the framework to encompass various data modalities, including photometry, spectra, and lightcurves. This will lead to a continually evolving AI model that can proficiently handle intricate datasets from a variety of major observing projects (e.g., DESI, LSST, Euclid, and CSST), which is crucial in the age of multi-messenger astronomy.

ACKNOWLEDGMENTS

The authors thank George Stein for publicly sharing the cutouts of galaxy images from DESI Legacy Surveys online.

The DESI Legacy Imaging Surveys consist of three individual and complementary projects: the Dark Energy Camera Legacy Survey (DECaLS), the Beijing-Arizona Sky Survey (BASS), and the Mayall z-band Legacy Survey (MzLS). DECaLS, BASS and MzLS together include data obtained, respectively, at the Blanco telescope, Cerro Tololo Inter-American Observatory, NSF’s NOIRLab; the Bok telescope, Steward Observatory, University of Arizona; and the Mayall telescope, Kitt Peak National Observatory, NOIRLab. NOIRLab is operated by the Association of Universities for Research in Astronomy (AURA) under a cooperative agreement with the National Science Foundation. Pipeline processing and analyses of the data were supported by NOIRLab and the Lawrence Berkeley National Laboratory (LBNL). Legacy Surveys also uses data products from the Near-Earth Object Wide-field Infrared Survey Explorer (NEOWISE), a project of the Jet Propulsion Laboratory/California Institute of Technology, funded by the National Aeronautics and Space Administration. Legacy Surveys was supported by: the Director, Office of Science, Office of High Energy Physics of the U.S. Department of Energy; the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility; the U.S. National Science Foundation, Division of Astronomical Sciences; the National Astronomical Observatories of China, the Chinese Academy of Sciences and the Chinese National Natural Science Foundation. LBNL is managed by the Regents of the University of California under contract to the U.S. Department of Energy. The complete acknowledgments can be found here¹⁰.

Reference (51)

Classification Task	Acc	Pre	Recall	F1
Pre_train model	0.784	0.767	0.794	0.771
Multi_uniform model	0.842	0.844	0.850	0.846
Multi_activate model	0.854	0.823	0.834	0.844

Restoration Task	MSE	PSNR	SSIM
Blurred images	0.00094	31.11	0.48
Pre_train model	0.00084	31.31	0.51
Multi_uniform model	0.00083	31.35	0.54
Multi_activate model	0.00049	33.34	0.56

Reconstruction Task	MSE	PSNR	SSIM
Masked images	0.0248	15.64	0.36
Pre_train model	0.0089	22.36	0.49
Multi_uniform model	0.0040	26.07	0.61
Multi_activate model	0.0038	26.84	0.64

Patch size	MSE	PSNR	SSIM
$ 4\times4 $ (Masked images)	0.03187	15.92	0.27
Multi_activate model	0.0067	26.54	0.58
$ 8\times8 $ (Masked images)	0.023	17.55	0.36
Multi_activate model	0.0049	24.81	0.55
$ 16\times16 $ (Masked images)	0.03149	17.47	0.42
Multi_activate model	0.0112	23.61	0.48

A versatile framework for analyzing galaxy image data by incorporating Human-in-the-loop in a large vision model

Abstract：

References

Access

Article Metrics

Metrics

通讯作者: 陈斌, bchen63@163.com

Email This Article

A versatile framework for analyzing galaxy image data by incorporating Human-in-the-loop in a large vision model

Corresponding author: Peng Jia, robinmartin20@gmail.com

Corresponding author: Nan Li, nan.li@nao.cas.cn

HTML

A. Design of the large vision model

B. Pre-training of the large vision model

A. Training of multiple downstream tasks

B. Performance evaluation

A. Classifying galaxy morphology with few-shot learning based on LVM

B. Identifying strong lensing systems with the LVM + Mask R-CNN

A. Design of the human-in-the-loop module

B. Comparison to conventional models of supervised learning

C. Discovering targets in the DESI Bright Galaxy Survey using LVM + HITL

目录