2022A&A...666A.122M



2022A&A...666A.122M - Astronomy and Astrophysics, volume 666A, 122 (2022/10-1)

A machine-learning photometric classifier for massive stars in nearby galaxies I. The method.

MARAVELIAS G., BONANOS A.Z., TRAMPER F., DE WIT S., YANG M. and BONFINI P.

Abstract (from CDS):


Context. Mass loss is a key parameter in the evolution of massive stars. Despite the recent progress in the theoretical understanding of how stars lose mass, discrepancies between theory and observations still hold. Moreover, episodic mass loss in evolved massive stars is not included in models, and the importance of its role in the evolution of massive stars is currently undetermined.
Aims. A major hindrance to determining the role of episodic mass loss is the lack of large samples of classified stars. Given the recent availability of extensive photometric catalogs from various surveys spanning a range of metallicity environments, we aim to remedy the situation by applying machine-learning techniques to these catalogs.
Methods. We compiled a large catalog of known massive stars in M 31 and M 33 using IR (Spitzer) and optical (Pan-STARRS) photometry, as well as Gaia astrometric information, which helps with foreground source detection. We grouped them into seven classes (Blue, Red, Yellow, B[e] supergiants, luminous blue variables, Wolf-Rayet stars, and outliers, e.g., quasi-stellar objects and background galaxies). As this training set is highly imbalanced, we implemented synthetic data generation to populate the underrepresented classes and improve separation by undersampling the majority class. We built an ensemble classifier utilizing color indices as features. The probabilities from three machine-learning algorithms (Support Vector Classification, Random Forest, and Multilayer Perceptron) were combined to obtain the final classification.
Results. The overall weighted balanced accuracy of the classifier is ∼83%. Red supergiants are always recovered at ∼94%. Blue and Yellow supergiants, B[e] supergiants, and background galaxies achieve ∼50 - 80%. Wolf-Rayet sources are detected at ∼45%, while luminous blue variables are recovered at ∼30% from one method mainly. This is primarily due to the small sample sizes of these classes. In addition, the mixing of spectral types, as there are no strict boundaries in the features space (color indices) between those classes, complicates the classification. In an independent application of the classifier to other galaxies (IC 1613, WLM, and Sextans A), we obtained an overall accuracy of ∼70%. This discrepancy is attributed to the different metallicity and extinction effects of the host galaxies. Motivated by the presence of missing values, we investigated the impact of missing data imputation using a simple replacement with mean values and an iterative imputer, which proved to be more capable. We also investigated the feature importance to find that r - i and y - [3.6] are the most important, although different classes are sensitive to different features (with potential improvement with additional features).
Conclusions. The prediction capability of the classifier is limited by the available number of sources per class (which corresponds to the sampling of their feature space), reflecting the rarity of these objects and the possible physical links between these massive star phases. Our methodology is also efficient in correctly classifying sources with missing data as well as at lower metallicities (with some accuracy loss), making it an excellent tool for accentuating interesting objects and prioritizing targets for observations.
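
Two steps named in the Methods and Results above, replacing missing values with mean values and combining the class probabilities of three algorithms into one final classification, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state how the probabilities were combined, so the weighted average here (unweighted by default) is an assumption, and the class abbreviations and function names are hypothetical. The iterative imputer that the abstract reports as more capable is not shown.

```python
from statistics import fmean

# Hypothetical short labels for the seven classes listed in the abstract.
CLASSES = ["BSG", "RSG", "YSG", "B[e]SG", "LBV", "WR", "Outlier"]


def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in that column.

    `rows` is a list of equal-length feature vectors (e.g. color indices).
    A column with no observed values at all is filled with 0.0 (an assumption).
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(fmean(observed) if observed else 0.0)
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def combine_probabilities(prob_sets, weights=None):
    """Weighted average of per-classifier probability vectors for one source.

    `prob_sets` holds one probability vector per classifier; equal weights
    are used unless `weights` is given (assumed scheme, see lead-in).
    """
    if weights is None:
        weights = [1.0] * len(prob_sets)
    total = sum(weights)
    n_classes = len(prob_sets[0])
    return [sum(w * p[k] for w, p in zip(weights, prob_sets)) / total
            for k in range(n_classes)]


def classify(prob_sets, weights=None):
    """Return (label, combined probability) of the most probable class."""
    combined = combine_probabilities(prob_sets, weights)
    best = max(range(len(combined)), key=combined.__getitem__)
    return CLASSES[best], combined[best]


if __name__ == "__main__":
    # Made-up probabilities standing in for SVC, Random Forest, and MLP output.
    svc = [0.10, 0.60, 0.10, 0.05, 0.05, 0.05, 0.05]
    rf = [0.05, 0.70, 0.10, 0.05, 0.05, 0.03, 0.02]
    mlp = [0.20, 0.50, 0.10, 0.05, 0.05, 0.05, 0.05]
    print(classify([svc, rf, mlp]))
```

Averaging probabilities instead of taking a majority vote keeps each algorithm's confidence in the result, which matters when one method carries most of the signal for a rare class, as the abstract reports for luminous blue variables.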

Abstract Copyright: © ESO 2022

Journal keyword(s): stars: massive - stars: mass-loss - stars: evolution - galaxies: individual: WLM, M 31, IC 1613, M 33, Sextans A - methods: statistical

VizieR on-line data: <Available at CDS (J/A+A/666/A122): tablea1.dat>

Status at CDS : All or part of tables of objects will not be ingested in SIMBAD.

Simbad objects: 9


Number of rows : 9
N   Identifier                  Otype  ICRS (J2000) RA      ICRS (J2000) DEC     Mag U  Mag B  Mag V  Mag R  Mag I  Sp type  #ref 1850 - 2024  #notes
1   NAME Wolf-Lundmark-Melotte  G      00 01 57.9           -15 27 50                   11.50  11.10  10.93         ~        692               2
2   M 110                       GiG    00 40 22.0572349992  +41 41 07.507220136         8.92   8.07                 ~        1310              1
3   M 32                        GiG    00 42 41.82480       +40 51 54.6120       9.51   9.03   8.08                 ~        2155              2
4   M 31                        AGN    00 42 44.330         +41 16 07.50         4.86   4.36   3.44                 ~        12678             1
5   IC 1613                     GiC    01 04 48.4071        +02 07 10.185               10.42  10.01  9.77          ~        1236              2
6   M 33                        GiG    01 33 50.8965749232  +30 39 36.630403128  6.17   6.27   5.72                 ~        5847              1
7   NAME Magellanic Clouds      GrG    03 00                -71.0                                                   ~        7084              0
8   NAME Sex A                  H2G    10 11 00.5           -04 41 30            12.48  12.13  11.93  11.78         ~        727               2
9   NAME Local Group            GrG    ~                    ~                                                       ~        8415              0
