Using machine learning to improve neutron identification in water Cherenkov detectors

Blair Jamieson; Matt Stubbs; Sheela Ramanna; John Walker; Nick Prouse; Ryosuke Akutsu; Patrick de Perio; Wojciech Fedorko

doi:10.3389/fdata.2022.978857

Front Big Data. 2022 Sep 30;5:978857. doi: 10.3389/fdata.2022.978857. eCollection 2022.

Authors

Blair Jamieson¹, Matt Stubbs², Sheela Ramanna², John Walker¹,³, Nick Prouse³, Ryosuke Akutsu³, Patrick de Perio³,⁴, Wojciech Fedorko³

Affiliations

¹ Physics Department, University of Winnipeg, Winnipeg, MB, Canada.
² Applied Computer Science Department, University of Winnipeg, Winnipeg, MB, Canada.
³ Science Division, TRIUMF, Vancouver, BC, Canada.
⁴ Kavli IPMU (WPI), UTIAS, The University of Tokyo, Tokyo, Japan.

Abstract

Water Cherenkov detectors such as Super-Kamiokande and the next-generation Hyper-Kamiokande are adding gadolinium to their water to improve neutron detection. Detecting neutrons in addition to the leptons produced in neutrino interactions is expected to improve the separation between neutrinos and anti-neutrinos and to reduce backgrounds in proton decay searches. The neutron signal itself is still small and can be confused with muon spallation and other background sources. In this paper, machine learning techniques are employed to optimize the neutron capture detection capability of the new intermediate water Cherenkov detector (IWCD) for Hyper-K. In particular, boosted decision tree (XGBoost), graph convolutional network (GCN), and dynamic graph convolutional neural network (DGCNN) models are developed and benchmarked against a statistical likelihood-based approach, achieving up to a 10% increase in classification accuracy. Characteristic features are also engineered from the datasets and analyzed using SHAP (SHapley Additive exPlanations) to provide insight into the pivotal factors influencing event type outcomes. The dataset used in this research consisted of roughly 1.6 million simulated particle gun events, divided nearly evenly between neutron capture and a background electron source. The current training samples are representative only, and more realistic samples will be needed for the analysis of real data. The current class split is 50/50, but the classes are expected to be unevenly populated in the real experiment; if the imbalance in real data is serious, resampling techniques could be used to address it.
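The abstract mentions resampling as one possible response to class imbalance in real data. As a minimal sketch of what that could look like, the following shows plain random oversampling of the minority class using only the Python standard library. It is not the authors' method: the function name `random_oversample`, the toy feature values, and the label convention (1 for neutron capture, 0 for background electron) are all assumptions made for illustration.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Randomly duplicate events of the smaller classes until every class
    has as many events as the largest one (hypothetical helper)."""
    rng = random.Random(seed)

    # Group feature rows by their class label.
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)

    # Every class is padded up to the size of the largest class.
    target = max(len(rows) for rows in by_class.values())

    out_x, out_y = [], []
    for y, rows in by_class.items():
        # Draw the missing events with replacement from the same class.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for x in rows + extra:
            out_x.append(x)
            out_y.append(y)

    # Shuffle so that the classes are interleaved again.
    order = list(range(len(out_x)))
    rng.shuffle(order)
    return [out_x[i] for i in order], [out_y[i] for i in order]


# Toy, made-up data: 2 "neutron capture" events (label 1)
# and 8 "background electron" events (label 0).
features = [[float(i)] for i in range(10)]
labels = [1, 1] + [0] * 8

bal_x, bal_y = random_oversample(features, labels)
counts = Counter(bal_y)
```

If something like this were used, it would normally be applied to the training split only, so that validation and test samples keep the class proportions expected in the experiment.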

Keywords: graph neural networks; machine learning; neutrino physics; particle physics; water Cherenkov detector.