Enze Xu

PUBLICATIONS

Full publication list can be found on Google Scholar.
* indicates equal contribution.

A multiscale model to explain the spatiotemporal progression of amyloid beta and tau pathology in Alzheimer's disease
Chunrui Xu, Enze Xu*, Yang Xiao, Defu Yang, Guorong Wu, Minghan Chen
International Journal of Biological Macromolecules. 2025.

Amyloid-beta (Aβ) and tubulin-associated unit (tau) proteins are key biomarkers of Alzheimer's disease (AD), detectable by Positron Emission Tomography (PET) imaging and Cerebrospinal Fluid (CSF) assays. They reflect insoluble fibrils in the brain and soluble monomers in the cerebrospinal fluid, respectively. PET and CSF biomarkers have been utilized in diagnosing AD; however, their incomplete agreement significantly confounds early detection. Additionally, the molecular mechanisms underlying the dynamics of AD biomarkers remain elusive and are yet to be quantitatively revealed. To answer these questions, we develop a multiscale mathematical model that characterizes various forms of AD biomarkers, including soluble molecules in cerebrospinal fluid, diffusive biomarkers across brain regions, and insoluble fibrils in the brain. Mathematical modeling of soluble and insoluble molecules enables the explanation of the asynchronous trajectory of AD biomarkers. Our model captures the spatiotemporal dynamics of Aβ and tau with neurodegeneration in AD. Simulation results demonstrate that the PET-CSF discordance is a typical stage in the natural history of protein aggregation, with CSF becoming abnormal before the onset of PET abnormality. Furthermore, correlation analysis reveals that neurodegeneration is more strongly associated with tau-PET than Aβ-PET. These findings suggest that CSF Aβ can serve as a biomarker at the early stage of AD, while tau-PET is more suitable for neurodegeneration assessment. The proposed multiscale model explains the underlying neurobiological factors contributing to neurodegeneration and offers a valuable tool for improving early detection and treatment strategies in clinical trials.

Chemical Environment Adaptive Learning for Optical Band Gap Prediction of Doped Graphitic Carbon Nitride Nanosheets
Chen Chen*, Enze Xu*, Defu Yang, Chenggang Yan, Tao Wei, Hanning Chen, Yong Wei, Minghan Chen
Neural Computing and Applications. 2025.

This study presents a novel machine learning algorithm, named Chemical Environment Graph Neural Network (ChemGNN), designed to accelerate materials property prediction and advance new materials discovery. Graphitic carbon nitride (g-C3N4) and its doped variants have gained significant interest for their potential as optical materials. Accurate prediction of their band gaps is crucial for practical applications; however, traditional quantum simulation methods are computationally expensive, which makes it challenging to explore the vast space of possible doped molecular structures. The proposed ChemGNN leverages the learning ability of current graph neural networks (GNNs) to satisfactorily capture the characteristics of atoms' local chemical environment underlying complex molecular structures. Our benchmark results demonstrate more than 100% improvement in band gap prediction accuracy over existing GNNs on g-C3N4. Furthermore, the general ChemGNN model can precisely predict band gaps of various doped g-C3N4 structures, making it a valuable tool for performing high-throughput prediction in materials design and development.

Multiscale Attention Wavelet Neural Operator for Capturing Steep Trajectories in Biochemical Systems
Jiayang Su, Junbo Ma, Songyang Tong, Enze Xu, Minghan Chen
Proceedings of the AAAI Conference on Artificial Intelligence. 2024.

In biochemical modeling, some foundational systems can exhibit sudden and profound behavioral shifts, such as cellular signaling pathway models, in which the physiological responses promptly react to environmental changes, resulting in steep changes in their dynamic model trajectories. These steep changes are one of the major challenges in biochemical modeling governed by nonlinear differential equations. One promising way to tackle this challenge is converting the input data from the time domain to the frequency domain through Fourier Neural Operators, which enhances the ability to analyze data periodicity and regularity. However, the effectiveness of these Fourier-based methods diminishes in scenarios with complex abrupt switches. To address this limitation, an innovative Multiscale Attention Wavelet Neural Operator (MAWNO) method is proposed in this paper, which combines the attention mechanism with the versatile wavelet transforms to effectively capture these abrupt switches. Specifically, the wavelet transform scrutinizes data across multiple scales to extract the characteristics of abrupt signals into wavelet coefficients, while the self-attention mechanism is introduced to enhance the wavelet coefficients in high-frequency signals that can better characterize the abrupt switches. Experimental results substantiate MAWNO's superior accuracy on three classical biochemical models featuring periodic and steep trajectories. Code is available at https://github.com/SUDERS/MAWNO.
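As a rough illustration of how a multiscale wavelet transform turns an abrupt switch into a few large coefficients, the sketch below runs a plain Haar decomposition on a toy step signal. The Haar wavelet, the toy signal, and the function names are assumptions made for this example; the attention mechanism and the neural operator are not included, and this is not the MAWNO implementation.

```python
import math

def haar_step(signal):
    """One level of the Haar transform; the signal length must be even."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_multiscale(signal):
    """Full decomposition (length must be a power of two); details are listed finest scale first."""
    details = []
    approx = list(signal)
    while len(approx) > 1:
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

# Toy trajectory (hypothetical): flat at 0, then an abrupt switch to 1
# between samples 8 and 9.
signal = [0.0] * 9 + [1.0] * 7

approx, details = haar_multiscale(signal)
fine = details[0]

# Index of the sample pair with the largest finest-scale detail coefficient.
jump_pair = max(range(len(fine)), key=lambda i: abs(fine[i]))
print("largest fine-scale coefficient at pair", jump_pair, "value", fine[jump_pair])
```

On this toy signal the finest-scale detail coefficients are zero everywhere except at the pair containing the jump, which is the kind of localized high-frequency information a Fourier basis spreads across many modes.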

Pathology Steered Stratification Network for Subtype Identification in Alzheimer's Disease
Enze Xu*, Jingwen Zhang*, Jiadi Li, Qianqian Song, Defu Yang, Guorong Wu, Minghan Chen
Medical Physics. 2024.

[Background] Alzheimer's disease (AD) is a heterogeneous, multifactorial neurodegenerative disorder characterized by three neurobiological factors: beta-amyloid, pathologic tau, and neurodegeneration. There are no effective treatments for AD at a late stage, which makes early detection and prevention urgent. However, existing statistical inference approaches in neuroimaging studies of AD subtype identification do not take into account the pathological domain knowledge, which could lead to ill-posed results that are sometimes inconsistent with the essential neurological principles. [Purpose] Integrating systems biology modeling with machine learning, the study aims to assist clinical AD prognosis by providing a subpopulation classification in accordance with essential biological principles, neurological patterns, and cognitive symptoms. [Methods] We propose a novel pathology steered stratification network (PSSN) that incorporates established domain knowledge in AD pathology through a reaction-diffusion model, where we consider non-linear interactions between major biomarkers and diffusion along the brain structural network. Trained on longitudinal multimodal neuroimaging data, the biological model predicts long-term evolution trajectories that capture individual characteristic progression patterns, filling in the gaps between the sparse imaging data available. A deep predictive neural network is then built to exploit spatiotemporal dynamics, link neurological examinations with clinical profiles, and generate subtype assignment probability on an individual basis. We further identify an evolutionary disease graph to quantify subtype transition probabilities through extensive simulations. [Results] Our stratification achieves superior performance in both inter-cluster heterogeneity and intra-cluster homogeneity of various clinical scores. Applying our approach to enriched samples of aging populations, we identify six subtypes spanning the AD spectrum, where each subtype exhibits a distinctive biomarker pattern that is consistent with its clinical outcome. [Conclusions] The proposed PSSN (i) reduces neuroimage data to low-dimensional feature vectors, (ii) combines AT[N]-Net based on real pathological pathways, (iii) predicts long-term biomarker trajectories, and (iv) stratifies subjects into fine-grained subtypes with distinct neurological underpinnings. PSSN provides insights into pre-symptomatic diagnosis and practical guidance on clinical treatments, which may be further generalized to other neurodegenerative diseases.
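To make the phrase "reaction-diffusion model ... along the brain structural network" concrete, here is a minimal sketch of a single biomarker spreading over a small graph. The four-region chain, the logistic reaction term, the parameter values, and all names are assumptions for illustration only; the paper's model couples several biomarkers with non-linear interactions, which this toy does not reproduce.

```python
def laplacian(adj):
    """Graph Laplacian L = D - A for a symmetric adjacency matrix."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0.0) - adj[i][j] for j in range(n)]
            for i in range(n)]

def simulate(adj, x0, diffusion=0.1, growth=0.5, dt=0.01, steps=2000):
    """Forward-Euler integration of dx/dt = -diffusion * L x + growth * x * (1 - x)."""
    lap = laplacian(adj)
    n = len(x0)
    x = list(x0)
    for _ in range(steps):
        # Network diffusion term: (L x)_i for every region i.
        spread = [sum(lap[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Diffusion along edges plus local logistic growth (assumed reaction term).
        x = [x[i] + dt * (-diffusion * spread[i] + growth * x[i] * (1.0 - x[i]))
             for i in range(n)]
    return x

# Hypothetical structural network: four regions connected in a chain.
adjacency = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
initial = [0.2, 0.0, 0.0, 0.0]  # burden seeded in region 0 only

final = simulate(adjacency, initial)
print([round(v, 3) for v in final])
```

Regions that start with no burden acquire it only through the network term, so the connectivity pattern shapes the order in which regions become abnormal.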

Modeling of AMPK Regulatory Network in Alzheimer's Disease
Junsheng Wang, Enze Xu, Yang Xiao, Chunrui Xu, Minghan Chen
2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2023.

AMP-activated protein kinase (AMPK), a cellular energy sensor in response to changes in the ADP/ATP ratio, has been proven not only to play roles in metabolic actions but also to hold pivotal roles in neurodegenerative diseases such as Alzheimer's disease (AD), Parkinson's disease, and Lewy body dementia. However, the downstream effectors regulated by AMPK and its specific contributions to the pathology of these diseases remain elusive. In this study, we combine the power of systems biology and neuroscience to understand the dynamics and regulatory effects of AMPK under the progression of AD. Particularly, our study develops a regulatory protein network that seeks to demystify this intricate protein-protein interaction, placing particular emphasis on the role of AMPK in the pathology of AD. By focusing on two important cellular activities, mRNA translation and autophagy, our model explores the underlying effects of the central governance of a change in AMPK activity and concludes that eEF2-controlled translation is more sensitive to the perturbation. Additionally, we adjust various kinetic parameters to restore proteins affected by the disease to their baseline levels, with the goal of identifying potential new drug targets. The model accurately captures the regulatory effect of AMPK, provides the temporal dynamics of key regulators in AD progression, contributes to the understanding of the disease's pathophysiology, and potentially generalizes the function of AMPK into other neurodegenerative diseases.

Graph Clustering Analyses of Discontinuous Molecular Dynamics Simulations: Study of Lysozyme Adsorption on a Graphene Surface
Jing Chen, Enze Xu, Yong Wei*, Minghan Chen*, Tao Wei*, Size Zheng*
Langmuir 38.35 (2022): 10817-10825.

Understanding the interfacial behaviors of biomolecules is crucial to applications in biomaterials and nanoparticle-based biosensing technologies. In this work, we utilized autoencoder-based graph clustering to analyze discontinuous molecular dynamics (DMD) simulations of lysozyme adsorption on a graphene surface. Our high-throughput DMD simulations integrated with a Go̅-like protein–surface interaction model make it possible to explore protein adsorption at a large temporal scale with sufficient accuracy. The graph autoencoder extracts a low-dimensional feature vector from a contact map. The sequence of the extracted feature vectors is then clustered, and thus the evolution of the protein molecule structure in the adsorption process is segmented into stages. Our study demonstrated that the residue–surface hydrophobic interactions and the π–π stacking interactions play key roles in the five-stage adsorption. Upon adsorption, the tertiary structure of lysozyme collapsed, and the secondary structure was also affected. The folding stages obtained by autoencoder-based graph clustering were consistent with detailed analyses of the protein structure. The combination of machine learning analysis and efficient DMD simulations developed in this work could be an important tool to study biomolecules' interfacial behaviors.

AT[N]-net: multimodal spatiotemporal network for subtype identification in Alzheimer's disease
Jingwen Zhang, Enze Xu, Minghan Chen
Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. 2022.

Alzheimer's disease (AD) is a heterogeneous, multifactorial neurodegenerative disorder, where beta-amyloid (A), pathologic tau (T), neurodegeneration ([N]), and structural brain network (Net) are four major indicators of AD progression. Most current studies on AD rely on a single data modality and ignore complex biological interactions at the molecular level. In this study, we propose a novel multimodal spatiotemporal stratification network (MSSN) that is built upon the fusion of multiple data modalities and the combined power of systems biology and deep learning. Altogether, our stratification approach could (1) ameliorate limitations caused by insufficient longitudinal imaging data, (2) extract important spatiotemporal feature vectors from imaging data, (3) exploit the subject-specific longitudinal prediction of a holistic biomarker set, and (4) generate symptom-related fine-grained subtype classification.

Adaptive Request Scheduling for Device Cloud
Han Dong, Enze Xu, Xiang Jing, Huaqian Cai, Gang Huang
2020 IEEE International Conference on Services Computing (SCC). IEEE, 2020.

Nowadays, more and more cloud testing platforms provide enterprise developers with solutions for cloud device debugging and automatic testing. It is a great challenge for these cloud platforms to schedule arriving requests onto specific smart device resources efficiently and in real time. Traditional scheduling algorithms have difficulty adapting to application interface call requests that differ vastly in volume and behaviour. To solve this problem, we integrate these smart devices into a device cloud and propose a method for measuring the service capability of a single device in the group. Then we build an adaptive scheduling algorithm model according to the characteristics of the service capability of a single device to improve the scheduling efficiency of the group. Practice shows that the adaptive scheduling algorithm can effectively control the network traffic. Finally, through analysis and optimization, we derive a method for obtaining the optimal parameter combination in the algorithm.

A First Look at Instant Service Consumption with Quick Apps on Mobile Devices
Yi Liu, Enze Xu, Yun Ma, Xuanzhe Liu
2019 IEEE International Conference on Web Services (ICWS). IEEE, 2019.

The mobile app ecosystem has achieved great success in providing services on mobile devices that facilitate almost all aspects of our daily life. However, whole-package installation and dramatically increasing package sizes are now preventing users from trying more apps. To address the issue, many lightweight frameworks have emerged, providing the experience of instant service consumption, where apps are small and no installation is needed to consume the services they provide. In this paper, we conduct the first empirical study on instant service consumption on mobile devices. We focus on one of the most popular frameworks, quick apps, which are proposed and supported by nine mainstream mobile phone manufacturers in China. Quick apps are implemented with Web-based technologies, and run as native apps without the need for installation. We find that quick apps have a much smaller size and only provide a limited set of services compared to their corresponding native apps. Then, we characterize the performance differences between quick apps and native apps in terms of launching time, data drain, and network connections, when the two kinds of apps provide the same services. Our observations reveal that quick apps perform better than native apps thanks to their much smaller size and fewer functionalities in a single page. Finally, we propose a machine learning based approach to help developers construct a quick app from an existing native app.

PREPRINTS

A Generalizable Physics-Enhanced State Space Model for Long-Term Dynamics Forecasting in Complex Environments
Yuchen Wang, Hongjue Zhao, Haohong Lin, Enze Xu, Lifang He, Huajie Shao
arXiv:2507.10792.

This work aims to address the problem of long-term dynamic forecasting in complex environments where data are noisy and irregularly sampled. While recent studies have introduced some methods to improve prediction performance, these approaches still face a significant challenge in handling long-term extrapolation tasks under such complex scenarios. To overcome this challenge, we propose Phy-SSM, a generalizable method that integrates partial physics knowledge into state space models (SSMs) for long-term dynamics forecasting in complex environments. Our motivation is that SSMs can effectively capture long-range dependencies in sequential data and model continuous dynamical systems, while the incorporation of physics knowledge improves generalization ability. The key challenge lies in how to seamlessly incorporate partially known physics into SSMs. To achieve this, we decompose partially known system dynamics into known and unknown state matrices, which are integrated into a Phy-SSM unit. To further enhance long-term prediction performance, we introduce a physics state regularization term to make the estimated latent states align with system dynamics. In addition, we theoretically analyze the uniqueness of the solutions for our method. Extensive experiments on three real-world applications, including vehicle motion prediction, drone state prediction, and COVID-19 epidemiology forecasting, demonstrate the superior performance of Phy-SSM over the baselines in both long-term interpolation and extrapolation tasks. The code is publicly available.
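The idea of splitting partially known dynamics into known and unknown state matrices can be illustrated on a damped oscillator. In the sketch below the stiffness is treated as known physics and the damping as the unknown part, which is recovered by closed-form least squares. The oscillator, the constants, the noise-free and regularly sampled data, and the least-squares fit (swapped in for a learned network) are all assumptions for this example; it is not the Phy-SSM unit.

```python
def simulate(k, c, dt, steps):
    """Euler rollout of pos' = vel, vel' = -k * pos - c * vel."""
    pos, vel = 1.0, 0.0
    traj = [(pos, vel)]
    for _ in range(steps):
        pos, vel = pos + dt * vel, vel + dt * (-k * pos - c * vel)
        traj.append((pos, vel))
    return traj

K_KNOWN = 2.0  # stiffness: assumed to be known physics
C_TRUE = 0.3   # damping: treated as unknown when fitting
DT = 0.01

traj = simulate(K_KNOWN, C_TRUE, DT, steps=500)

# Decomposition of the state matrix (state = [pos, vel]):
#   A_known   = [[0, 1], [-k, 0]]
#   A_unknown = [[0, 0], [0, -c]]
# Removing the known contribution from the observed velocity change
# leaves a residual equal to -c * vel, so c follows from least squares.
num = 0.0
den = 0.0
for (pos, vel), (_, vel_next) in zip(traj, traj[1:]):
    residual = (vel_next - vel) / DT + K_KNOWN * pos
    num += residual * vel
    den += vel * vel
c_hat = -num / den

print("estimated damping:", c_hat)
```

Because the known matrix already accounts for the stiffness, only one scalar has to be identified from data, which hints at why supplying partial physics can help extrapolation.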

Neural Symbolic Regression using Control Variables
Xieting Chu, Hongjue Zhao, Enze Xu, Hairong Qi, Minghan Chen, Huajie Shao
arXiv:2306.04718.

Symbolic regression (SR) is a powerful technique for discovering analytical mathematical expressions from data, finding various applications in natural sciences due to the good interpretability of its results. However, existing methods face scalability issues when dealing with complex equations involving multiple variables. To address this challenge, we propose SRCV, a novel neural symbolic regression method that leverages control variables to enhance both accuracy and scalability. The core idea is to decompose multi-variable symbolic regression into a set of single-variable SR problems, which are then combined in a bottom-up manner. The proposed method involves a four-step process. First, we learn a data generator from observed data using deep neural networks (DNNs). Second, the data generator is used to generate samples for a certain variable by controlling the input variables. Third, single-variable symbolic regression is applied to estimate the corresponding mathematical expression. Lastly, we repeat steps 2 and 3 by gradually adding variables one by one until completion. We evaluate the performance of our method on multiple benchmark datasets. Experimental results demonstrate that the proposed SRCV significantly outperforms state-of-the-art baselines in discovering mathematical expressions with multiple variables. Moreover, it can substantially reduce the search space for symbolic regression. The source code will be made publicly available upon publication.
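The control-variable decomposition described above can be sketched on a two-variable toy problem. Here a hand-written function stands in for the learned DNN data generator, and a closed-form straight-line fit stands in for single-variable symbolic regression; both substitutions, the target expression, and all names are assumptions for illustration rather than the SRCV implementation.

```python
def fit_line(xs, ys):
    """Closed-form least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def generator(x1, x2):
    # Hypothetical stand-in for the data generator learned in step 1.
    return 3.0 * x1 * x2 + 2.0 * x2

grid = [0.5 * i for i in range(1, 9)]  # sample points 0.5, 1.0, ..., 4.0

# Steps 2 and 3: hold x2 fixed (the control variable), vary x1,
# and fit a single-variable expression y = a * x1 + b.
per_control = []
for c in grid:
    a, b = fit_line(grid, [generator(x1, c) for x1 in grid])
    per_control.append((c, a, b))

# Step 4: add x2 back by fitting how each coefficient depends on it.
controls = [c for c, _, _ in per_control]
a_of_x2 = fit_line(controls, [a for _, a, _ in per_control])
b_of_x2 = fit_line(controls, [b for _, _, b in per_control])

print("a(x2) = %.3f * x2 + %.3f" % a_of_x2)
print("b(x2) = %.3f * x2 + %.3f" % b_of_x2)
```

Each fit involves a single variable, so the search never has to consider expressions in x1 and x2 jointly; combining a(x2) and b(x2) gives back the two-variable form y = a(x2) * x1 + b(x2).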