I am a researcher working at the intersection of survey methodology, missing data, Bayesian modelling, simulation studies and applied machine learning.
My work focuses on statistical modelling for incomplete longitudinal and survey data. I am particularly interested in model and variable selection, random-effects probit models, Bayesian computation, imputation strategies, and predictive benchmarking under different missing-data mechanisms.
- Missing data in longitudinal and survey data
- Bayesian model and variable selection
- Random-effects probit models and latent-variable models
- Simulation studies for methodological evaluation
- Machine-learning benchmarks under MCAR and MAR missingness with different imputation strategies
- Survey participation, nonresponse and mode-assignment models
- Reproducible research with R and Julia
Beyond methodological research, I am interested in evidence-based decision-making in public administration, local governance and regional development, in particular how statistical modelling, survey data and administrative data can inform policy-relevant questions in education, sustainability, housing, infrastructure and municipal planning.
A simulation-based benchmark comparing predictive methods under different missing-data mechanisms and data-handling strategies. The project evaluates models such as Elastic Net, LASSO, Ridge Regression, BART and XGBoost using metrics such as AUC, RMSE, precision, recall and F1 score.
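To illustrate what "different missing-data mechanisms" means in a benchmark like this, the following is a minimal sketch in standard-library Python (chosen for illustration only; it is not taken from the repository). It deletes outcome values under MCAR (missing completely at random) and MAR (missing at random) and compares the complete-case mean with the full-data mean. The function names, the linear data-generating model and the missingness rates are all assumptions made for this example.

```python
import random


def make_data(n, rng):
    """Simulate (x, y) pairs where y depends linearly on x plus noise (assumed model)."""
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)
        rows.append((x, y))
    return rows


def mask_mcar(rows, rate, rng):
    """MCAR: every y is deleted with the same probability, regardless of the data."""
    return [(x, None if rng.random() < rate else y) for x, y in rows]


def mask_mar(rows, rng):
    """MAR: the probability that y is missing depends only on the observed x."""
    masked = []
    for x, y in rows:
        p_missing = 0.7 if x > 0 else 0.1  # hypothetical rates for illustration
        masked.append((x, None if rng.random() < p_missing else y))
    return masked


def observed_mean(rows):
    """Complete-case mean of y, ignoring deleted values."""
    ys = [y for _, y in rows if y is not None]
    return sum(ys) / len(ys)


rng = random.Random(2024)
full = make_data(20000, rng)
true_mean = sum(y for _, y in full) / len(full)
mcar_mean = observed_mean(mask_mcar(full, 0.4, rng))
mar_mean = observed_mean(mask_mar(full, rng))

print(f"full data: {true_mean:.3f}")
print(f"MCAR complete cases: {mcar_mean:.3f}")
print(f"MAR complete cases: {mar_mean:.3f}")
```

Under MCAR the complete-case mean should stay close to the full-data mean, while under this MAR mechanism it is pulled downward, because cases with large x (and therefore large y) are deleted more often. This is the kind of contrast that motivates evaluating predictive methods separately under each mechanism and data-handling strategy.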
Research code for Bayesian variable and model selection in probit models with missing covariates, including Gibbs sampling, spike-and-slab priors, latent-variable augmentation and imputation strategies.
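As a rough illustration of latent-variable augmentation in a probit Gibbs sampler, the sketch below shows one common form of the latent draw, usually attributed to Albert and Chib: given the linear predictor `mu` and the binary outcome `y`, a latent value `z` is drawn from a unit-variance normal centred at `mu` and truncated to the positive half-line if `y == 1` and to the non-positive half-line otherwise. This is standard-library Python written for this page, not the repository's implementation; the function name `draw_latent` and the inverse-CDF sampling approach are assumptions, and the clamping makes it unsuitable for extreme values of `mu`.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def draw_latent(mu, y, rng):
    """Draw z ~ N(mu, 1), truncated to z > 0 if y == 1 and to z <= 0 if y == 0."""
    p0 = STD_NORMAL.cdf(-mu)  # P(z <= 0) when z ~ N(mu, 1)
    if y == 1:
        u = rng.uniform(p0, 1.0)  # upper part of the distribution
    else:
        u = rng.uniform(0.0, p0)  # lower part of the distribution
    # Keep u strictly inside (0, 1) so that inv_cdf is defined.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return mu + STD_NORMAL.inv_cdf(u)


rng = random.Random(7)
draws_pos = [draw_latent(0.3, 1, rng) for _ in range(1000)]
draws_neg = [draw_latent(0.3, 0, rng) for _ in range(1000)]
print(min(draws_pos), max(draws_neg))
```

In a full sampler this draw would alternate with updates of the regression coefficients, the spike-and-slab inclusion indicators and the imputed covariates. Those steps are omitted here.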
Modelling workflows for analysing participation processes in panel and survey settings, including participation in testing, interviews and follow-up web surveys.
- R for statistical modelling, simulation studies, data wrangling and reproducible analysis
- Julia for simulation-based Bayesian computation and computationally intensive workflows
- Stan for custom Bayesian models and multivariate latent-variable models
The public repositories use simulated or synthetic example data whenever original data are restricted, confidential or institutionally protected. The aim is to make the modelling strategies, simulation designs and evaluation workflows transparent without disclosing sensitive data.
For academic or professional inquiries, please contact me via my institutional profile, ORCID, LinkedIn or email.