Insights | Md. Abu Bokkor Shiddik

Machine Learning 📅 10 April 2026 ⏱ 6 min

Threshold Selection in Outbreak Prediction Models

Outbreak prediction models generate probability outputs representing risk of disease occurrence. These probabilities cannot be directly used for decision making — they must be converted into binary outcomes using a threshold, which strongly affects how early we detect outbreaks and how many false alarms we generate.

Md. Abu Bokkor Shiddik

Researcher · Statistics, BRUR

What is a Threshold?

A classification threshold is the probability cutoff at or above which a model predicts "outbreak" and below which it predicts "no outbreak." The default threshold of 0.5 implicitly assumes balanced classes and equal error costs, which is a poor assumption for rare disease outbreaks, where outbreak periods are far less common than non-outbreak periods.

Why Threshold Matters More Than Model Accuracy

In outbreak surveillance, the cost of missing a true outbreak (false negative) is usually far higher than the cost of a false alarm (false positive). A model with 95% accuracy can still fail catastrophically if the threshold is wrong — triggering no alerts during the early phase of an epidemic when probabilities hover around 0.3.
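The accuracy trap described above can be illustrated with a small sketch. The weekly labels and probabilities are invented for illustration (5 outbreak weeks in 100, with early-phase probabilities near 0.3); they are not real surveillance data.

```python
# Hypothetical surveillance weeks: 1 = outbreak week, 0 = no outbreak.
# 5 outbreak weeks out of 100, with early-phase probabilities near 0.3.
y_true = [1] * 5 + [0] * 95
y_prob = [0.30, 0.28, 0.35, 0.32, 0.27] + [0.05] * 95


def evaluate(y_true, y_prob, threshold):
    """Return (accuracy, recall) after thresholding the probabilities."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    positives = sum(y_true)
    return correct / len(y_true), true_pos / positives


acc_default, rec_default = evaluate(y_true, y_prob, threshold=0.5)
acc_low, rec_low = evaluate(y_true, y_prob, threshold=0.25)
print(f"threshold 0.50: accuracy={acc_default:.2f}, recall={rec_default:.2f}")
print(f"threshold 0.25: accuracy={acc_low:.2f}, recall={rec_low:.2f}")
```

In this toy setting the default cutoff scores 95% accuracy while detecting none of the outbreak weeks, which is the failure mode the paragraph warns about.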

Threshold Strategy	Effect	Best Used When
Precision-Recall optimised	Balances detection vs precision	Imbalanced outbreak data
Cost-sensitive	Minimises real-world loss function	Miss cost >> false alarm cost
Seasonal adaptive	Changes threshold by season	Climate-sensitive diseases (dengue, cholera)
Spatial adaptive	Region-specific cutoff	Multi-region surveillance systems
Percentile-based	Flags top-risk observations	Early warning dashboards
Default 0.5	Assumes balanced classes	Balanced datasets only — NOT outbreaks
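As a rough sketch of the cost-sensitive row in the table: if the model's probabilities are well calibrated and the costs of a false alarm and a missed outbreak can be stated, the expected-cost-minimising cutoff is cost_false_alarm / (cost_false_alarm + cost_miss). The function names, the 10:1 cost ratio, and the six observations below are assumptions made for illustration only.

```python
def cost_optimal_threshold(cost_false_alarm, cost_miss):
    """Cutoff that minimises expected cost, assuming calibrated probabilities.

    Alerting costs (1 - p) * cost_false_alarm in expectation; staying silent
    costs p * cost_miss. Alerting is cheaper once p exceeds the value below.
    """
    return cost_false_alarm / (cost_false_alarm + cost_miss)


def total_cost(y_true, y_prob, threshold, cost_false_alarm, cost_miss):
    """Empirical cost of applying a threshold to labelled data."""
    cost = 0.0
    for truth, prob in zip(y_true, y_prob):
        alert = prob >= threshold
        if alert and truth == 0:
            cost += cost_false_alarm  # false alarm
        elif not alert and truth == 1:
            cost += cost_miss  # missed outbreak
    return cost


# Hypothetical costs: a missed outbreak is treated as 10x worse than a false alarm.
threshold = cost_optimal_threshold(cost_false_alarm=1.0, cost_miss=10.0)

# Hypothetical labelled observations.
y_true = [1, 1, 0, 0, 0, 0]
y_prob = [0.30, 0.12, 0.20, 0.05, 0.02, 0.08]
cost_default = total_cost(y_true, y_prob, 0.5, 1.0, 10.0)
cost_tuned = total_cost(y_true, y_prob, threshold, 1.0, 10.0)
print(f"cutoff={threshold:.3f}, cost at 0.5={cost_default}, cost at cutoff={cost_tuned}")
```

The closed-form cutoff depends on calibration; with poorly calibrated probabilities, scanning thresholds with an empirical cost function such as total_cost is the safer route.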

Practical Recommendation

For outbreak prediction: (1) always use the Precision-Recall curve rather than ROC for threshold selection; (2) compute the F-beta score with β > 1 to weight recall higher than precision; (3) validate the selected threshold on held-out surveillance data from at least two epidemic seasons before deployment. The threshold is not a model parameter — it is a policy decision that should involve epidemiologists alongside data scientists.
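Recommendation (2) can be sketched as a scan over candidate cutoffs that keeps the one with the highest F-beta score. This is a minimal illustration with invented validation data and β = 2; the helper names are hypothetical, and in practice the scan would run on held-out surveillance data as recommendation (3) requires.

```python
def f_beta(precision, recall, beta):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def best_threshold(y_true, y_prob, beta=2.0):
    """Try every distinct predicted probability as a cutoff; return (cutoff, score)."""
    best_cutoff, best_score = None, -1.0
    for cutoff in sorted(set(y_prob)):
        tp = sum(1 for t, p in zip(y_true, y_prob) if p >= cutoff and t == 1)
        fp = sum(1 for t, p in zip(y_true, y_prob) if p >= cutoff and t == 0)
        fn = sum(1 for t, p in zip(y_true, y_prob) if p < cutoff and t == 1)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        score = f_beta(precision, recall, beta)
        if score > best_score:  # strict '>' keeps the lowest cutoff on ties
            best_cutoff, best_score = cutoff, score
    return best_cutoff, best_score


# Hypothetical validation data: 3 outbreak periods among 10 observations.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.02, 0.05, 0.08, 0.10, 0.15, 0.22, 0.28, 0.40, 0.45, 0.70]
cutoff, score = best_threshold(y_true, y_prob, beta=2.0)
print(f"best cutoff={cutoff}, F2={score:.4f}")
```

With these numbers the selected cutoff is 0.28 (F2 = 0.9375): it accepts one false alarm in exchange for catching all three outbreak periods.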

Public Health 📅 6 April 2026 ⏱ 5 min

Understanding GATHER and Making Health Estimates Transparent

GATHER (Guidelines for Accurate and Transparent Health Estimates Reporting), introduced in 2016, provides a structured 18-item reporting standard ensuring health estimates are reproducible, clear, and accompanied by uncertainty quantification.

Why Transparent Reporting Matters

Health estimates — national disease burden, mortality rates, future projections — directly drive policy decisions worth billions of dollars and millions of lives. Without a reporting standard, results may be unverifiable, selectively presented, or impossible to reproduce.

The 18-Item GATHER Checklist (Summary)

Domain	Items	Purpose
Study Definition	Population, outcomes, time period, geography	Defines the scope of the estimate
Data Inputs	Sources, access method, inclusion/exclusion criteria	Ensures traceability and replicability
Data Processing	Cleaning, adjustment, availability	Enables independent verification
Statistical Model	Framework, methods, assumptions, justification	Explains how estimates were produced
Validation & Results	Model checking, point estimates, uncertainty intervals	Assesses reliability and communicates uncertainty
Transparency	Code sharing, data sharing	Supports full reproducibility

The Uncertainty Imperative

GATHER specifically requires that uncertainty intervals accompany every estimate. Reporting "prevalence = 12.5%" is scientifically incomplete. The correct form is: 12.5% (95% UI: 10.2%–14.8%). The interval communicates how much confidence we can place in the point estimate, which is essential for policy-makers weighing intervention costs against disease burden.
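GATHER does not prescribe how an interval is computed. One common approach, when an estimate is available as simulation or posterior draws, is to report the mean with the 2.5th and 97.5th percentiles. The sketch below assumes such draws exist; the simulated draws and function names are purely illustrative.

```python
import random


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def summarise_draws(draws, level=95.0):
    """Return (mean, lower bound, upper bound) of a central uncertainty interval."""
    ordered = sorted(draws)
    tail = (100.0 - level) / 2.0
    point = sum(ordered) / len(ordered)
    return point, percentile(ordered, tail), percentile(ordered, 100.0 - tail)


# Hypothetical draws of a prevalence estimate (in percent), simulated here.
rng = random.Random(42)
draws = [rng.gauss(12.5, 1.2) for _ in range(2000)]

point, lower, upper = summarise_draws(draws)
label = f"{point:.1f}% (95% UI: {lower:.1f}%–{upper:.1f}%)"
print(label)
```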

Key Takeaway: GATHER is not a statistical method but a scientific communication standard. Following it does not change your analysis; it makes your analysis trustworthy to others.

Machine Learning 📅 6 April 2026 ⏱ 4 min

Normalization vs. Standardization in Data Preprocessing

Features with different units and scales distort distance-based and gradient-based models. Normalization and standardization are the two primary scaling techniques — but they work differently and suit different scenarios. Choosing the wrong one quietly degrades model performance.

Normalization (Min-Max Scaling)

Rescales all values to fall in [0, 1]: X_norm = (X − X_min) / (X_max − X_min). Preserves the shape of the original distribution. Best for: k-NN, K-Means, neural networks requiring bounded inputs. Fatal weakness: a single outlier can compress all other values into a tiny range.

Standardization (Z-score Scaling)

Transforms to mean = 0, std = 1: X_std = (X − μ) / σ. Does not bound the range and is less sensitive to outliers than min-max scaling, although μ and σ are themselves pulled by extreme values. Best for: PCA, SVM, logistic regression, linear discriminant analysis, and other methods that assume or benefit from Gaussian-like feature distributions.
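Both formulas can be sketched in a few lines of plain Python. The feature values, including the single outlier of 1000, are invented to show how each transform reacts to an extreme value; neither function guards against a constant feature, which would divide by zero.

```python
import statistics


def min_max_scale(values):
    """X_norm = (X - X_min) / (X_max - X_min); assumes max > min."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """X_std = (X - mu) / sigma, using the population standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical feature with one extreme outlier (1000).
values = [10, 12, 11, 13, 12, 1000]
normalized = min_max_scale(values)
standardized = z_score_scale(values)

print([round(v, 4) for v in normalized])
print([round(v, 3) for v in standardized])
```

In this example min-max scaling squeezes the five ordinary values into less than 1% of the [0, 1] range. The z-scores of those values also end up close together, because the outlier inflates σ, which is why the outlier sensitivity of standardization is better described as moderate than as absent.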

Property	Normalization	Standardization
Output range	[0, 1]	Unbounded (typically −3 to +3)
Outlier sensitivity	Very high	Moderate
Preserves shape	Yes	Yes
Best algorithms	k-NN, K-Means, Neural Nets	PCA, SVM, Logistic Reg, LDA
Use when Gaussian?	Not required	Preferred

The Golden Rule

Always fit the scaler on training data only. Apply the fitted scaler (same μ, σ or min, max) to test and validation data. Fitting on test data causes data leakage — your model will appear better than it actually is on new data.
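The rule above amounts to separating a fit step from a transform step. The class below is a hypothetical minimal scaler written for illustration (it is not a library API), and the train and test values are invented.

```python
import statistics


class SimpleStandardScaler:
    """Illustrative z-score scaler that stores training statistics only."""

    def fit(self, train_values):
        # Learn mu and sigma from the training split and nothing else.
        self.mean_ = statistics.fmean(train_values)
        self.std_ = statistics.pstdev(train_values)
        return self

    def transform(self, values):
        # Reuse the stored training statistics for any split.
        return [(v - self.mean_) / self.std_ for v in values]


train = [2.0, 4.0, 6.0, 8.0]
test = [10.0, 0.0]

scaler = SimpleStandardScaler().fit(train)  # fitted on training data only
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)        # same mu and sigma, no refitting
print(scaler.mean_, round(scaler.std_, 3), [round(v, 3) for v in test_scaled])
```

Scaled test values may fall outside the range seen in training; that is expected and is not a reason to refit on the test split.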

Quick Decision Rule: Using tree-based models (Random Forest, XGBoost)? → No scaling needed. Distance-based? → Normalize. Statistics/regression/PCA? → Standardize. Neural networks? → Either, but normalize for image pixels.

Epidemiology 📅 18 March 2026 ⏱ 7 min

Fundamentals of Epidemiologic Research Design

Epidemiology is the study of how disease is distributed in populations and what determines that distribution. Choosing the right study design — observational or experimental — is the most consequential methodological decision a researcher makes, because different designs answer fundamentally different causal questions.

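Exposure, Outcome, and Population

Every epidemiologic study investigates the relationship between an exposure (risk factor, treatment, environment) and an outcome (disease, death, recovery) in a defined population. (The classic "epidemiologic triad" refers to something different: agent, host, and environment.) The fundamental measures of disease frequency are the incidence rate (new cases per unit of person-time), prevalence (the proportion of the population with the disease at a point in time), and risk (the probability of developing disease over a specified period).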
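The three frequency measures differ mainly in their denominators, which a short sketch makes concrete. All counts below are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def incidence_rate(new_cases, person_time):
    """New cases per unit of person-time at risk."""
    return new_cases / person_time


def prevalence(existing_cases, population):
    """Proportion of the population with the disease at one point in time."""
    return existing_cases / population


def risk(new_cases, population_at_risk):
    """Probability of developing disease over the follow-up period."""
    return new_cases / population_at_risk


# Hypothetical cohort: 30 new cases over 12,000 person-years of follow-up,
# among 4,000 people initially at risk; 150 existing cases in a town of 10,000.
rate_per_1000_py = incidence_rate(30, 12_000) * 1000
point_prevalence = prevalence(150, 10_000)
cumulative_risk = risk(30, 4_000)

print(f"incidence rate: {rate_per_1000_py:.1f} per 1,000 person-years")
print(f"prevalence: {point_prevalence:.1%}")
print(f"risk over follow-up: {cumulative_risk:.2%}")
```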

Observational Study Designs

Design	Direction	Measure	Best for	Key Limitation
Cohort (prospective)	Exposure → Outcome	Risk Ratio, Rate Ratio	Rare exposures, incidence estimation	Expensive, long follow-up
Case-Control	Outcome → Exposure	Odds Ratio	Rare outcomes, quick & cheap	Recall bias, selection bias
Cross-sectional	Simultaneous	Prevalence Ratio	Prevalence, hypothesis generation	Cannot establish temporality
Ecological	Group-level	Correlation	Policy-level analysis	Ecological fallacy

Randomised Controlled Trial (RCT) — The Gold Standard

Random allocation of participants to treatment or control balances confounders across arms in expectation, including the unknown lurking variables that bias observational studies. However, RCTs are expensive, sometimes unethical (participants cannot be randomised to smoke), and often have limited external validity because trial populations differ from real-world patients.

Measures of Association

Risk Ratio (RR): RR = Risk_exposed / Risk_unexposed. RR = 1 → no association; RR > 1 → positive association; RR < 1 → negative (protective) association. Odds Ratio (OR): OR = (a/b) / (c/d) = ad/bc in a 2×2 table, where a and b are the exposed cases and non-cases and c and d are the unexposed cases and non-cases. When the outcome is rare, OR ≈ RR. Attributable Risk (AR): AR = Risk_exposed − Risk_unexposed, the absolute amount of disease attributable to the exposure.
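A worked example may help. The sketch assumes the common 2×2 layout in which a and b are exposed cases and non-cases and c and d are unexposed cases and non-cases; the counts are hypothetical.

```python
def measures_of_association(a, b, c, d):
    """Return (RR, OR, AR) from a 2x2 table.

    a = exposed cases,   b = exposed non-cases
    c = unexposed cases, d = unexposed non-cases
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    risk_ratio = risk_exposed / risk_unexposed
    odds_ratio = (a / b) / (c / d)
    attributable_risk = risk_exposed - risk_unexposed
    return risk_ratio, odds_ratio, attributable_risk


# Common outcome: 30% risk in the exposed, 10% in the unexposed.
rr_common, or_common, ar_common = measures_of_association(30, 70, 10, 90)

# Rare outcome: 0.3% risk in the exposed, 0.1% in the unexposed.
rr_rare, or_rare, ar_rare = measures_of_association(3, 997, 1, 999)

print(f"common outcome: RR={rr_common:.2f}, OR={or_common:.2f}, AR={ar_common:.2f}")
print(f"rare outcome:   RR={rr_rare:.2f}, OR={or_rare:.2f}, AR={ar_rare:.4f}")
```

With the common outcome the OR (about 3.86) overstates the RR of 3.0; with the rare outcome the two nearly coincide, which is the rare-disease approximation mentioned above.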

Hierarchy of Evidence: Systematic review/meta-analysis → RCT → Prospective cohort → Case-control → Cross-sectional → Case report. Designs earlier in this sequence generally provide stronger causal evidence, but observational studies remain essential for questions where trials are impossible or unethical.

Climate Health 📅 5 March 2026 ⏱ 6 min

Climate Change and Its Effects on Human Health

The World Health Organization estimates that climate change will cause approximately 250,000 additional deaths per year between 2030 and 2050 from malnutrition, malaria, diarrhoea, and heat stress alone. The pathways are multiple and intersecting: temperature, extreme events, vector ecology, food systems, and mental health.

Pathways from Climate to Health

Climate change affects health through direct pathways (heat stress, extreme weather injuries, UV exposure) and indirect pathways (vector-borne diseases, food insecurity, water contamination, displacement, mental health). The indirect pathways are more complex and harder to model, but often cause greater total morbidity.

Climate Driver	Health Outcome	Vulnerable Population
Rising temperatures	Heat stroke, cardiovascular stress, preterm birth	Elderly, outdoor workers, pregnant women
Vector habitat expansion	Dengue, malaria, chikungunya shifting northward	Previously non-endemic regions
Flooding & storms	Diarrhoeal disease, leptospirosis, trauma, displacement	Coastal and low-lying communities
Drought & crop failure	Malnutrition, stunting, child mortality	Subsistence farmers in South Asia, Africa
Air quality degradation	COPD, asthma exacerbation, lung cancer	Urban poor, children

Bangladesh as a Case Study

Bangladesh is ranked among the most climate-vulnerable countries. Cyclone Amphan (2020) displaced 2.4 million people. Annual flooding submerges 20–25% of the country, contaminating tube wells and triggering diarrhoeal outbreaks. Average temperatures have risen 0.5°C since 1960, extending the dengue transmission season by approximately 3 weeks per decade. These are not future projections — they are present realities demanding urgent public health action.

Climate-Health Research Methods

Key analytical tools include: distributed lag non-linear models (DLNM) for modelling temperature-mortality relationships; time-series analysis linking weather patterns to disease surveillance data; spatial epidemiology mapping disease burden changes with climate projections; and scenario modelling using IPCC pathways (SSP2-4.5, SSP5-8.5) to project future health impacts.

Climate Health 📅 20 February 2026 ⏱ 5 min

One Health: Bridging Human, Animal, and Ecosystem Health

Approximately 75% of emerging infectious diseases are zoonotic — originating in animals before jumping to humans. One Health is the integrated approach that unites human medicine, veterinary science, and environmental health to prevent pandemics at their source rather than respond after spillover.

What is One Health?

The One Health concept, formally endorsed by the WHO, FAO, UNEP, and WOAH (the "Quadripartite"), recognises that human health cannot be protected in isolation from animal health and healthy ecosystems. The three domains are not parallel — they overlap, interact, and mutually determine each other. Deforestation creates human-wildlife interfaces; factory farming breeds antimicrobial resistance; wetland destruction eliminates natural barriers to vector proliferation.

Classic One Health Examples

Disease	Animal Reservoir	Ecosystem Driver	Human Impact
COVID-19	Bats (likely)	Wildlife trade, urbanisation	Global pandemic, millions of deaths
Nipah virus	Fruit bats	Deforestation, date palm sap collection	Near-annual outbreaks in Bangladesh
Avian influenza H5N1	Wild birds, poultry	Live bird markets, migration routes	High CFR (roughly 50–60% of reported human cases)
Antimicrobial resistance	Livestock (all species)	Agricultural overuse of antibiotics	700,000 deaths/year; projected 10M by 2050

One Health in Bangladesh

Bangladesh has experienced multiple Nipah virus outbreaks traced to raw date palm sap contaminated by bat urine — a direct human-animal-environment interface. The response required simultaneous action in public health (case detection), veterinary surveillance (bat monitoring), and environmental management (sap collection practices). This is One Health in practice: no single sector could have solved it alone.

Research Implication: One Health research requires interdisciplinary teams of epidemiologists, veterinarians, ecologists, anthropologists, and data scientists. Methods include zoonotic disease modelling, genomic surveillance of pathogens across species, and network analysis of human-animal interfaces.

Explainable AI 📅 10 February 2026 ⏱ 7 min

Explainable AI: Making Black-Box Models Interpretable in Healthcare

Deep learning models can match or exceed specialist-level accuracy on some medical imaging tasks, yet clinicians cannot use them safely without understanding why a prediction was made. Explainable AI (XAI) bridges this trust gap by providing model-agnostic or model-specific explanations that connect predictions to clinical reasoning.

The Black-Box Problem in Clinical AI

A neural network predicting sepsis risk from 48 hours of ICU vital signs might achieve AUROC = 0.91 — but if the model triggers an alert, the clinician needs to know: which features drove this prediction? Was it lactate, temperature, or a subtle pattern across multiple vitals? Without this, clinicians cannot verify clinical plausibility, catch spurious correlations, or take targeted action.

Key XAI Methods

Method	Type	How it Works	Best For
SHAP (SHapley Additive exPlanations)	Model-agnostic	Attributes prediction to each feature using game-theoretic Shapley values	Tabular data, global + local explanations
LIME	Model-agnostic	Fits a local linear model in the neighbourhood of each prediction	Any model, quick local explanations
Grad-CAM	CNN-specific	Highlights image regions driving the classification using gradient flow	Medical imaging (X-ray, MRI, pathology)
Attention Weights	Transformer-specific	Visualises which tokens/timepoints the model attends to	NLP clinical notes, time-series EHR
Integrated Gradients	Deep learning	Attributes prediction to inputs by integrating gradients from baseline to input	Genomics, EHR, images

SHAP in Practice: Dengue Severity Prediction

In a study predicting severe dengue from clinical features at admission, SHAP values revealed that platelet count and haematocrit rise were the dominant predictors — consistent with clinical knowledge of dengue pathophysiology. Crucially, SHAP also flagged that the model was partly using "hospital ID" as a proxy feature — a spurious correlation that would fail catastrophically at a new site. XAI caught what accuracy metrics could not.
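To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a tiny hand-written risk function by averaging each feature's marginal contribution over all feature orderings. Everything here is an assumption for illustration: the toy model, its coefficients, the binary feature coding, and the all-zero baseline are invented, and this brute-force approach is not how the `shap` library estimates values for real models.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values; 'absent' features take their baseline value."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for ordering in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in ordering:
            current[name] = instance[name]  # switch this feature on
            value = predict(current)
            totals[name] += value - previous  # marginal contribution
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_risk(features):
    """Hypothetical severe-dengue risk score with one interaction term."""
    return (
        0.1
        + 0.4 * features["platelet_drop"]
        + 0.2 * features["haematocrit_rise"]
        + 0.2 * features["platelet_drop"] * features["haematocrit_rise"]
        + 0.05 * features["age_over_60"]
    )


patient = {"platelet_drop": 1, "haematocrit_rise": 1, "age_over_60": 0}
reference = {"platelet_drop": 0, "haematocrit_rise": 0, "age_over_60": 0}

phi = shapley_values(toy_risk, patient, reference)
print({name: round(value, 3) for name, value in phi.items()})
```

The attributions sum to the difference between the patient's score and the baseline score, and the interaction term is split evenly between the two features involved. A feature like the "hospital ID" proxy mentioned above would show up the same way: as a non-zero attribution that has no clinical justification.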

Regulatory and Ethical Dimensions

The EU AI Act (2024) treats many clinical AI systems as high-risk, which brings transparency and technical documentation obligations. The FDA's 2021 AI/ML-Based Software as a Medical Device Action Plan similarly emphasises transparency and traceability. Explainability is increasingly a regulatory and ethical expectation for clinical deployment rather than an optional extra.

Mental Health 📅 15 January 2026 ⏱ 5 min

Mental Health in the Era of Climate Anxiety and Ecological Grief

A 2021 global survey of 10,000 young people found that 59% were very or extremely worried about climate change, and 45% said their feelings about it negatively affected their daily functioning. Climate distress is emerging as a significant mental health burden requiring recognition, frameworks, and clinical response.

The Emerging Taxonomy of Climate-Related Distress

Eco-anxiety: Chronic fear of environmental doom and catastrophe. Solastalgia: Grief arising from the degradation of one's home environment (coined by philosopher Glenn Albrecht). Ecological grief: Mourning the loss of species, ecosystems, and places. Climate trauma: PTSD-like symptoms following direct exposure to extreme weather events such as cyclones, floods, or wildfires. These are not disorders of irrational thinking — they are rational emotional responses to real threats.

Epidemiology of Climate Mental Health

Post-disaster mental health studies consistently show elevated rates of depression, PTSD, and anxiety following climate events. After Cyclone Sidr (2007) in Bangladesh, researchers found PTSD prevalence of 31% among directly affected individuals one year post-disaster. After the 2022 Pakistan floods — the worst in its history — mental health services were overwhelmed simultaneously with physical trauma care.

Differential Vulnerability

Climate mental health impacts are not equally distributed. Young people (who face the longest future of climate impacts) show the highest eco-anxiety rates. Farmers and fisherfolk in climate-vulnerable livelihoods experience the highest rates of chronic stress and depression. Indigenous communities suffer unique solastalgia from the loss of culturally significant landscapes. Women in disaster-affected areas face compounded risks due to caregiving burdens and reduced autonomy.

Research and Clinical Responses

The field needs validated screening tools (the Climate Change Worry Scale, the Climate Distress Scale), longitudinal cohort studies linking climate exposures to mental health trajectories, and climate-informed psychotherapy adaptations. Crucially, addressing eco-anxiety is not merely therapeutic — it is also a driver of climate action. Channelling distress into meaningful engagement reduces paralysis and builds resilience.

Health Policy 📅 5 January 2026 ⏱ 6 min

Health Inequality: Understanding the Social Determinants of Disease

The social determinants of health — income, education, housing, employment, neighbourhood environment — explain more of the variation in population health outcomes than medical care does. Addressing health inequality requires upstream policy action, not just more clinics.

What are Social Determinants of Health (SDH)?

The WHO Commission on Social Determinants of Health (Marmot Commission) defines SDH as "the conditions in which people are born, grow, live, work and age." These structural conditions — income distribution, educational access, housing quality, food security, occupational safety, social protection — determine health long before a person ever enters a clinic.

The Social Gradient of Health

Health outcomes follow a near-continuous social gradient: with each step down the socioeconomic ladder, health worsens. This is not simply a binary "poor vs rich" effect — managers have worse health than executives; clerks have worse health than managers. The Whitehall Studies of British civil servants demonstrated this gradient across an employed, non-destitute population — ruling out absolute deprivation as the sole explanation.

Measurement in Bangladesh

SDH Domain	Metric	Bangladesh Inequality Gap
Income	Under-5 mortality	Poorest quintile: 65 per 1,000 live births; richest quintile: 20 per 1,000
Education	Stunting prevalence	Mothers with no education: 47%; mothers with higher education: 21%
Geography	Skilled birth attendance	Urban: 74%; Rural: 41%
Gender	Anaemia	Women: 36%; Men: 15%
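Each gap in the table can be summarised both as an absolute difference and as a relative ratio; the two can tell different stories, so reporting both is often useful. The sketch below takes its numbers directly from the table above; the data structure and helper name are illustrative choices.

```python
# (metric, group A, value A, group B, value B), values as listed in the table.
table = [
    ("Under-5 mortality (per 1,000)", "poorest quintile", 65, "richest quintile", 20),
    ("Stunting prevalence (%)", "mothers, no education", 47, "mothers, higher education", 21),
    ("Skilled birth attendance (%)", "urban", 74, "rural", 41),
    ("Anaemia (%)", "women", 36, "men", 15),
]


def gap_summary(value_a, value_b):
    """Absolute gap and larger-to-smaller ratio between two groups."""
    absolute_gap = abs(value_a - value_b)
    relative_ratio = max(value_a, value_b) / min(value_a, value_b)
    return absolute_gap, relative_ratio


summaries = {metric: gap_summary(a, b) for metric, _, a, _, b in table}
for metric, (gap, ratio) in summaries.items():
    print(f"{metric}: absolute gap = {gap}, ratio = {ratio:.2f}")
```

For under-5 mortality, for example, the gap is 45 deaths per 1,000 and the poorest quintile's rate is 3.25 times that of the richest.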

Policy Implications: Upstream vs Downstream

Downstream interventions (treating sick people) are necessary but insufficient. Upstream interventions — progressive taxation, housing standards, universal education, social protection — produce the largest and most equitable health gains per taka invested. The economic case is strong: the Marmot Review estimated that health inequalities cost England £31–33 billion annually in lost productivity. Reducing inequality is not just a moral imperative; it is economically rational.

Public Health 📅 20 December 2025 ⏱ 6 min

The Concept of Spillover Effects in Health Policy Evaluation

Standard RCT analysis assumes that a treated individual's outcomes depend only on their own treatment status (SUTVA, the Stable Unit Treatment Value Assumption). In public health, this assumption is routinely violated: vaccine coverage protects the unvaccinated; deworming some children improves outcomes for their untreated classmates; mental health programs change household dynamics. These spillovers are the rule, not the exception.

What is SUTVA and Why Does it Fail?

The Stable Unit Treatment Value Assumption (SUTVA) requires that (1) there is only one version of each treatment and (2) potential outcomes for any unit are unaffected by the treatment of other units. In infectious disease control, (2) is obviously false: vaccinating your neighbour reduces your infection risk. In nutrition programs, treating a sibling changes household food allocation. In mental health interventions, treating a depressed parent changes outcomes for untreated children.

Types of Spillover Effects

Type	Mechanism	Direction	Public Health Example
Herd immunity	Reduced pathogen circulation	Positive	Vaccine coverage protects unvaccinated
Behavioural spillover	Social norms change	Positive/Negative	Hand-washing programs change the behaviour of untreated neighbours
Resource reallocation	Household budget effects	Positive/Negative	Cash transfer to mother improves sibling nutrition
General equilibrium	Market/labour market changes	Often positive	Deworming increases wages for untreated workers
Congestion	Overcrowded services	Negative	Large vaccine campaign overwhelms clinics

Methods for Estimating Spillovers

Two-stage randomisation: First randomise clusters (villages, schools) to different treatment coverage levels, then randomise individuals within each cluster to treatment at the assigned coverage. Comparing outcomes for untreated individuals in high-coverage vs low-coverage clusters identifies the spillover effect. Network-based approaches: Model spillovers through social networks, estimating how treatment of network-connected individuals affects outcomes. Geographic regression discontinuity: Compare outcomes at boundaries of treated and untreated areas.
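The two-stage design can be sketched as a small simulation. All parameters are assumptions chosen for illustration (200 clusters of 50 people, coverage levels of 20% and 80%, a linear spillover of 2.0 per unit of coverage, and standard normal noise); real designs and outcome models will differ.

```python
import random
import statistics


def simulate_two_stage(n_clusters=200, cluster_size=50, coverages=(0.2, 0.8),
                       direct_effect=1.0, spillover_per_unit_coverage=2.0, seed=0):
    """Return the untreated-only contrast between high- and low-coverage clusters."""
    rng = random.Random(seed)

    # Stage 1: randomise clusters to a coverage level (half low, half high).
    assigned = [coverages[i % 2] for i in range(n_clusters)]
    rng.shuffle(assigned)

    untreated_outcomes = {coverage: [] for coverage in coverages}
    for coverage in assigned:
        # Stage 2: randomise which individuals in the cluster are treated.
        n_treated = round(coverage * cluster_size)
        treated_flags = [1] * n_treated + [0] * (cluster_size - n_treated)
        rng.shuffle(treated_flags)
        for treated in treated_flags:
            outcome = (direct_effect * treated
                       + spillover_per_unit_coverage * coverage
                       + rng.gauss(0.0, 1.0))
            if not treated:
                untreated_outcomes[coverage].append(outcome)

    low, high = coverages
    return (statistics.fmean(untreated_outcomes[high])
            - statistics.fmean(untreated_outcomes[low]))


estimate = simulate_two_stage()
true_contrast = 2.0 * (0.8 - 0.2)  # spillover built into the simulation
print(f"estimated spillover contrast: {estimate:.2f} (simulated truth: {true_contrast:.2f})")
```

Because only untreated individuals enter the comparison, the direct treatment effect drops out and the difference in means reflects the spillover alone.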

The Kenya Deworming Controversy

The Kenya Primary School Deworming Project (Miguel & Kremer, 2004) reported large positive externalities: the estimated benefit to untreated children in treated schools was large enough that the total social benefit substantially exceeded the direct benefit to treated children. This spillover-inclusive evaluation sharply improved the estimated cost-effectiveness of deworming, and the study became a leading argument that deworming is among the most cost-effective development interventions. Later reanalyses disputed parts of these findings, which is the source of the controversy. Either way, the case shows that ignoring spillovers can produce seriously misleading policy guidance in either direction.

Health Policy 📅 1 December 2025 ⏱ 5 min

Health Inequality and Climate Change: Intersecting Crises

Sub-Saharan Africa contributes 3% of global cumulative CO₂ emissions but bears 25% of climate-related disease burden. Bangladesh emits 0.5 tonnes of CO₂ per capita vs 14.7 tonnes in the USA, yet faces existential climate risks. This article examines the climate-inequality nexus and its implications for global health equity.

The Double Injustice of Climate and Health

The communities with the smallest carbon footprints face the largest climate health burdens. This is a double injustice: they bear costs they did not create, and their lower adaptive capacity means they suffer worse health consequences per unit of climate exposure. A Bangladeshi farmer facing monsoon floods has fewer financial, geographic, and informational options than a Dutch farmer facing the same flood risk, because the Netherlands has invested centuries of wealth into flood protection infrastructure.

Mechanisms of Compounding Disadvantage

Occupational exposure: The poor are disproportionately employed in outdoor, climate-exposed work (agriculture, construction, fishing) with no option to work remotely or in air-conditioned environments. Housing quality: Informal settlements without adequate insulation, cooling, or storm protection amplify heat and flood risk. Healthcare access: When climate disasters strike, the poorest are furthest from functioning health infrastructure. Nutrition: Climate crop failure first hits subsistence farmers, not supermarket shoppers.

Loss and Damage: A Policy Framework

The historic COP27 (2022) agreement to establish a Loss and Damage fund acknowledged that, beyond mitigation and adaptation, there are irreversible climate harms (loss of land, lives, and cultural heritage) that require dedicated financing. It is widely described as a milestone for climate justice, although the fund is framed as financial support rather than legal liability, and it has direct implications for health financing in vulnerable nations.

Research Priority: Developing-country researchers must lead climate-health equity research because they understand local contexts, have access to disaggregated data, and bear personal witness to the realities being studied.