1. Introduction to Point Process Models

Context: Investigating raised incidence of rare diseases near environmental pollution sources (e.g., cancer near nuclear installations).

Problem: Determining association between disease clusters and specific point sources.

Methodological Issues: Rare phenomenon requires preserving continuous spatial setting. Natural environmental heterogeneity causes spatial variation in intensity. Ambiguity in defining "cluster" must be avoided.

2. The Poisson Point Process Model

2.1. Model Formulation

Data: Set of events $x_i$ in a planar region $A$, representing locations of phenomena.

Inhomogeneous spatial Poisson point process with intensity function $\lambda(x)$.

Multiplicative decomposition of intensity:

$$ \lambda(x) = \rho \lambda_0(x) f(x - x_0; \theta) $$

$\rho$: Overall number of events per unit area.

$\lambda_0(x)$: Spatial variation in intensity without association to $x_0$.

$f(x - x_0; \theta)$: Change in intensity relative to prespecified point $x_0$.

$\theta$: Parameters describing the effect of $x_0$.

$f(x; \theta) = 1$ for all $x$ when $\theta = 0$, representing no association.

2.2. Specification of $\lambda_0(x)$

Uses collateral data $\{y_j \in A: j = 1, \dots, m\}$ from a related phenomenon (e.g., a more common cancer not associated with $x_0$).

Estimated via a kernel estimator:

$$ \lambda_0(x; h) = h^{-2} \sum_{j=1}^m G\{(x - y_j)/h\} $$

$G(\cdot)$: Radially symmetric bivariate probability density function (e.g., Gaussian kernel).

$h$: Tuning constant (bandwidth) for smoothing.

Gaussian kernel: $G(x) = (2\pi)^{-1} \exp\{-\frac{1}{2}x'x\}$.

2.3. Specification of $f(x; \theta)$

Describes the "raised incidence" around $x_0$.

Typically unimodal with maximum at $x_0$ and decaying towards a constant value away from $x_0$.

A common parametric form for $\theta = (\alpha, \beta)$:

$$ f(x; \alpha, \beta) = 1 + \alpha \exp\{-\beta g(x'x)\} $$

$\alpha \ge 0, \beta \ge 0$.

$g(\cdot)$: Monotone non-decreasing with $g(0) = 0$.
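As a minimal sketch of the two model components defined in Sections 2.2 and 2.3, the Gaussian kernel estimator and the raised-incidence function might be coded as follows. The function names `kernel_intensity` and `raised_incidence` and the use of NumPy are assumptions made for illustration; `g` defaults to the identity, which is one admissible choice among monotone non-decreasing functions with $g(0) = 0$.

```python
import numpy as np

def kernel_intensity(x, y, h):
    """Gaussian-kernel estimate lambda_0(x; h) = h^-2 * sum_j G((x - y_j) / h).

    x: evaluation point(s), shape (2,) or (k, 2)
    y: collateral events y_j, shape (m, 2)
    h: bandwidth (same length unit as the coordinates)
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    y = np.atleast_2d(np.asarray(y, dtype=float))
    diff = (x[:, None, :] - y[None, :, :]) / h      # shape (k, m, 2)
    sq = np.sum(diff ** 2, axis=2)                  # scaled squared distances
    G = np.exp(-0.5 * sq) / (2.0 * np.pi)           # bivariate Gaussian kernel
    return G.sum(axis=1) / h ** 2

def raised_incidence(x, x0, alpha, beta, g=lambda s: s):
    """f(x - x0; alpha, beta) = 1 + alpha * exp(-beta * g(|x - x0|^2)).

    g is an illustrative hook; the identity default is an assumption.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    sq_dist = np.sum((x - np.asarray(x0, dtype=float)) ** 2, axis=1)
    return 1.0 + alpha * np.exp(-beta * g(sq_dist))
```

With $\alpha = 0$ the second function returns 1 everywhere, which is the no-association case of Section 2.1; at $x = x_0$ it returns $1 + \alpha$.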
For practical purposes, a common form is $f(x; \alpha, \beta) = 1 + \alpha \exp\{-\beta x'x\}$.

3. Maximum Likelihood Estimation

3.1. Parameter Estimation

Given $\theta$, the maximum likelihood estimate of $\rho$ is:

$$ \hat{\rho}(\theta) = n \left( \int_A \lambda_0(x) f(x - x_0; \theta) dx \right)^{-1} $$

Profile log-likelihood for $\theta$ (up to an additive constant that does not depend on $\theta$):

$$ L(\theta) = \sum_{i=1}^n \log \{f(x_i - x_0; \theta)\} - n \log \left( \int_A \lambda_0(x) f(x - x_0; \theta) dx \right) $$

For the Gaussian kernel $\lambda_0(x)$ and $f(x; \alpha, \beta) = 1 + \alpha \exp\{-\beta x'x\}$:

$$ L(\alpha, \beta) = \sum_{i=1}^n \log\{1 + \alpha \exp(-\beta e_i)\} - n \log\{w(\alpha, \beta)\} $$

where $e_i$ is the squared distance from $x_i$ to $x_0$, and $w(\alpha, \beta) = \int_A \lambda_0(x) f(x - x_0; \alpha, \beta) dx$, which for the Gaussian kernel is a term involving kernel sums.

Derivatives of $L(\alpha, \beta)$ with respect to $\alpha$ and $\beta$ are used for optimization.

Hypothesis Test: $H_0: \alpha = \beta = 0$ (no association) vs. $H_1: \alpha > 0$ or $\beta > 0$.

Likelihood ratio test statistic: $D = 2\{L(\hat{\alpha}, \hat{\beta}) - L(0,0)\}$.

Compare $D$ to critical values of the $\chi^2$ distribution with 2 degrees of freedom, or to a Monte Carlo reference distribution for robustness.

3.2. Goodness-of-fit Assessment

Transformations of data points $t_i = \int_0^{r_i} \lambda(x) dx$ for ordered distances $r_i$ from $x_0$; the integral is read as the fitted intensity accumulated over the part of $A$ within distance $r_i$ of $x_0$.

Under the fitted model, the $t_i$ should follow a homogeneous 1D Poisson process.

Quantities $u_i = t_i/t_n$ should be order statistics from the uniform distribution on $(0,1)$.

Intervals $\delta_i = t_{i+1} - t_i$ should be independent samples from an exponential distribution.

Assess uniformity of $u_i$ (e.g., Kolmogorov-Smirnov test) and exponentiality of $\delta_i$ (e.g., Q-Q plot).

4. Application Example (Chorley-Ribble Cancer Data)

Data:

$x_0$: Location of a disused industrial incinerator.

$x_i$: Locations of larynx cancer cases ($n=58$).

$y_j$: Locations of lung cancer cases ($m=978$, used for $\lambda_0(x)$).

Kernel Smoothing for $\lambda_0(x)$: Optimal bandwidth $h$ determined by minimizing MSE.
Example: $h=0.15$ km for Gaussian kernel.

Maximum Likelihood Estimates:

$(\hat{\alpha}, \hat{\beta}) = (23.67, 0.91)$ with standard errors.

Maximized log-likelihood $L(\hat{\alpha}, \hat{\beta}) = -394.59$.

Log-likelihood for null hypothesis $L(0,0) = -399.36$.

Likelihood ratio statistic $D = 2\{-394.59 - (-399.36)\} = 9.54$.

P-value $P\{\chi^2_2 > 9.54\} = 0.008$, indicating strong evidence of association.

Goodness-of-Fit:

Q-Q plots of simulated $D$ values against the $\chi^2$ distribution show a better fit with 1 degree of freedom than with 2, which suggests that the 2-degree-of-freedom test is conservative.

Monte Carlo assessment formally rejects the null hypothesis at the 1% significance level.

Empirical distribution function of $u_i$ and Q-Q plot for $\delta_i$ show no significant departure from uniformity or exponentiality.

5. Discussion and Extensions

Benefits of Approach:

Avoids arbitrary aggregation into discrete regions.

Flexible description of natural spatial variation.

Focuses on quantitative description of variation around a prespecified point, rather than artificial cluster definitions.

Possible Extensions:

Widen class of functions for $f(\cdot)$ (e.g., directional dependence, multiple sources).

Alternative collateral information for $\lambda_0(\cdot)$ (e.g., demographic data, geostatistical spatial prediction).

Log-linear regression model for $\lambda_0(y) = \exp\{\sum_{k=1}^K \beta_k Z_k(y)\}$.

Conditioning on events $x_i$ and $y_j$ for permutation tests (e.g., sum of squared distances to $x_0$).

Limitations:

Assumes conditional independence of $x$-events given $\lambda_0(\cdot)$ and $f(\cdot)$.

Sensitivity to tight clusters: deleting events from a cluster can significantly alter results.

Problem of retrospective hypothesis formulation based on data inspection.
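
The profile log-likelihood and likelihood ratio test of Section 3.1 can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure used to obtain the estimates in Section 4: the integral defining $w(\alpha, \beta)$ is taken over the whole plane rather than over $A$ (no edge correction), which for the Gaussian kernel gives $w(\alpha, \beta) = \sum_{j=1}^m [1 + \alpha (1 + 2\beta h^2)^{-1} \exp\{-\beta d_j / (1 + 2\beta h^2)\}]$ with $d_j$ the squared distance from $y_j$ to $x_0$. The maximiser is a plain grid search standing in for derivative-based optimization, and the names `profile_loglik` and `lr_test` are hypothetical.

```python
import math
import numpy as np

def profile_loglik(alpha, beta, e, d, h):
    """L(alpha, beta) = sum_i log{1 + alpha*exp(-beta*e_i)} - n*log w(alpha, beta).

    e: squared distances from the cases x_i to x0
    d: squared distances from the collateral events y_j to x0
    h: kernel bandwidth
    ASSUMPTION: w integrates over the whole plane (edge effects ignored).
    """
    e = np.asarray(e, dtype=float)
    d = np.asarray(d, dtype=float)
    c = 1.0 + 2.0 * beta * h ** 2
    w = np.sum(1.0 + (alpha / c) * np.exp(-beta * d / c))
    return np.sum(np.log1p(alpha * np.exp(-beta * e))) - e.size * np.log(w)

def lr_test(e, d, h, alphas, betas):
    """Grid-search maximiser and likelihood ratio statistic D (illustrative)."""
    l_hat, a_hat, b_hat = max(
        (profile_loglik(a, b, e, d, h), a, b) for a in alphas for b in betas
    )
    # Clamp at zero in case the grid omits alpha = 0 and misses the null value.
    D = max(2.0 * (l_hat - profile_loglik(0.0, 0.0, e, d, h)), 0.0)
    p_2df = math.exp(-D / 2.0)             # P{chi^2_2 > D}
    p_1df = math.erfc(math.sqrt(D / 2.0))  # P{chi^2_1 > D}
    return a_hat, b_hat, D, p_2df, p_1df
```

Two arithmetic checks against the figures quoted in Section 4: with $\alpha = 0$ the function returns $-n \log m$, which for $n = 58$ and $m = 978$ is about $-399.36$, the quoted value of $L(0,0)$; and $D = 9.54$ gives $\exp(-4.77) \approx 0.008$ for the 2-degree-of-freedom p-value.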