Skip to content

Location-Allocation (P-Median)

The location-allocation problem answers the question: given N potential sites, where should I place P facilities to minimize total customer travel? It’s the mathematical core of network planning for retail chains, hospitals, fire stations, and any service with physical locations.

Minimize Z = Σᵢ Σⱼ (wᵢ × dᵢⱼ × xᵢⱼ)

Subject to:

  • Each demand point i is assigned to exactly one facility j
  • Exactly P facilities are open
  • A demand point can only be assigned to an open facility

Where:

  • wᵢ = demand (population, spending power) at point i
  • dᵢⱼ = distance/travel time from demand point i to facility j
  • xᵢⱼ = 1 if point i is served by facility j, 0 otherwise
  • P = number of facilities to locate

“Facility location problems involve flows… the p-median problem minimizes total weighted distance.” — de Smith, Goodchild & Longley, Geospatial Analysis, §7.4

DatasetWhat We UseLink
Census PopulationDemand points (population per zone)View →
MTR Stations & LinesNetwork distance matrix between demand and supply pointsView →
FEHD Restaurant LicencesExisting facility locations (competitor network)View →
Rental Indices (RVD)Facility cost constraintsView →
ModelObjectiveUse Case
P-medianMinimize average distanceRetail chains, post offices
P-centreMinimize maximum distanceEmergency services, hospitals
CoverageMaximize population within threshold distanceAmbulance stations, cell towers
Maximal coverageCover max demand with budget constraintAny service with fixed budget
  1. Define demand points — census blocks, residential areas, each with a population weight
  2. Define candidate sites — all possible locations you’d consider opening
  3. Calculate distance matrix — travel time or distance from every demand point to every candidate
  4. Run optimization — find the P sites that minimize total weighted distance
  5. Assign customers — each demand point goes to its nearest open facility

de Smith et al. describe three practical approaches:

Start with no facilities. Add the one that reduces total distance the most. Repeat P times. Fast but can get stuck in local optima.

Start with P random facilities. Try swapping each one with a non-selected candidate. Accept swaps that reduce total distance. Repeat until no improvement. Better solutions, slower convergence.

Relax the assignment constraint, solve a simpler problem, then iteratively tighten. Provides bounds on how close your solution is to optimal.

For evaluating one specific site, location-allocation answers a critical question:

“Is this site in the optimal set?”

If you run p-median for P=5 branches in a district, and your target site doesn’t appear in any solution, that’s a strong signal. If it consistently appears, that validates the location.

Strengths:

  • Mathematically rigorous optimization
  • Considers the entire network simultaneously (not just one catchment)
  • Answers “where” not just “how good”
  • Scales to regional and national planning

Limitations:

  • Requires complete distance matrix (expensive to compute)
  • NP-hard — heuristics needed for real-world size
  • Assumes customers always go to nearest facility (ignores brand preference, quality)
  • Static — doesn’t account for time-varying demand

The p-median model pairs naturally with the Huff Model and Spatial Interaction Model:

  1. P-median → identifies where to locate
  2. Huff/SIM → predicts how much revenue each location captures
  3. Agent simulation → tests what happens when consumers make complex choices

Demand point generation: 50 demand points distributed around the district centroid within a 2km radius using distributeAround(). These represent the spatial distribution of potential customers.

Demand coverage: Fraction of district area covered by an 800m walk radius:

demandCoverage = round((π × 0.8²) / totalDistrictArea × 100)

Where totalDistrictArea = totalPop / max(1, density) in km². Capped at 100%.

Greedy P-median search — tests candidate locations in order:

  1. District centroid
  2. Midpoint between current lat/lng and centroid
  3. 10 demand points directly (first 10 of the 50 generated)

Any candidate within 100m of an existing competitor is skipped. The candidate minimizing total summed distance to all 50 demand points is selected as “optimal”.

Optimality score: max(10, min(100, round(100 × exp(-distFromOptimal / 800))))

  • At optimal point → 100
  • 800m from optimal → ~37
  • Range hard-capped at 10–100

Weighted average distance: Mean haversine distance from all 50 demand points to the evaluated location (in metres).

DateChangeWhy
2026-03-25Optimality score range clamped to 10–100 (max(10, min(100, ...)))Remote locations with distFromOptimal >4km were scoring 0 or negative; minimum 10 preserves interpretability
2026-03-25Demand coverage formula uses π×0.8²/districtArea (not radius buffer count)Counting demand points in buffer was noisy for small districts; area fraction is deterministic
2026-03-24Initial implementation

📖 de Smith, M.J., Goodchild, M.F. & Longley, P.A. (2024). Geospatial Analysis: A Comprehensive Guide. §7.4: Facility Location — P-Median and P-Centre Problems.