Reliability
simweave.reliability provides an availability and maintainability simulation
framework built on top of the existing discrete-event and supply-chain
primitives. It is designed for scenarios where the operational availability
of a fleet of assets -- taxis, military vehicles, aircraft, industrial machines
-- is a key performance indicator driven by:
- failure rates of individual subsystems,
- spare parts holdings in one or more warehouses,
- repair-bay capacity and technician resources, and
- the financial cost of new buys vs. repairs.
The stacked area chart below shows a simulated year for an 8-vehicle taxi
fleet. Green = operational, amber = in repair, red = awaiting parts.
Calendar months are displayed by pairing the recorder with a
SimTimeAxis.
Concepts
Subsystem
A subsystem is any replaceable or repairable component fitted to an asset
(engine, gearbox, tyre set, avionics module, etc.). Each subsystem is
described by a SubsystemSpec and is
handled on failure in one of the following ways:
| Type | On failure |
|---|---|
| Consumable | Failed unit discarded; new unit drawn from warehouse stock |
| Repairable | Failed unit sent to a RepairCentre for repair |
| Beyond economic repair (BER) | A fraction of repairable failures that are uneconomical to fix → treated as new buy |
Times to failure follow an exponential (memoryless) distribution, the
standard model for electronic and mechanical components in the absence of
wear-out. Both a time-based rate (failure_rate, in failures/day) and a
cycle-based rate (failure_rate_per_cycle, in failures per km, sortie, etc.)
can be active simultaneously.
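As a minimal sketch of what an exponential model with both rates active implies for a single tick (an illustration under stated assumptions, not necessarily how simweave samples failures internally): independent hazards add, so the probability of at least one failure in a tick of length `dt` during which `delta_cycles` cycles accrue is `1 - exp(-(failure_rate * dt + failure_rate_per_cycle * delta_cycles))`. The helper name `tick_failure_probability` and the numeric values are hypothetical.

```python
import math
import random


def tick_failure_probability(failure_rate, dt, failure_rate_per_cycle=0.0, delta_cycles=0.0):
    """Probability of at least one failure during one tick (hypothetical helper)."""
    # Time-based and cycle-based hazards are assumed independent, so they add.
    hazard = failure_rate * dt + failure_rate_per_cycle * delta_cycles
    return 1.0 - math.exp(-hazard)


# Time-based rate only: MTBF of 120 days, one-day tick.
p_time_only = tick_failure_probability(failure_rate=1 / 120, dt=1.0)

# Both rates active: the same unit also accrues 200 km in the tick.
p_combined = tick_failure_probability(1 / 120, 1.0, failure_rate_per_cycle=1e-5, delta_cycles=200.0)

# A Bernoulli draw against that probability decides whether the unit fails this tick.
rng = random.Random(42)
failed_this_tick = rng.random() < p_combined
```

For small hazards the per-tick probability is close to the hazard itself, which is why a daily tick with an MTBF of 120 days gives a failure probability just under 1/120.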
ReliableEntity
A ReliableEntity inherits from
Entity and owns a list of
SubsystemSpec objects. On every
simulation tick it:
- Checks each UP subsystem for a random failure event.
- For newly failed subsystems, draws a spare part from the linked Warehouse.
- Retries AWAITING_PART subsystems every tick until stock is replenished.
- Submits a RepairJob to the RepairCentre once parts are in hand.
- Tracks cumulative operational time, downtime, and costs.
An entity is operational only when all of its subsystems are UP.
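The per-tick steps above can be read as a small state machine. The sketch below is a simplified stand-in, not the library's implementation: the `State` enum, `step`, and `is_operational` are hypothetical names, and stock is reduced to a single integer counter.

```python
from enum import Enum, auto


class State(Enum):  # hypothetical stand-in for SubsystemState
    UP = auto()
    AWAITING_PART = auto()
    IN_REPAIR = auto()


def step(state, failed, stock, repair_done):
    """Advance one subsystem by one tick; returns (new_state, new_stock)."""
    if state is State.UP and failed:
        state = State.AWAITING_PART        # a newly failed unit needs a part first
    if state is State.AWAITING_PART and stock > 0:
        return State.IN_REPAIR, stock - 1  # part drawn, repair job submitted
    if state is State.IN_REPAIR and repair_done:
        return State.UP, stock             # repair centre calls back on completion
    return state, stock


def is_operational(states):
    """An entity is operational only when every subsystem is UP."""
    return all(s is State.UP for s in states)


# A failure with an empty warehouse waits for stock, then is repaired.
s, stock = step(State.UP, failed=True, stock=0, repair_done=False)
waiting = s
s, stock = step(s, failed=False, stock=1, repair_done=False)
repairing = s
s, stock = step(s, failed=False, stock=stock, repair_done=True)
```

In this walk-through the subsystem passes through AWAITING_PART and IN_REPAIR before returning to UP, and the entity counts as down for the whole interval.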
RepairCentre
RepairCentre is a subclass of
Service. It inherits all of Service's
queuing and multi-channel machinery. Model a repair team under the
operator's employment by passing a
ResourcePool of technicians. For a
third-party maintenance contract simply tune capacity and buffer_size to
reflect the contracted throughput.
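When choosing a capacity, a rough back-of-envelope check is the offered load: the average number of repair channels kept busy, estimated as the sum over subsystems of fleet size × failure rate × repair time. The sketch below is standard queueing arithmetic rather than anything computed by simweave; `offered_load` is a hypothetical helper, the figures are illustrative, and the estimate assumes every entity is always exposed to failure.

```python
def offered_load(fleet_size, subsystems):
    """Average number of busy repair channels (hypothetical helper).

    subsystems is a list of (failure rate per day, repair time in days) pairs.
    """
    return sum(fleet_size * rate * repair_time for rate, repair_time in subsystems)


# Two illustrative subsystems: a slow-to-repair engine and quick-to-fit tyres.
subsystems = [(1 / 120, 5.0), (1 / 45, 0.5)]
load = offered_load(fleet_size=10, subsystems=subsystems)

capacity = 2
utilisation = load / capacity
```

A utilisation close to or above 1 suggests jobs will queue faster than the bays can clear them, so either capacity or buffer_size would need to grow.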
On each job completion the RepairCentre:
- Returns repaired units to warehouse stock (repairable, non-BER cases).
- Records cost and counters (total_newbuys, total_repairs, total_cost).
- Calls back into the owning ReliableEntity to restore the subsystem to UP.
Fleet and FleetAvailabilityRecorder
Fleet is a thin wrapper around a list of
ReliableEntity instances with aggregate properties:
| Property | Description |
|---|---|
| `operational_count` | Entities fully operational right now |
| `operational_availability` | Fraction of fleet operational right now |
| `mean_availability` | Mean of each entity's time-based empirical Ao |
| `total_cost` | Sum of new-buy + repair costs across the fleet |
FleetAvailabilityRecorder
is registered with the environment and snapshots fleet state each tick.
Its times, operational, in_repair, and awaiting_part lists are
fed directly to
plot_fleet_availability.
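To make the relationship between the per-tick snapshots and a time-averaged availability concrete, here is a small sketch using hand-written lists shaped like the recorder's operational, in_repair, and awaiting_part attributes. The numbers are invented, and equal-length ticks are assumed; whether the library weights snapshots by tick length is not stated here.

```python
# Invented snapshots for an 8-vehicle fleet over five equal ticks.
fleet_size = 8
operational = [8, 7, 7, 6, 8]
in_repair = [0, 1, 0, 1, 0]
awaiting_part = [0, 0, 1, 1, 0]

# Every vehicle falls into exactly one of the three states at each snapshot.
totals = [o + r + a for o, r, a in zip(operational, in_repair, awaiting_part)]

# Instantaneous availability per tick, then its simple time average.
instantaneous = [n / fleet_size for n in operational]
mean_ao = sum(instantaneous) / len(instantaneous)

print(f"Time-averaged availability: {mean_ao:.3f}")
```

With these values the fleet averages 7.2 operational vehicles out of 8, so the printed figure is 0.900.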
Sensitivity Analysis
sensitivity_sweep varies one or
two scalar parameters of a scenario builder function across a grid and
collects a scalar metric (e.g. Ao) from each cell. Monte Carlo averaging is
supported via the n_runs argument.
```python
from simweave.reliability import sensitivity_sweep


def build(n_bays, stock_mult, seed):
    # ... build and run scenario ...
    return operational_availability  # scalar


result = sensitivity_sweep(
    build,
    param1_name="repair_bays",
    param1_values=[1, 2, 3, 4],
    param2_name="stock_multiplier",
    param2_values=[0.5, 1.0, 1.5, 2.0],
    metric_name="Ao",
    n_runs=30,
)
```
The SweepResult can be passed to
plot_sensitivity_surface for a 3-D
surface, heatmap, or grouped bar chart.
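Conceptually, a 2-D sweep with Monte Carlo averaging evaluates the builder at every grid cell several times and stores the mean and spread per cell. The following standard-library sketch illustrates that idea only: `toy_scenario` and `grid_sweep` are hypothetical, the toy metric is made up, and the seed arithmetic is an illustrative scheme, not the derivation simweave uses.

```python
import itertools
import random
import statistics


def toy_scenario(n_bays, stock_mult, seed):
    """Hypothetical stand-in for a scenario builder: returns a noisy availability."""
    rng = random.Random(seed)
    base = 1.0 - 0.4 / (n_bays * stock_mult + 1.0)
    return min(1.0, max(0.0, base + rng.gauss(0.0, 0.01)))


def grid_sweep(builder, values1, values2, n_runs, seed=0):
    """Evaluate builder on every grid cell, averaging n_runs replicates per cell."""
    mean, std = {}, {}
    for (i, v1), (j, v2) in itertools.product(enumerate(values1), enumerate(values2)):
        cell = i * len(values2) + j
        # Illustrative seeding: each replicate of each cell gets a distinct integer.
        runs = [builder(v1, v2, seed + cell * n_runs + r) for r in range(n_runs)]
        mean[(i, j)] = statistics.fmean(runs)
        std[(i, j)] = statistics.pstdev(runs)
    return mean, std


mean, std = grid_sweep(toy_scenario, [1, 2, 3, 4], [0.5, 1.0, 1.5, 2.0], n_runs=30)
```

Each key of `mean` and `std` is a grid index pair, which mirrors the idea of a 2-D array of cell means and standard deviations.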
Quick start
```python
import numpy as np
import simweave as sw

# 1. Describe subsystems
specs = [
    sw.SubsystemSpec(
        name="engine",
        failure_rate=1/120,  # MTBF = 120 days
        sku_index=0,
        consumable=False,
        beyond_economic_repair_prc=0.10,
        repair_time=5.0,
        unit_cost=8_000.0,
        repair_cost=2_500.0,
    ),
    sw.SubsystemSpec(
        name="tyres",
        failure_rate=1/45,
        sku_index=1,
        consumable=True,
        repair_time=0.5,
        unit_cost=400.0,
    ),
]

# 2. Build warehouse
inv = sw.InventoryItems(
    part_names=["engine", "tyres"],
    unit_cost=[8_000.0, 400.0],
    stock_level=[3.0, 10.0],
    batchsize=[2.0, 4.0],
    reorder_points=[1.0, 2.0],
    repairable_prc=[0.90, 0.0],
    repair_times=[5.0, 0.0],
    newbuy_leadtimes=[14.0, 3.0],
)
warehouse = sw.Warehouse(inventory=inv, name="depot")

# 3. Build repair centre (2 bays, 3 technicians)
technicians = sw.ResourcePool(maxlen=3, name="technicians")
for i in range(3):
    technicians.deposit(sw.Resource(name=f"tech_{i}"))
repair_centre = sw.RepairCentre(capacity=2, resources=technicians)

# 4. Build fleet
rng = np.random.default_rng(42)
vehicles = [
    sw.ReliableEntity(
        subsystems=specs,
        warehouse=warehouse,
        repair_centre=repair_centre,
        name=f"taxi_{i:02d}",
        rng=np.random.default_rng(rng.integers(0, 2**32)),
    )
    for i in range(10)
]
fleet = sw.Fleet(vehicles, name="taxi_fleet")
recorder = sw.FleetAvailabilityRecorder(fleet)

# 5. Run
env = sw.SimEnvironment(dt=1.0, end=365.0)
env.register(warehouse)
env.register(repair_centre)
for v in vehicles:
    env.register(v)
env.register(recorder)
env.run(until=365.0)

# 6. Summarise
print(f"Operational availability: {recorder.mean_operational_availability:.3f}")
print(f"Total fleet cost: £{fleet.total_cost:,.0f}")

# 7. Plot
fig = sw.plot_fleet_availability(recorder, title="Taxi Fleet Availability")
fig.show()
```
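Before trusting a simulated result, it can help to estimate how many failures the quick-start configuration should produce. The sketch below multiplies fleet size, horizon, and failure rate for each subsystem. Because only UP subsystems are checked for failure, time spent down reduces exposure, so these figures are best read as upper bounds on the expected counts. The variable names are illustrative.

```python
fleet_size = 10
horizon_days = 365.0

# Time-based failure rates from the quick-start specs (failures per day).
rates = {"engine": 1 / 120, "tyres": 1 / 45}

# Expected failures if every vehicle were exposed for the whole year.
expected = {name: fleet_size * horizon_days * rate for name, rate in rates.items()}

for name, count in expected.items():
    print(f"{name}: at most about {count:.0f} failures expected over the year")
```

That works out to roughly 30 engine failures and 81 tyre failures, far more than the initial stock of 3 engines and 10 tyre sets, so the run depends heavily on repairs and reorders.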
API reference
SubsystemSpec
dataclass
SubsystemSpec(name: str, failure_rate: float, sku_index: int, consumable: bool = True, beyond_economic_repair_prc: float = 0.0, repair_time: float = 1.0, unit_cost: float = 0.0, repair_cost: float = 0.0, failure_rate_per_cycle: float = 0.0)
Immutable description of one subsystem fitted to a ReliableEntity.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Human-readable label (e.g. …) | required |
| `failure_rate` | `float` | Time-based failure rate … | required |
| `sku_index` | `int` | Index into the associated … | required |
| `consumable` | `bool` | If … | `True` |
| `beyond_economic_repair_prc` | `float` | Fraction of failures on a repairable subsystem that are beyond economic repair and therefore require a new buy instead. Ignored when … | `0.0` |
| `repair_time` | `float` | Nominal repair / fit time in simulation time units. This becomes the … | `1.0` |
| `unit_cost` | `float` | Cost charged per new unit purchased (new buy or BER replacement). | `0.0` |
| `repair_cost` | `float` | Cost charged per repair (non-BER repairable failure). | `0.0` |
| `failure_rate_per_cycle` | `float` | Cycle-based failure rate in failures per operational cycle. Set to … | `0.0` |
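The cost fields combine with the BER fraction to give an expected cost per failure. As a worked sketch based on the parameter descriptions above (unit_cost for a new buy or BER replacement, repair_cost for a non-BER repair), with `expected_cost_per_failure` being a hypothetical helper rather than part of the API:

```python
def expected_cost_per_failure(consumable, ber_fraction, unit_cost, repair_cost):
    """Mean cost of one failure under the stated cost rules (hypothetical helper)."""
    if consumable:
        # Consumables are always replaced with a new unit.
        return unit_cost
    # Repairables: a BER fraction is bought new, the remainder is repaired.
    return ber_fraction * unit_cost + (1.0 - ber_fraction) * repair_cost


# Repairable engine: 10% BER, 8,000 new, 2,500 to repair.
engine = expected_cost_per_failure(False, 0.10, 8_000.0, 2_500.0)

# Consumable tyre set: 400 per replacement.
tyres = expected_cost_per_failure(True, 0.0, 400.0, 0.0)
```

For the engine this is 0.10 × 8,000 + 0.90 × 2,500 = 3,050 per failure, and 400 per failure for the tyres.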
SubsystemState
SubsystemStatus
dataclass
SubsystemStatus(spec: SubsystemSpec, state: SubsystemState = UP, time_in_state: float = 0.0, total_failures: int = 0, total_downtime: float = 0.0, cost_newbuy: float = 0.0, cost_repair: float = 0.0)
Live state of one subsystem on a specific entity.
Created automatically by ReliableEntity
for each SubsystemSpec it is initialised with.
ReliableEntity
ReliableEntity(subsystems: Sequence[SubsystemSpec], warehouse: 'Warehouse', repair_centre: 'RepairCentre | None' = None, name: str | None = None, rng: Generator | None = None)
Bases: Entity
An entity composed of subsystems that can fail and require repair.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `subsystems` | `Sequence[SubsystemSpec]` | One … | required |
| `warehouse` | `'Warehouse'` | Parts warehouse. When a subsystem fails, one unit of its SKU is consumed from here. When a repairable unit is returned to service, one unit is added back. | required |
| `repair_centre` | `'RepairCentre \| None'` | Optional … | `None` |
| `name` | `str \| None` | Display name. | `None` |
| `rng` | `Generator \| None` | NumPy random generator. Defaults to … | `None` |

Attributes:

| Name | Type | Description |
|---|---|---|
| `subsystems` | `list[SubsystemStatus]` | Live state of each fitted subsystem. |
| `operational_cycles` | `float` | Cumulative operating cycles. Increment this in your scenario script (e.g. each km driven, each sortie flown) to activate cycle-based failure rates. |
| `total_operational_time` | `float` | Simulation time spent fully operational. |
| `total_downtime` | `float` | Simulation time spent with at least one subsystem not UP. |
| `cost_newbuy` | `float` | Cumulative spend on new part purchases. |
| `cost_repair` | `float` | Cumulative spend on repairs. |
RepairJob
RepairJob(owner: 'ReliableEntity', subsystem_idx: int, is_new_buy: bool, return_to_stock: bool, repair_time: float, cost: float, name: str | None = None)
Bases: Entity
A work item representing a repair or new-unit-fit operation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `owner` | `'ReliableEntity'` | The … | required |
| `subsystem_idx` | `int` | Position of the failed subsystem in … | required |
| `is_new_buy` | `bool` | … | required |
| `return_to_stock` | `bool` | … | required |
| `repair_time` | `float` | How long the job takes at the repair centre (simulation time units). Stored in … | required |
| `cost` | `float` | Financial cost charged to … | required |
RepairCentre
RepairCentre(capacity: int = 1, buffer_size: int = 100, resources=None, rng=None, name: str | None = None)
Bases: Service
A repair facility; a Service whose
completions restore failed subsystems on
ReliableEntity instances.
The centre accepts RepairJob items in its queue. On completion:
- If job.return_to_stock is set, the repaired part is returned to the owning entity's warehouse (incrementing stock by one unit).
- The owning entity's subsystem is transitioned back to UP.
- Cost and counter metrics on this centre are updated.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `capacity` | `int` | Number of parallel work channels (repair bays / technicians when no explicit resource pool is used). | `1` |
| `buffer_size` | `int` | Maximum number of jobs that can wait in the pre-repair queue. | `100` |
| `resources` | | Optional … | `None` |
| `rng` | | Random number generator forwarded to the parent … | `None` |
| `name` | `str \| None` | Display name. | `None` |
Fleet
Fleet(entities: Sequence[ReliableEntity], name: str = 'fleet')
A collection of ReliableEntity
instances with aggregate operational metrics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `entities` | `Sequence[ReliableEntity]` | The vehicles / platforms that make up the fleet. | required |
| `name` | `str` | Display name used in plot titles. | `'fleet'` |
operational_count
property
Number of entities that are fully operational right now.
operational_availability
property
Instantaneous operational availability (0–1).
mean_availability
property
Mean of each entity's time-based empirical availability.
status_counts
Classify every entity into one of three broad states.
Returns:
| Type | Description |
|---|---|
| `dict` | Keys `"operational"`, `"in_repair"`, `"awaiting_part"`. An entity is awaiting_part if any subsystem is in that state. An entity is in_repair if it has at least one subsystem IN_REPAIR and none AWAITING_PART. |
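The classification rule gives AWAITING_PART precedence over IN_REPAIR. A minimal sketch of that rule, using plain strings as stand-ins for subsystem states (the `classify` helper and the sample fleet are hypothetical):

```python
from collections import Counter


def classify(subsystem_states):
    """Map one entity's subsystem states to a broad fleet state (hypothetical helper)."""
    if "AWAITING_PART" in subsystem_states:
        return "awaiting_part"  # any subsystem waiting for stock dominates
    if "IN_REPAIR" in subsystem_states:
        return "in_repair"      # at least one in repair, none awaiting a part
    return "operational"        # every subsystem is UP


fleet_states = [
    ["UP", "UP"],
    ["IN_REPAIR", "UP"],
    ["IN_REPAIR", "AWAITING_PART"],
    ["UP", "AWAITING_PART"],
]
counts = Counter(classify(states) for states in fleet_states)
```

Here one entity is operational, one is in repair, and two are awaiting parts, including the entity that has one subsystem in each down state.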
FleetAvailabilityRecorder
FleetAvailabilityRecorder(fleet: Fleet)
Records fleet state at each simulation tick.
Register with the environment after all
ReliableEntity instances so the
snapshot captures the state after each tick's failures and repairs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `fleet` | `Fleet` | The … | required |

Attributes:

| Name | Type | Description |
|---|---|---|
| `times` | `list[float]` | Simulation clock value at each snapshot. |
| `operational` | `list[int]` | Count of operational entities at each snapshot. |
| `in_repair` | `list[int]` | Count of entities in repair (part available) at each snapshot. |
| `awaiting_part` | `list[int]` | Count of entities waiting for parts at each snapshot. |
mean_operational_availability
property
Time-averaged fraction of the fleet that was operational.
SweepResult
dataclass
SweepResult(param1_name: str, param1_values: ndarray, param2_name: str | None, param2_values: ndarray | None, metric_name: str, metric_mean: ndarray, metric_std: ndarray, n_runs: int = 1)
Result of a 1-D or 2-D sensitivity sweep.
Attributes:
| Name | Type | Description |
|---|---|---|
| `param1_name` | `str` | Name of the first swept parameter. |
| `param1_values` | `ndarray` | Array of values swept for parameter 1. |
| `param2_name` | `str \| None` | Name of the second parameter, or … |
| `param2_values` | `ndarray \| None` | Array of values swept for parameter 2, or … |
| `metric_name` | `str` | Label for the output metric (used in plot axis titles). |
| `metric_mean` | `ndarray` | Mean metric value. Shape … |
| `metric_std` | `ndarray` | Standard deviation across MC replicates. All zeros when … |
| `n_runs` | `int` | Number of Monte Carlo replicates per grid point. |
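When comparing grid cells, the per-cell standard deviation and replicate count can be turned into a rough confidence interval on the mean. The sketch below uses the normal approximation with scalar values for a single cell. The numbers are invented, `ci_half_width` is a hypothetical helper, and whether metric_std is a population or sample standard deviation is not stated here, so treat the interval as approximate.

```python
import math


def ci_half_width(std, n_runs, z=1.96):
    """Approximate 95% confidence half-width for a cell mean (hypothetical helper)."""
    # Standard error of the mean shrinks with the square root of the replicate count.
    return z * std / math.sqrt(n_runs)


# Invented values for one grid cell.
cell_mean, cell_std, n_runs = 0.91, 0.03, 30

half = ci_half_width(cell_std, n_runs)
interval = (cell_mean - half, cell_mean + half)
```

With 30 replicates and a standard deviation of 0.03, the half-width is about 0.011, so two cells whose means differ by less than that are hard to tell apart.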
sensitivity_sweep
sensitivity_sweep(scenario_builder: Callable[..., float], param1_name: str, param1_values: Sequence[float], param2_name: str | None = None, param2_values: Sequence[float] | None = None, metric_name: str = 'metric', n_runs: int = 1, seed: int = 0, executor: str = 'serial', n_workers: int | None = None) -> SweepResult
Run a 1-D or 2-D parameter sensitivity sweep with optional MC averaging.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `scenario_builder` | `Callable[..., float]` | Callable with signature … | required |
| `param1_name` | `str` | Name of the first parameter (used in plot labels). | required |
| `param1_values` | `Sequence[float]` | Values to sweep for parameter 1. | required |
| `param2_name` | `str \| None` | Name of the second parameter. | `None` |
| `param2_values` | `Sequence[float] \| None` | Values to sweep for parameter 2. Required when … | `None` |
| `metric_name` | `str` | Label for the output metric. | `'metric'` |
| `n_runs` | `int` | Number of Monte Carlo replicates per grid point. Each replicate receives a unique seed derived from the base … | `1` |
| `seed` | `int` | Base random seed. Replicate r at grid point i (or (i, j)) receives seed … | `0` |
| `executor` | `str` | … | `'serial'` |
| `n_workers` | `int \| None` | Number of worker processes. | `None` |

Returns:

| Type | Description |
|---|---|
| `SweepResult` | |