Programmatic access to the OpenCEM dataset, simulator components, and their Python APIs.
Clock is an immutable, high-resolution timestamp used throughout the simulator. The Clock dataclass is the single source of truth for simulation time.
opencem.clock
Clock is a @dataclass(frozen=True) with two fields: ticks: int (the current position in time) and RES: int = 10**9 (the resolution in ticks per second; the default is nanosecond resolution). All arithmetic is integer-based for determinism.
from opencem.clock import Clock
# Create from a date string (uses pandas.Timestamp internally)
clock = Clock.from_string("2025-12-26T08:00:00")
# Create from current wall-clock time
clock = Clock.now()
# Create from epoch seconds
clock = Clock.from_seconds(1735200000.0)
# Advance time (returns a new Clock since it's frozen)
next_clock = clock.advance(step_ticks=15_000_000_000) # +15 seconds
next_clock = clock.advance_seconds(15.0) # same thing
# Compute elapsed hours between two clocks
hours = Clock.difference_hours(clock, next_clock)
# Convert to other representations
clock.to_seconds() # float epoch seconds
clock.to_numpy_datetime64() # np.datetime64
str(clock) # human-readable ISO string

| Method | Returns | Description |
|---|---|---|
| Clock.now(res=10**9) | Clock | Current wall-clock time |
| Clock.from_string(s, res) | Clock | Parse any date/time string via pd.Timestamp |
| Clock.from_seconds(s, *, res) | Clock | Create from epoch seconds |
| .advance(step_ticks) | Clock | Return a new Clock that is step_ticks ahead |
| .advance_seconds(seconds) | Clock | Convenience: advance by float seconds |
| .align(ticks_per_step) | Clock | Snap to the nearest multiple of ticks_per_step |
| Clock.difference_hours(a, b) | float | Hours elapsed from a to b |
| .to_seconds() | float | Convert to epoch seconds |
| .to_numpy_datetime64() | np.datetime64 | Convert to numpy datetime |
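To make the integer-tick design concrete, here is a minimal, self-contained sketch of a frozen dataclass that mirrors a few of the methods listed above. It is an illustration only, not the OpenCEM implementation: the class name SketchClock, the tie-breaking rule in align(), and the assumption that both clocks share one resolution in difference_hours() are all choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchClock:
    """Illustrative stand-in for an integer-tick clock (not the OpenCEM class)."""

    ticks: int        # position in time, counted in ticks
    RES: int = 10**9  # ticks per second (nanosecond resolution by default)

    def advance(self, step_ticks: int) -> "SketchClock":
        # Frozen dataclass: return a new instance instead of mutating.
        return SketchClock(self.ticks + step_ticks, self.RES)

    def advance_seconds(self, seconds: float) -> "SketchClock":
        # Convert to integer ticks once, then stay in integer arithmetic.
        return self.advance(round(seconds * self.RES))

    def align(self, ticks_per_step: int) -> "SketchClock":
        # Snap to the nearest multiple of ticks_per_step (ties round up).
        steps = (self.ticks + ticks_per_step // 2) // ticks_per_step
        return SketchClock(steps * ticks_per_step, self.RES)

    @staticmethod
    def difference_hours(a: "SketchClock", b: "SketchClock") -> float:
        # Hours elapsed from a to b; assumes both share one resolution.
        return (b.ticks - a.ticks) / (a.RES * 3600)


STEP = 15 * 10**9  # 15 seconds in nanosecond ticks
t0 = SketchClock(ticks=0)
t1 = t0.advance(STEP)
```

Because every step is an integer addition, repeating the same sequence of advance() calls always lands on exactly the same tick, which is the determinism property the description refers to.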
The OpenCEM Simulator uses a modular, object-oriented architecture where each physical component is represented by an abstract base class (ABC). All components share a common step() interface.
SystemComponent is the abstract base class that establishes common conventions for the simulator loop. All components implement a step() method that advances by step_ticks (integer ticks in Clock resolution). It also provides an optional context() hook and id/specification properties.
from abc import ABC
from typing import Any

class SystemComponent(ABC):
    def step(self, step_ticks: int, *args, **kwargs) -> Any:
        """Advance the component by step_ticks (in Clock ticks).

        Returns a component-specific result dataclass."""
        ...

    def context(self, *args, **kwargs) -> None:
        """Optional hook for injecting context data."""
        return None

PowerSource models any independent DC power source, such as a solar panel array. Its step() returns output voltage (V), current (A), and power (W).
# PowerSource.step returns a PowerSourceStepResult
result = pv_array.step(step_ticks=1)
print(f"Voltage: {result.voltage_v}V")
print(f"Current: {result.current_a}A")
print(f"Power: {result.power_w}W")

Battery models a DC battery connected to and controlled by the inverter. Its step() takes an optional BatteryStepInput specifying mode (BatteryStepMode.CHARGE / DISCHARGE / IDLE) and current_a.
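As a rough illustration of how a mode and a current could translate into a state-of-charge change over one step, the sketch below uses a simple energy-balance model. Everything in it is an assumption made for this example (the names ToyMode and toy_soc_step, the constant terminal voltage, and the way efficiencies are applied); it does not describe how any OpenCEM battery class computes its result.

```python
from enum import Enum


class ToyMode(Enum):
    CHARGE = "charge"
    DISCHARGE = "discharge"
    IDLE = "idle"


def toy_soc_step(soc, mode, current_a, step_ticks, *, capacity_j, voltage_v=51.2,
                 charge_eff=0.95, discharge_eff=0.93, res=10**9):
    """Return the new state of charge after one step (hypothetical model)."""
    dt_s = step_ticks / res                  # step length in seconds
    energy_j = voltage_v * current_a * dt_s  # energy at the terminals
    if mode is ToyMode.CHARGE:
        delta = energy_j * charge_eff / capacity_j      # losses reduce stored energy
    elif mode is ToyMode.DISCHARGE:
        delta = -energy_j / discharge_eff / capacity_j  # losses increase energy drawn
    else:
        delta = 0.0
    return min(1.0, max(0.0, soc + delta))   # clamp to [0, 1]


CAPACITY_J = 51.2 * 200 * 3600  # 51.2 V * 200 Ah, in joules
soc_after = toy_soc_step(0.5, ToyMode.CHARGE, 15.0, 15 * 10**9, capacity_j=CAPACITY_J)
```

Charging at 15 A for 15 seconds moves this toy battery by only about 0.03 percentage points, which gives a feel for why SOC changes slowly at 15-second steps.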
from opencem.interfaces import BatteryStepInput, BatteryStepMode

result = battery.step(
    step_ticks=15_000_000_000,  # 15 seconds at nanosecond resolution
    battery_input=BatteryStepInput(
        mode=BatteryStepMode.CHARGE,
        current_a=15.0,
    ),
)
print(f"SOC: {result.soc:.2%}")  # soc is a fraction in [0, 1]
print(f"Discharge energy: {result.discharge_energy_j} J")

Grid models a one-way AC grid connection: it can import power but not export it. Its step() takes the requested apparent power (VA) and active power (W).
result = grid.step(
    step_ticks=1,
    requested_va=500,
    requested_w=480,
)
print(f"Delivered: {result.power_delivered_active_w}W")

Inverter is the central component connecting PowerSource, Battery, Grid, and Load. It implements power prioritization: PV first, then Battery, then Grid as fallback.
# Inverter orchestrates all components
result = inverter.step(
    step_ticks=1,
    pv_voltage=pv_result.voltage_v,
    context=ctx_result,
)
# Returns: BatteryStepInput, GridStepInput, delivered power

Load models an AC load supplied at 230 V / 50 Hz. Its step() returns requested active power (W) and apparent power (VA).
Context is a unique component: it processes time-stamped natural-language context records (user announcements, system logs, event data) and injects them into the simulation loop.
# Context returns future ContextRecords
records = context.step(step_ticks=1)
for r in records:
    print(f"[{r.recorded}] {r.value}")
# e.g., "Tomorrow I will run a CPU-intensive,
# multi-core numeric robustness test for a day"

Simulator is instantiated with PowerSource, Battery, Load, Grid, and Inverter (plus optional Clock and Context). It calls each component's step() method in order and returns a SimulatorStepResult containing all individual results plus per-step and cumulative aggregate statistics (energy generated, energy consumed, maximum values, etc.).
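The cumulative statistics mentioned above boil down to integrating power over tick-based time steps. The sketch below shows one way such running totals could be kept. The class RunningTotals, its field names, and the sample readings are hypothetical; they are not the SimulatorStepResult schema.

```python
from dataclasses import dataclass


@dataclass
class RunningTotals:
    """Hypothetical cumulative statistics; not the SimulatorStepResult schema."""

    energy_generated_j: float = 0.0
    energy_consumed_j: float = 0.0
    max_load_w: float = 0.0

    def update(self, pv_power_w, load_power_w, step_ticks, res=10**9):
        dt_s = step_ticks / res                       # ticks -> seconds
        self.energy_generated_j += pv_power_w * dt_s  # W * s = J
        self.energy_consumed_j += load_power_w * dt_s
        self.max_load_w = max(self.max_load_w, load_power_w)


STEP = 15 * 10**9  # 15 seconds in nanosecond ticks
totals = RunningTotals()
# Made-up (pv_power_w, load_power_w) readings for three consecutive steps.
for pv_w, load_w in [(400.0, 250.0), (420.0, 300.0), (380.0, 275.0)]:
    totals.update(pv_w, load_w, STEP)

print(f"generated={totals.energy_generated_j / 3600:.2f} Wh, "
      f"peak load={totals.max_load_w:.0f} W")
```

Dividing joules by 3600 gives watt-hours, which is usually the more readable unit when summarizing a full day of 15-second steps.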
from opencem.simulator import Simulator
from opencem.dataset import (
    BatteryDataset,
    ContextDataset,
    GridDataset,
    InverterDataset,
    LoadDataset,
    PowerSourceDataset,
)
from opencem.clock import Clock
import sqlite3
# Connect to the SQLite dataset
conn = sqlite3.connect("opencem_dataset.db")
clock = Clock.from_string("2025-12-26")
STEP = 15_000_000_000 # 15 seconds in nanosecond ticks
# Create dataset models (replay real data)
pv = PowerSourceDataset(clock, inverter_id=1, database=conn)
batt = BatteryDataset(clock, inverter_id=1, database=conn)
load = LoadDataset(clock, inverter_id=1, database=conn)
grid = GridDataset(clock, inverter_id=1, database=conn)
inv = InverterDataset(clock, inverter_id=1, database=conn)
ctx = ContextDataset(clock, horizon_ticks=STEP*240, database=conn)
# Build and run simulator (24 hours at 15-second steps)
sim = Simulator(pv, batt, load, grid, inv, clock, context=ctx)
for _ in range(5760):
    result = sim.step(step_ticks=STEP)
    print(f"SOC={result.battery.soc:.1%} PV={result.power_source.power_w:.0f}W")

The dataset is provided as a SQLite database. The opencem.dataset package wraps it with convenient Python classes.
import sqlite3
import pandas as pd
conn = sqlite3.connect("opencem_dataset.db")
# Query electrical measurements
df = pd.read_sql_query("""
    SELECT read_ts, inverter, battsoc, pv1power, outsumw, gridpowerw_a
    FROM analog_measurements
    WHERE inverter = 1
    ORDER BY read_ts DESC
    LIMIT 1000
""", conn)

# Query context records
ctx = pd.read_sql_query("""
    SELECT recorded, start, end, value
    FROM context
    ORDER BY recorded DESC
""", conn)
print(ctx.head())

Each physical component has a corresponding dataset class that provides interpolated readings from the real data.
| Class | Interface | Constructor Args | Step Returns |
|---|---|---|---|
| BatteryDataset | Battery | clock, inverter_id, database | BatteryStepResult (voltage_v, current_a, soc, discharge_energy_j) |
| PowerSourceDataset | PowerSource | clock, inverter_id, database | PowerSourceStepResult (voltage_v, current_a, power_w) |
| GridDataset | Grid | clock, inverter_id, database | GridStepResult (power_delivered_apparent_va, power_delivered_active_w) |
| LoadDataset | Load | clock, inverter_id, database | LoadStepResult (current_a, voltage_v, power_apparent_va, power_active_w) |
| InverterDataset | Inverter | clock, inverter_id, database | InverterStepResult (next_battery_input, next_grid_input, generator_power_drawn_w) |
| ContextDataset | Context | clock, horizon_ticks, database | List[ContextRecord] (recorded_at, start, end, payload) |
All dataset classes interpolate between adjacent database rows using linear interpolation (interpolate_value()). Additionally, BlockSampledDataset and BlockSampledContext wrappers provide random block-resampling for Monte Carlo experiments.
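The two ideas in the paragraph above, linear interpolation between adjacent rows and random block resampling, can be sketched in a few lines. The helpers lerp_reading and block_resample are hypothetical names written for this example; they are not the library's interpolate_value() or its BlockSampled wrappers, whose signatures are not shown here.

```python
import random


def lerp_reading(t, t0, v0, t1, v1):
    """Linearly interpolate a reading at tick t between rows (t0, v0) and (t1, v1)."""
    if t1 == t0:
        return v0
    frac = (t - t0) / (t1 - t0)
    return v0 + frac * (v1 - v0)


def block_resample(values, block_len, n_blocks, rng):
    """Concatenate n_blocks randomly chosen contiguous blocks of length block_len."""
    starts = [rng.randrange(0, len(values) - block_len + 1) for _ in range(n_blocks)]
    out = []
    for s in starts:
        out.extend(values[s:s + block_len])
    return out


# Two made-up rows 60 s apart (ticks in nanoseconds), queried a quarter of the way in.
soc_at_15s = lerp_reading(15 * 10**9, 0, 0.80, 60 * 10**9, 0.76)

series = list(range(10))
sample = block_resample(series, block_len=3, n_blocks=4, rng=random.Random(0))
```

Resampling whole blocks rather than single rows keeps short-range structure (for example a load ramp) intact inside each block, which is the usual reason to prefer it for time series.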
Beyond dataset replay, OpenCEM provides parametric models for testing alternative control strategies.
from opencem.linear import BatteryLinear
from opencem.clock import Clock
clock = Clock.from_string("2025-12-26")
battery = BatteryLinear(
    clock=clock,
    initial_soc=0.8,
    capacity_j=51.2 * 200 * 3600,  # 200Ah * 51.2V in Joules
    nominal_voltage_v=51.2,
    charge_efficiency=0.95,
    discharge_efficiency=0.93,
)

Simple inverter model matching the default inverter configuration. Prioritizes PV generation, then battery reserves, then grid import as a last resort.
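The priority order described above (PV, then battery, then grid) can be sketched as a simple waterfall over the load demand. The function dispatch and its parameters are hypothetical and deliberately simplified: the sketch ignores conversion losses and does not route surplus PV into the battery, so it should not be read as the actual inverter logic.

```python
def dispatch(load_w, pv_w, battery_available_w, grid_limit_w):
    """Split a load across PV, battery, and grid in that priority order."""
    from_pv = min(load_w, pv_w)
    remaining = load_w - from_pv
    from_battery = min(remaining, battery_available_w)
    remaining -= from_battery
    from_grid = min(remaining, grid_limit_w)
    unmet = remaining - from_grid  # demand that no source could cover
    return {"pv_w": from_pv, "battery_w": from_battery,
            "grid_w": from_grid, "unmet_w": unmet}


plan = dispatch(load_w=900.0, pv_w=400.0, battery_available_w=300.0, grid_limit_w=1000.0)
```

With these made-up numbers the grid only covers the 200 W that PV and battery together cannot supply.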
Extends Grid with a price schedule for cost optimization. Each step returns a cost field and a violation field, enabling RL training or MPC-based optimization.
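To show what a per-step cost and violation could look like, the sketch below prices imported energy with an hourly tariff and treats the violation as power drawn above a limit. The function names, the tariff values, and that interpretation of "violation" are all assumptions made for this example; the source does not define how either field is computed.

```python
def import_cost(energy_wh, hour_of_day, price_per_kwh_by_hour):
    """Cost of importing energy_wh during the given hour under an hourly tariff."""
    return energy_wh / 1000.0 * price_per_kwh_by_hour[hour_of_day]


def step_cost(power_w, step_ticks, hour_of_day, tariff, limit_w, res=10**9):
    """Return (cost, violation) for one step; violation is power above limit_w."""
    energy_wh = power_w * (step_ticks / res) / 3600.0  # W * s -> Wh
    cost = import_cost(energy_wh, hour_of_day, tariff)
    violation = max(0.0, power_w - limit_w)
    return cost, violation


# Made-up two-tier tariff: 0.10 per kWh off-peak, 0.30 per kWh from 17:00 to 20:59.
tariff = [0.30 if 17 <= h <= 20 else 0.10 for h in range(24)]
cost, violation = step_cost(2400.0, 15 * 10**9, hour_of_day=18, tariff=tariff, limit_w=2000.0)
```

An RL agent could use the negative cost as a reward and the violation as a penalty term, while an MPC controller could sum both over its planning horizon.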