What is the success event in this scenario?

The success event is compute_hours_used. Positive signal paths end in this event; negative paths do not. The entity type is account.

How much data does this scenario generate?

The default scale is 320 accounts, with a mean of 5 users per account and a mean of 9 sessions per user, spread over 30 days of history. Any of these values can be overridden in the config YAML.

posthog_hybrid_seat_usage posthog_hybrid_seat_usage_mvp_v1

Hybrid (seat + usage)

Hybrid pricing that combines per-seat licensing with metered overage. The longest positive signal chains through invite, signup, seat activation, API request, and compute usage, exercising the longest legitimate path in any scenario.

Value metric

Seats plus usage overage

Success event

compute_hours_used

entity_type: account

Scale

320 accounts
5 users/account (mean)
9 sessions/user (mean)
30 days of history
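
As a rough sanity check on dataset size, the default means multiply out as sketched below. This is back-of-the-envelope arithmetic, not DryFit's generator logic. The variable names are illustrative, and real counts will differ from run to run because users per account and sessions per user are means rather than fixed values.

```python
# Back-of-the-envelope scale estimate from the default values listed above.
# Names are illustrative; they are not DryFit config keys.
ACCOUNTS = 320
MEAN_USERS_PER_ACCOUNT = 5
MEAN_SESSIONS_PER_USER = 9
DAYS_OF_HISTORY = 30

expected_users = ACCOUNTS * MEAN_USERS_PER_ACCOUNT
expected_sessions = expected_users * MEAN_SESSIONS_PER_USER
expected_sessions_per_day = expected_sessions / DAYS_OF_HISTORY

print(expected_users)             # 1600
print(expected_sessions)          # 14400
print(expected_sessions_per_day)  # 480.0
```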

Research metrics proxied

— Seat growth plus usage acceleration
— Overage frequency

Signal paths

Positive paths end in compute_hours_used. Negative paths do not. Every generated event_id belonging to a path is recorded in ground_truth.json.

Positive signals (2)

seat_growth_then_usage_growth ×70

invite_sent → user_signed_up → seat_activated → api_request → compute_hours_used

cohorts: high_intent, medium_intent, power_user

activated_usage_completion ×45

seat_activated → api_request → job_completed → compute_hours_used

cohorts: high_intent, power_user

Negative signals (2)

seats_without_usage ×60

invite_sent → user_signed_up → seat_activated

cohorts: medium_intent, low_intent, lurker

request_job_no_overage ×55

api_request → job_completed

cohorts: low_intent, lurker, noisy_bot_like
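
The four paths above can be written down as data to make the positive/negative rule concrete. This is a minimal sketch, not DryFit's internal representation: the `PATHS` structure and `is_positive` helper are hypothetical, and the ×N multipliers are read here as instance counts, which is an assumption.

```python
SUCCESS_EVENT = "compute_hours_used"

# (path name, event sequence, multiplier) copied from the lists above.
PATHS = [
    ("seat_growth_then_usage_growth",
     ["invite_sent", "user_signed_up", "seat_activated",
      "api_request", "compute_hours_used"], 70),
    ("activated_usage_completion",
     ["seat_activated", "api_request", "job_completed",
      "compute_hours_used"], 45),
    ("seats_without_usage",
     ["invite_sent", "user_signed_up", "seat_activated"], 60),
    ("request_job_no_overage",
     ["api_request", "job_completed"], 55),
]


def is_positive(events):
    """A path is positive when its final event is the success event."""
    return bool(events) and events[-1] == SUCCESS_EVENT


positive = [name for name, events, _ in PATHS if is_positive(events)]
negative = [name for name, events, _ in PATHS if not is_positive(events)]
positive_total = sum(n for _, events, n in PATHS if is_positive(events))
negative_total = sum(n for _, events, n in PATHS if not is_positive(events))

print(positive, positive_total)  # the two positive paths, 115
print(negative, negative_total)  # the two negative paths, 115
```

Note that seats_without_usage is exactly the first three events of seat_growth_then_usage_growth, and request_job_no_overage appears as a contiguous run inside activated_usage_completion. A detector therefore cannot label a path from its opening events alone; it has to observe whether the sequence reaches compute_hours_used.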

Generate this dataset

Config file: configs/posthog_hybrid_seat_usage_mvp.yaml

Quickstart

# Dockerized Postgres (recommended for inspection)
docker compose up -d

uv run dryfit \
  -c configs/posthog_hybrid_seat_usage_mvp.yaml \
  --dsn postgresql://dryfit_writer:dryfit_writer@127.0.0.1:54329/dryfit \
  --print-summary

# Or local Postgres
./scripts/generate-local -c configs/posthog_hybrid_seat_usage_mvp.yaml --print-summary

The repo's README has full setup instructions, including local Postgres, Grafana inspection, and dataset restore.

Noise parameters

DryFit injects realistic noise on top of the generated signal paths. The probabilities below are applied per event. Noise never touches rows referenced by ground_truth.json, so your scoring logic can trust that the truth file is exact.

missing event probability

5.0%

duplicate event probability

2.0%

out of order probability

3.0%

null property probability

3.0%

anonymous actor probability

1.0%

weird property probability

2.0%
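
To illustrate what per-event noise that spares ground-truth rows could look like, here is a minimal sketch. It is not DryFit's implementation: the event shape (`event_id`, `actor_id`, `properties`), the `apply_noise` function, and the `protected_ids` argument are all hypothetical. Only four of the six noise types are modeled; out-of-order and weird-property noise are left out to keep the example short.

```python
import random

# Probabilities copied from the list above.
NOISE = {
    "missing_event": 0.05,
    "duplicate_event": 0.02,
    "null_property": 0.03,
    "anonymous_actor": 0.01,
}


def apply_noise(events, protected_ids, rng):
    """Return a noisy copy of events, leaving protected events untouched."""
    noisy = []
    for event in events:
        # Events referenced by the truth file pass through unchanged.
        if event["event_id"] in protected_ids:
            noisy.append(event)
            continue
        # Missing event: drop it entirely.
        if rng.random() < NOISE["missing_event"]:
            continue
        # Copy before mutating so the input list is not modified.
        event = dict(event, properties=dict(event.get("properties", {})))
        # Null property: blank out one property value.
        if event["properties"] and rng.random() < NOISE["null_property"]:
            key = rng.choice(sorted(event["properties"]))
            event["properties"][key] = None
        # Anonymous actor: strip the actor identifier.
        if rng.random() < NOISE["anonymous_actor"]:
            event["actor_id"] = None
        noisy.append(event)
        # Duplicate event: emit a second copy of the same row.
        if rng.random() < NOISE["duplicate_event"]:
            noisy.append(dict(event))
    return noisy


events = [
    {"event_id": i, "actor_id": "u1", "properties": {"plan": "hybrid"}}
    for i in range(1000)
]
protected = set(range(0, 1000, 2))  # pretend even IDs are in the truth file
noisy = apply_noise(events, protected, random.Random(0))
```

Passing a seeded `random.Random` keeps a run reproducible, which matters when comparing detectors against the same noisy dataset.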

Benchmark your detector against Hybrid (seat + usage)

Clone the repo, run the config, check your agent's output against ground_truth.json.
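
One way to do that comparison is precision and recall over event IDs. The sketch below is an assumption-laden illustration: the layout of ground_truth.json is not described here, so the hypothetical `score` function takes two plain collections of event IDs and leaves loading the truth file to you.

```python
def score(predicted_ids, truth_ids):
    """Return (precision, recall) for predicted event IDs against truth IDs."""
    predicted, truth = set(predicted_ids), set(truth_ids)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy example: the detector flags four events, two of which are in the truth set.
precision, recall = score({"e1", "e2", "e3", "e4"}, {"e2", "e3", "e5"})
print(f"{precision:.2f} {recall:.2f}")  # 0.50 0.67
```

Scoring on sets ignores duplicates in the detector's output, which is convenient given that the dataset itself contains injected duplicate events.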
