What does the Usage-based (metered) SaaS scenario generate?

This scenario models metered SaaS, where revenue scales with consumption. Positive signals are completed jobs and compute cycles; negative signals are stalled usage. Each run produces a PostgreSQL events table plus a ground_truth.json file that lists the specific event_ids making up every positive and negative signal path.

What is the success event in this scenario?

The success event is job_completed. Positive signal paths end in this event; negative paths do not. The entity type is account.

How much data does this scenario generate?

The default scale is 320 accounts, with a mean of 4 users per account and a mean of 9 sessions per user, spread over 30 days of history. Override any value in the config YAML.

posthog_usage_based posthog_usage_based_mvp_v1

Usage-based (metered) SaaS

Metered SaaS where revenue scales with consumption. Positive signals are completed jobs and compute cycles; negative signals are stalled usage.

Value metric

API calls, compute hours, messages, requests

Success event

job_completed

entity_type: account

Scale

320 accounts
4 users/account (mean)
9 sessions/user (mean)
30 days of history
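
To get a feel for the size of a default run, the mean values above can be multiplied out. This is a rough sketch only: the figures are expectations derived from the stated means, the actual counts in a run will vary, and the number of events per session is not stated here, so it is left out. The function name is illustrative and not part of DryFit.

```python
# Default scale values taken from the scenario description above.
ACCOUNTS = 320
MEAN_USERS_PER_ACCOUNT = 4
MEAN_SESSIONS_PER_USER = 9
HISTORY_DAYS = 30


def expected_volume(accounts, users_per_account, sessions_per_user, days):
    """Return expected users, sessions, and sessions per day from mean values."""
    users = accounts * users_per_account
    sessions = users * sessions_per_user
    return {
        "users": users,
        "sessions": sessions,
        "sessions_per_day": sessions / days,
    }


volume = expected_volume(
    ACCOUNTS, MEAN_USERS_PER_ACCOUNT, MEAN_SESSIONS_PER_USER, HISTORY_DAYS
)
print(volume)
# {'users': 1280, 'sessions': 11520, 'sessions_per_day': 384.0}
```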

Research metrics proxied

— Usage velocity
— Quota consumption
— Usage acceleration

Signal paths

Positive paths end in job_completed. Negative paths do not. Every generated event_id belonging to a path is recorded in ground_truth.json.

Positive signals (2)

request_to_job_completion ×95

api_request → job_completed

cohorts: high_intent, medium_intent, power_user

message_compute_job ×45

message_sent → compute_hours_used → job_completed

cohorts: high_intent, power_user

Negative signals (2)

request_compute_stall ×70

api_request → compute_hours_used

cohorts: medium_intent, low_intent, lurker

message_only_repeat ×65

message_sent → message_sent

cohorts: low_intent, lurker, noisy_bot_like
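
As an illustration of how the four paths above could be matched against one account's event stream, here is a minimal sketch. It assumes that a path counts as present when its steps appear in order with other events allowed in between; whether DryFit treats paths as contiguous or as ordered subsequences is not stated on this page. The function names and the page_view event in the usage lines are hypothetical.

```python
from typing import Optional, Sequence, Tuple

# Path definitions copied from the signal path list above.
POSITIVE_PATHS = {
    "request_to_job_completion": ["api_request", "job_completed"],
    "message_compute_job": ["message_sent", "compute_hours_used", "job_completed"],
}
NEGATIVE_PATHS = {
    "request_compute_stall": ["api_request", "compute_hours_used"],
    "message_only_repeat": ["message_sent", "message_sent"],
}


def contains_path(events: Sequence[str], path: Sequence[str]) -> bool:
    """True if the steps of `path` occur in `events` in order (gaps allowed)."""
    remaining = iter(events)
    # `step in remaining` consumes the iterator up to the match, so each
    # later step must be found after the previous one.
    return all(step in remaining for step in path)


def label_timeline(events: Sequence[str]) -> Optional[Tuple[str, str]]:
    """Label one account's ordered event names as positive, negative, or None."""
    # Positive paths are checked first: they end in job_completed, and a
    # completed timeline can also contain the steps of a negative path.
    for name, path in POSITIVE_PATHS.items():
        if contains_path(events, path):
            return ("positive", name)
    for name, path in NEGATIVE_PATHS.items():
        if contains_path(events, path):
            return ("negative", name)
    return None


print(label_timeline(["api_request", "compute_hours_used", "job_completed"]))
# ('positive', 'request_to_job_completion')
print(label_timeline(["message_sent", "page_view", "message_sent"]))  # page_view is a made-up event
# ('negative', 'message_only_repeat')
```

Checking positive paths first matters because a timeline such as api_request, compute_hours_used, job_completed contains the steps of request_compute_stall but still ends in the success event.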

Generate this dataset

Config file: configs/posthog_usage_based_mvp.yaml

Quickstart

# Dockerized Postgres (recommended for inspection)
docker compose up -d

uv run dryfit \
  -c configs/posthog_usage_based_mvp.yaml \
  --dsn postgresql://dryfit_writer:dryfit_writer@127.0.0.1:54329/dryfit \
  --print-summary

# Or local Postgres
./scripts/generate-local -c configs/posthog_usage_based_mvp.yaml --print-summary

Full setup instructions, including local Postgres, Grafana inspection, and dataset restore, are in the repo's README.

Noise parameters

DryFit injects realistic noise on top of the generated signal paths. Each probability below applies per event. Noise never touches rows referenced by ground_truth.json, so your scoring logic can trust that the truth file is exact.

missing event probability: 5.0%
duplicate event probability: 2.0%
out of order probability: 3.0%
null property probability: 3.0%
anonymous actor probability: 1.0%
weird property probability: 1.0%
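
Because the probabilities are per event, the expected number of affected events is simply the probability times the number of events exposed to noise. The sketch below works that out for a hypothetical batch of 10,000 events. The dictionary keys are illustrative names, not necessarily the keys used in the config YAML, and the batch size is made up for the example; this page does not say how many events a run produces.

```python
# Probabilities from the list above; key names are illustrative only.
NOISE_PROBABILITIES = {
    "missing_event": 0.05,
    "duplicate_event": 0.02,
    "out_of_order": 0.03,
    "null_property": 0.03,
    "anonymous_actor": 0.01,
    "weird_property": 0.01,
}


def expected_noise_counts(n_events, probabilities):
    """Expected number of events hit by each noise type (probability * n)."""
    return {name: round(n_events * p) for name, p in probabilities.items()}


counts = expected_noise_counts(10_000, NOISE_PROBABILITIES)
for name, count in counts.items():
    print(f"{name}: {count}")
# missing_event: 500
# duplicate_event: 200
# out_of_order: 300
# null_property: 300
# anonymous_actor: 100
# weird_property: 100
```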


Benchmark your detector against Usage-based (metered) SaaS

Clone the repo, run the config, check your agent's output against ground_truth.json.
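
One simple way to run that check is to compare the set of event_ids your detector flags with the set recorded in the truth file and report precision and recall. The sketch below assumes you have already extracted both sets; the layout of ground_truth.json is not described on this page, so loading it is left out, and the ids and the function name are hypothetical.

```python
def score(predicted_ids, truth_ids):
    """Return (precision, recall) for flagged event_ids against the truth set."""
    predicted, truth = set(predicted_ids), set(truth_ids)
    true_positives = len(predicted & truth)
    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical ids for illustration; real ids come from your run.
truth = {"e1", "e2", "e3", "e4"}
flagged = {"e1", "e2", "e9"}

precision, recall = score(flagged, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
# precision=0.67 recall=0.50
```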
