What is the success event in this scenario?

The success event is purchase. Positive signal paths end in this event; negative paths do not. The entity type is account.

How much data does this scenario generate?

Default scale is 500 accounts, 4 users per account on average, 10 sessions per user, spread over 30 days. Override any value in the config YAML.

posthog_web posthog_mvp_v1

PostHog Web (baseline)

The minimum-viable PostHog scenario: a generic event stream with a two-step conversion funnel and a corresponding negative path. It is a good starting point for signal-detection benchmarking before moving to business-model-specific variants.

Value metric

Generic SaaS activation (baseline)

Success event

purchase

entity_type: account

Scale

500 accounts
4 users/account (mean)
10 sessions/user (mean)
30 days of history
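As a rough sanity check on dataset size, the default scale values multiply out as sketched below. The variable names are illustrative, not DryFit config keys, and the figures are expectations built from the stated means. Total event volume is left out because the number of events per session is not given here.

```python
# Back-of-the-envelope volume from the default scale values listed above.
accounts = 500
users_per_account = 4    # mean
sessions_per_user = 10   # mean
days = 30

expected_users = accounts * users_per_account
expected_sessions = expected_users * sessions_per_user
sessions_per_day = expected_sessions / days

print(expected_users)              # 2000
print(expected_sessions)           # 20000
print(round(sessions_per_day, 1))  # 666.7
```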

Research metrics proxied

— Baseline conversion rate
— Funnel drop-off

Signal paths

Positive paths end in purchase. Negative paths do not. Every generated event_id belonging to a path is recorded in ground_truth.json.

Positive signals (2)

billing_then_purchase ×80

page_billing → purchase

cohorts: high_intent, medium_intent, power_user

invite_then_api_key_then_purchase ×35

invite_teammate → api_key_created → purchase

cohorts: high_intent, power_user

Negative signals (2)

billing_dropout ×70

page_billing → page_pricing

cohorts: low_intent, lurker, medium_intent

pricing_only_stall ×110

page_pricing → page_pricing

cohorts: low_intent, lurker, noisy_bot_like
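The four paths above can be written down as plain data, which makes the positive/negative rule easy to check. This is only an illustrative sketch: the SignalPath class is hypothetical rather than part of DryFit, and reading each ×N figure as the number of injected path instances is an assumption.

```python
from dataclasses import dataclass

SUCCESS_EVENT = "purchase"


@dataclass(frozen=True)
class SignalPath:
    name: str
    steps: tuple
    count: int  # the "×N" figure listed for the path (assumed: instances)

    @property
    def is_positive(self) -> bool:
        # A path is positive exactly when its last step is the success event.
        return self.steps[-1] == SUCCESS_EVENT


PATHS = [
    SignalPath("billing_then_purchase", ("page_billing", "purchase"), 80),
    SignalPath(
        "invite_then_api_key_then_purchase",
        ("invite_teammate", "api_key_created", "purchase"),
        35,
    ),
    SignalPath("billing_dropout", ("page_billing", "page_pricing"), 70),
    SignalPath("pricing_only_stall", ("page_pricing", "page_pricing"), 110),
]

positive_total = sum(p.count for p in PATHS if p.is_positive)
negative_total = sum(p.count for p in PATHS if not p.is_positive)
print(positive_total, negative_total)  # 115 180
```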

Generate this dataset

Config file: configs/posthog_mvp.yaml

Quickstart

# Dockerized Postgres (recommended for inspection)
docker compose up -d

uv run dryfit \
  -c configs/posthog_mvp.yaml \
  --dsn postgresql://dryfit_writer:dryfit_writer@127.0.0.1:54329/dryfit \
  --print-summary

# Or local Postgres
./scripts/generate-local -c configs/posthog_mvp.yaml --print-summary

The repo's README has full setup instructions, including local Postgres, Grafana inspection, and dataset restore.

Noise parameters

DryFit injects realistic noise on top of the generated signal paths. Each probability below applies per event. Noise never touches rows referenced by ground_truth.json, so your scoring logic can treat the truth file as exact.

missing event probability: 6.0%
duplicate event probability: 2.0%
out of order probability: 3.0%
null property probability: 5.0%
anonymous actor probability: 2.0%
weird property probability: 1.0%
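To make "per-event" concrete, here is a minimal sketch of how two of these probabilities could be applied while leaving ground-truth rows alone. It is not DryFit's implementation: the apply_noise function, the event dictionaries, and the protected_ids argument are all hypothetical, and only the missing and duplicate cases are modelled.

```python
import random


def apply_noise(events, protected_ids, rng, p_missing=0.06, p_duplicate=0.02):
    """Return a noisy copy of events; protected events pass through untouched."""
    noisy = []
    for event in events:
        if event["event_id"] in protected_ids:
            # Rows referenced by the truth file are never dropped or duplicated.
            noisy.append(event)
            continue
        if rng.random() < p_missing:
            continue  # simulate a missing event
        noisy.append(event)
        if rng.random() < p_duplicate:
            noisy.append(dict(event))  # simulate a duplicate delivery
    return noisy


# Hypothetical input: 1000 events, the first 100 of which belong to signal paths.
events = [{"event_id": i, "event": "page_pricing"} for i in range(1000)]
protected = set(range(100))
noisy = apply_noise(events, protected, random.Random(0))
```

Each draw is independent for each event, so on a large stream roughly 6% of unprotected events disappear and about 2% of the survivors appear twice.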


Benchmark your detector against PostHog Web (baseline)

Clone the repo, run the config, check your agent's output against ground_truth.json.
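One simple way to do that check is to compare the set of event_ids your detector flags with the set of event_ids recorded in the truth file. The sketch below assumes you have already extracted both as collections of IDs. The layout of ground_truth.json is not described here, so loading it is left out, and the score function and sample IDs are hypothetical.

```python
def score(predicted_ids, truth_ids):
    """Set-based precision, recall, and F1 over event_ids."""
    predicted, truth = set(predicted_ids), set(truth_ids)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up IDs for illustration: 3 of 4 predictions are correct, 3 of 5 truths found.
result = score({"e1", "e2", "e3", "e9"}, {"e1", "e2", "e3", "e4", "e5"})
print(result["precision"], result["recall"])  # 0.75 0.6
```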

