Forecasts & Payday

Overview

The prod-insight-api Lambda exposes two related capabilities via its REST API:

  • Forecasts (GET /{user_id}/forecasts) — a combined view of a user’s recurring income, recurring expenses, and ritual expenses, assembled either from Pave-cached data or from real-time FloatMe-generated analysis depending on GrowthBook flags.

  • Payday prediction (called internally during float creation and by GET /{user_id}/insights/employment/payday) — predicts the user’s next payday date using up to three algorithms run in parallel and compared before returning a result.

Both capabilities depend on the same upstream data sources: Pave-cached insight entities in DynamoDB, employment records in RDS, and recent transactions from the Transactions Service.


Forecasts (GET /{user_id}/forecasts)

The forecasts endpoint assembles a Forecasts response from income sources, recurring expenses, and ritual expenses. Two data paths exist and are controlled by GrowthBook flags:

| Path | Flag(s) | Behaviour |
|---|---|---|
| Pave path (default) | insight.pave.decom.enabled = false | Reads recurring, ritual, and income entities from prod-pave DynamoDB (written by the miner). Falls back to the FloatMe path if Pave data is unavailable. |
| FloatMe path (FM-generated) | insight.pave.decom.enabled = true OR insights.fm_forcasts.api.send_fm_generated = true for the user | Fetches the user’s last 90 days of transactions from the Transactions Service and runs the FloatMe recurring detection algorithm directly. Does not use Pave-cached data. |

When insights.fm_forcasts.api.send_fm_generated = true for the user and insight.pave.decom.enabled = false, both forecasts are computed: the FloatMe result is saved to DynamoDB as a CASHFLOW_ANALYTICS data-capture record and the FloatMe response is returned to the caller. When neither flag is set, only the Pave forecast is computed and returned.
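The flag combinations above can be summarised as a small decision function. This is a minimal sketch, not the service's code: ForecastPlan and plan_forecast are hypothetical names, the fallback to the FloatMe path when Pave data is unavailable is not modelled, and treating the decom case as "no data-capture write" is an assumption (the page only describes the write when both paths run).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForecastPlan:
    compute_pave: bool     # read Pave-cached entities from DynamoDB
    compute_floatme: bool  # run FloatMe recurring detection on transactions
    capture_floatme: bool  # write a CASHFLOW_ANALYTICS data-capture record
    returned: str          # which forecast the caller receives


def plan_forecast(decom_enabled: bool, send_fm_generated: bool) -> ForecastPlan:
    """Map the two GrowthBook flag values (already evaluated for the user) to a plan."""
    if decom_enabled:
        # Pave path disabled entirely; capture behaviour here is an assumption.
        return ForecastPlan(False, True, False, "floatme")
    if send_fm_generated:
        # Both forecasts are computed; the FloatMe one is captured and returned.
        return ForecastPlan(True, True, True, "floatme")
    # Neither flag set: Pave only.
    return ForecastPlan(True, False, False, "pave")
```

Keeping the decision in one pure function makes each flag combination easy to unit-test without touching DynamoDB or the Transactions Service.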

FloatMe Recurring Detection

The FloatMe forecasts algorithm analyses raw transaction history directly, without Pave:

  1. Fetches up to 90 days of transactions from the Transactions Service.

  2. Groups transactions by account, then by transaction name.

  3. Within each group, separates credits (income) from debits (expenses) before analysis — this prevents income transactions with the same merchant name as an expense from being mixed.

  4. Applies a Jaro-Winkler similarity threshold (≥ 0.90) to group similarly-named transactions together.

  5. For expense groups: confirms recurrence by checking that transactions in the group span multiple calendar dates; builds a RecurringExpense entity if recurring.

  6. For income groups: confirms recurrence similarly and validates against Plaid category data.
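The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production algorithm: the Txn type, the sign convention (positive amount = credit), and the greedy clustering against the first name seen are all hypothetical, the account/name/direction grouping is collapsed into one pass, and the Plaid category validation for income is omitted. The Jaro-Winkler function is a standard textbook implementation with a prefix scale of 0.1.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date

SIMILARITY_THRESHOLD = 0.90  # threshold quoted in step 4


def jaro(s1: str, s2: str) -> float:
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1, matched2 = [False] * len1, [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = transpositions = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                transpositions += 1
            k += 1
    transpositions //= 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):  # common prefix, capped at 4 characters
        if a != b:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)


@dataclass(frozen=True)
class Txn:  # hypothetical shape of a transaction record
    account_id: str
    name: str
    amount: float  # assumption: positive = credit, negative = debit
    posted: date


def recurring_groups(txns):
    """Return {(account_id, direction, name): [txns]} for groups spanning >1 date."""
    buckets = defaultdict(list)
    for t in txns:
        direction = "credit" if t.amount > 0 else "debit"
        buckets[(t.account_id, direction)].append(t)
    result = {}
    for (account_id, direction), items in buckets.items():
        clusters = []  # (representative name, members)
        for t in items:
            for rep, members in clusters:
                if jaro_winkler(rep, t.name) >= SIMILARITY_THRESHOLD:
                    members.append(t)
                    break
            else:
                clusters.append((t.name, [t]))
        for rep, members in clusters:
            if len({m.posted for m in members}) > 1:  # spans multiple dates
                result[(account_id, direction, rep)] = members
    return result
```

Splitting by direction before clustering is what keeps a payroll credit and a same-named debit from landing in one group, which is the concern step 3 describes.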

Data-Capture Writes

When the FloatMe path runs alongside the Pave path, the result is written to the fmdatacapture DynamoDB table with:

| Field | Value |
|---|---|
| Event name | CASHFLOW_ANALYTICS |
| Sort key | CASHFLOW_ANALYTICS#{today’s date (UTC)} |
| TTL | 2 days (172,800 seconds) |

This is a shared cross-service data-capture table — the Insight Service is a producer, not the owner.
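As a rough illustration of the record shape, the sketch below builds one data-capture item. The attribute names (pk, sk, event_name, ttl, payload) and the ISO date format in the sort key are assumptions; only the event name, the sort-key prefix, and the 2-day TTL come from the table above. DynamoDB TTL attributes hold a Unix epoch time in seconds, so the TTL is stored as an absolute expiry.

```python
from datetime import datetime, timezone

TTL_SECONDS = 2 * 24 * 60 * 60  # 172,800 seconds


def cashflow_capture_item(user_id: str, payload: dict, now: datetime) -> dict:
    """Build a CASHFLOW_ANALYTICS item; `now` must be timezone-aware."""
    day = now.astimezone(timezone.utc).date().isoformat()  # assumed YYYY-MM-DD
    return {
        "pk": user_id,                              # hypothetical partition key
        "sk": f"CASHFLOW_ANALYTICS#{day}",
        "event_name": "CASHFLOW_ANALYTICS",
        "ttl": int(now.timestamp()) + TTL_SECONDS,  # absolute expiry, epoch seconds
        "payload": payload,
    }
```

Because the sort key contains only the UTC date, a second write for the same user on the same day would overwrite the first under this key scheme.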


Payday Prediction

When It Runs

Payday prediction (EvaluateNextPayday) is called in two contexts:

  1. During float creation: called by the Float Service with a non-empty loan_id. In this case, up to three prediction algorithms run (V2 only when insight.payday_prediction.v2.enabled = true for the user) and results are saved to DynamoDB and to the data-capture table.

  2. Direct API query: GET /{user_id}/insights/employment/payday calls it with an empty loan_id. In this case, results are computed but not persisted (no DynamoDB write, no data-capture).

Three Prediction Algorithms

All three algorithms run only when loan_id is non-empty and insight.payday_prediction.v2.enabled = true for the user. In every other case (loan_id empty, or the flag false or unset), only Pave and FM Legacy run.

| Algorithm | Flag key (condition) | Description |
|---|---|---|
| Pave | (always runs) | Fetches the user’s RecurringIncomeSources from Pave via the API. Tries to find a source whose name matches the user’s employment record (Jaro-Winkler ≥ 0.90); falls back to any valid income source if no match is found. Generates next payday and collection dates from the matched source’s next_date and normalized_frequency. |
| FloatMe Legacy (FM) | (always runs) | Uses the user’s RDS employment record (pay_frequency, start_date, employer_name) to calculate the next payday deterministically. Computes instant (24h buffer), standard (96h buffer), and extended payback dates. |
| FloatMe V2 | insight.payday_prediction.v2.enabled = true for user AND loan_id non-empty | Fetches up to 93 days of transactions from the Transactions Service, filters them to payroll-like candidates using four passes (non-integer amounts, recurring names, override keywords, blacklist exclusions), then runs a 5-phase DOW/DOM analysis pipeline to predict the cadence and next payday date. Does not use employment records or Pave. See FloatMe V2 Payday Analyzer for the full algorithm walkthrough. |
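The gating and parallel execution described in this section might look like the sketch below. The function names, the string keys ("pave", "fm_legacy", "fm_v2"), and the use of a thread pool are assumptions made for illustration; the predictors themselves are passed in as zero-argument callables, so nothing here reflects how the real algorithms are invoked.

```python
from concurrent.futures import ThreadPoolExecutor


def algorithms_to_run(loan_id: str, v2_enabled: bool) -> list[str]:
    """Pave and FM Legacy always run; V2 needs a loan_id and the per-user flag."""
    algos = ["pave", "fm_legacy"]
    if loan_id and v2_enabled:
        algos.append("fm_v2")
    return algos


def run_predictors(predictors: dict, loan_id: str, v2_enabled: bool):
    """Run the selected predictors concurrently, collecting results and errors."""
    names = algorithms_to_run(loan_id, v2_enabled)
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        futures = {name: pool.submit(predictors[name]) for name in names}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:  # keep the error so it can be compared later
                errors[name] = exc
    return results, errors
```

Collecting errors per algorithm, rather than failing fast, matches the page's description of comparing all outcomes before a result is returned.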

Result Selection

After all algorithms run, the returned prediction is chosen in this order:

insight.payday_prediction.v2.rollout = true for user AND v2 produced a result
  └─▶ return V2 prediction

Pave produced a non-empty payday AND no Pave error
  └─▶ return Pave prediction

(fallback)
  └─▶ return FM Legacy prediction

All available results (and any errors) are logged for offline comparison via Datadog metrics (insight.payday.comparison, insight.payday.v2.comparison).
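The selection order above reduces to a short priority chain. This is a minimal sketch: select_prediction and its parameters are hypothetical, and predictions are represented as plain strings where the real service presumably uses richer types.

```python
from typing import Optional


def select_prediction(
    v2_rollout: bool,
    v2: Optional[str],
    pave: Optional[str],
    pave_error: Optional[Exception],
    fm_legacy: Optional[str],
) -> tuple[str, Optional[str]]:
    """Return (source, payday) following the documented priority order."""
    if v2_rollout and v2:
        return "fm_v2", v2
    if pave and pave_error is None:
        return "pave", pave
    return "fm_legacy", fm_legacy  # fallback
```

For example, with the rollout flag off, a Pave result that came back alongside an error is skipped in favour of FM Legacy.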

Persistence (when loan_id is provided)

| Store | What is written |
|---|---|
| prod-pave DynamoDB (payday entity) | The winning prediction (Pave or FM Legacy) along with both predictions' instant, standard, and extended collection payday dates. Written via PaydayRepo.SavePayday. |
| fmdatacapture DynamoDB | Three entries per evaluation, listed below. |

  • PAYDAY_PREDICTION — full structured payload: all three predictions (where available), all errors, loan_id, prediction_date. TTL: 2 days.

  • FM_PAYDAY_FILTERED_TXNS#{date} — the list of payroll-candidate transactions used by V2 (one record per user per day).

  • FM_PAYDAY_PREDICTED_PAYDAY#{date} — the V2 predicted date and cadence (one record per user per day).

Collection Date Calculation

The FM Legacy predictor calculates four collection dates beyond the raw payday:

| Field | Buffer from request date |
|---|---|
| next_instant_payday | Next payday after today + 24h |
| next_standard_payday | Next payday after today + 96h |
| extended_instant_payday | Next payday after today + EXTENDED_PAYBACK_START_DAYS (configured via env var) |
| extended_standard_payday | Same as extended_instant_payday (currently identical) |
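To make the buffers concrete, the sketch below computes the four fields for a fixed-interval pay schedule. It assumes a cadence expressible as a whole number of days (weekly, biweekly) anchored on a known payday, and it treats "today" as the request timestamp; semi-monthly and monthly schedules, and however the real predictor derives the anchor from start_date, are not covered. The function names are hypothetical, and extended_start_days stands in for the EXTENDED_PAYBACK_START_DAYS env var.

```python
from datetime import datetime, timedelta


def next_payday_after(anchor: datetime, period_days: int, cutoff: datetime) -> datetime:
    """First payday on the anchor + k * period grid that is strictly after cutoff."""
    if cutoff < anchor:
        return anchor
    period = timedelta(days=period_days)
    periods_elapsed = (cutoff - anchor) // period  # whole periods up to cutoff
    return anchor + (periods_elapsed + 1) * period


def collection_dates(anchor: datetime, period_days: int, now: datetime,
                     extended_start_days: int) -> dict:
    extended = next_payday_after(
        anchor, period_days, now + timedelta(days=extended_start_days))
    return {
        "next_instant_payday": next_payday_after(
            anchor, period_days, now + timedelta(hours=24)),
        "next_standard_payday": next_payday_after(
            anchor, period_days, now + timedelta(hours=96)),
        "extended_instant_payday": extended,
        "extended_standard_payday": extended,  # currently identical
    }
```

With a biweekly schedule anchored on 5 January 2024 and a request at noon on 17 January, the 24h buffer still reaches the 19 January payday, while the 96h buffer pushes the standard date to 2 February.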

GrowthBook Flags

| Flag Key | Default | Effect |
|---|---|---|
| insight.pave.decom.enabled | false | Disables the Pave data path for forecasts entirely; always uses the FloatMe-generated path |
| insights.fm_forcasts.api.send_fm_generated | false | Per-user flag: when insight.pave.decom.enabled = false, computes both Pave and FloatMe forecasts, saves FloatMe result to CASHFLOW_ANALYTICS, and returns the FloatMe result to the caller |
| insight.payday_prediction.v2.enabled | false | Per-user flag: enables the V2 (transaction-history-based) payday algorithm during float creation |
| insight.payday_prediction.v2.rollout | false | Per-user flag: when true, returns V2 results to the caller instead of Pave/FM Legacy results |