Infrastructure
The Subscription Service is deployed entirely on AWS using Terraform. All infrastructure is defined in deploy/ and managed per-environment (test, prod). The application identifier is subscription-service. Terraform state is stored in S3 with DynamoDB locking (terraform-locking table, key terraform/subscription-service/terraform.tfstate). AWS and Datadog providers are required; the AWS provider assumes a GitHub Actions role (site-github-actions-services-role) for all operations.
Lambda Functions
| Function | Trigger | Timeout | Memory | Key IAM Permissions |
|---|---|---|---|---|
| | API Gateway ( | 900s | 512 MB | DynamoDB read/write/delete (billing-activity, billing-activity-history, locks); execute-api:Invoke (Payments, User Service, TXN Service, Underwriting); Secrets Manager (Segment, GrowthBook, AppsFlyer) |
| | SQS ( | 54s (90% of queue visibility timeout of 60s) | default | DynamoDB read/write (billing-activity, billing-activity-history, locks); SQS SendMessage/DeleteMessage/Receive (all three collection queues); execute-api:Invoke (TXN Service, Payments, User Service, Insight Service); SageMaker InvokeEndpoint; Secrets Manager (Segment, GrowthBook, Iterable, AppsFlyer) |
| | EventBridge rules ( | 810s (90% of paging queue visibility timeout of 900s) | 2048 MB | SQS SendMessage (collections-scheduled, collections-retry, collections-pause); SQS full access on paging queues; DynamoDB Query (billing-activity) |
| | Kinesis ( | 840s | default | Kinesis DescribeStream/GetRecords/GetShardIterator/ListStreams (users stream); DynamoDB read/write/BatchWriteItem (billing-activity, billing-activity-history, locks); execute-api:Invoke (User Service) |
| | Kinesis ( | 840s | default | Kinesis DescribeStream/GetRecords/GetShardIterator/ListStreams (payments stream); DynamoDB read/write/BatchWriteItem (billing-activity, billing-activity-history, locks); execute-api:Invoke (User Service, Admin API, TXN Service); Secrets Manager (Segment, Iterable, AppsFlyer) |
| | DynamoDB Stream on | 840s | default | Kinesis PutRecord/PutRecords ( |
| | SQS ( | 54s (90% of queue visibility timeout of 60s) | default | DynamoDB read/write (billing-activity, billing-activity-history, locks); SQS DeleteMessage/GetQueueAttributes/ReceiveMessage/SendMessage; execute-api:Invoke (Payments, User Service); Secrets Manager (Segment) |
| | SQS ( | 810s (90% of queue visibility timeout of 900s) | default | DynamoDB read/write (billing-activity, billing-activity-history, locks); SQS DeleteMessage/GetQueueAttributes/ReceiveMessage; execute-api:Invoke (Payments, TXN Service, User Service); Secrets Manager (Segment, GrowthBook) |
| | SQS ( | 810s (90% of queue visibility timeout of 900s) | default | DynamoDB read/write (billing-activity, billing-activity-history, locks); SQS DeleteMessage/GetQueueAttributes/ReceiveMessage; execute-api:Invoke (Payments, User Service); Secrets Manager (Segment, GrowthBook, AppsFlyer) |
| | EventBridge rule ( | 810s (90% of paging queue visibility timeout of 900s) | default | DynamoDB Query (billing-activity); SQS DeleteMessage/GetQueueAttributes/ReceiveMessage/SendMessage (scheduler and worker queues). Enabled prod-only. |
| | SQS ( | 810s (90% of queue visibility timeout of 900s) | default | execute-api:Invoke (TXN Service, Payments, User Service); SQS DeleteMessage/GetQueueAttributes/ReceiveMessage; Secrets Manager (Iterable, GrowthBook) |
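The SQS-triggered functions above use a timeout equal to 90% of the source queue's visibility timeout, presumably so that an invocation ends before its messages become visible to other consumers again. The arithmetic can be sketched as follows. The function name and the `percent` parameter are illustrative assumptions, not names taken from deploy/.

```python
def lambda_timeout_seconds(visibility_timeout_s: int, percent: int = 90) -> int:
    """Return a Lambda timeout as a whole-second share of a queue's visibility timeout.

    Hypothetical helper for illustration; the real value is set in Terraform.
    """
    if visibility_timeout_s <= 0:
        raise ValueError("visibility timeout must be positive")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")
    # Integer arithmetic avoids floating-point rounding and rounds down,
    # so the result never exceeds the requested share.
    return visibility_timeout_s * percent // 100


if __name__ == "__main__":
    for visibility in (60, 900):
        print(f"{visibility}s visibility -> {lambda_timeout_seconds(visibility)}s timeout")
```

With the two visibility timeouts used in this service (60s and 900s), this yields the 54s and 810s values listed in the table.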
DynamoDB Tables
| Table | Region | Streams | Consumed By | Notes |
|---|---|---|---|---|
| | Configured via | Yes (stream ARN used by kinesis-feeder) | api, collections-worker, collections-job, memberships, ach-handler, batch-worker, webhook-worker, webhook-balance-worker, notifier-scheduler | Primary subscription state table. All subscription records and billing activity stored here. Stream drives the kinesis-feeder Lambda which publishes to |
| | Configured via | No | api, collections-worker, memberships, ach-handler, batch-worker, webhook-worker, webhook-balance-worker | Historical billing activity records. Written alongside billing-activity for audit and history queries. |
| | Configured via | No | api, collections-worker, memberships, ach-handler, batch-worker, webhook-worker, webhook-balance-worker | Distributed locking table (cirello.io/dynamolock pattern). All collection Lambdas acquire locks before processing to serialise concurrent attempts. Lock operations require GetItem, PutItem, UpdateItem, Query, and DeleteItem (for release). |
All three tables are in the legacy DynamoDB region (configured via dynamo_legacy_region). They pre-date per-environment namespacing and do not carry an environment prefix. A dedicated aws.dynamodb provider alias is used to deploy resources and read data sources in that region.
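The lease-based locking that the locks table supports can be illustrated with a small in-memory model. This is a sketch only: it is not the cirello.io/dynamolock API, it does not talk to DynamoDB, and the class, method, and key names are hypothetical. Python is used purely for illustration. The dictionary stands in for the table, and the checks inside `acquire` and `release` stand in for conditional PutItem and DeleteItem calls.

```python
import time
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class LockItem:
    owner: str         # unique token identifying the current holder
    expires_at: float  # clock value after which the lease may be taken over


class InMemoryLockTable:
    """Illustrative stand-in for the locks table (NOT the dynamolock API)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._items: Dict[str, LockItem] = {}
        self._clock = clock

    def acquire(self, key: str, lease_s: float) -> Optional[str]:
        """Return an owner token if the lock was taken, otherwise None."""
        now = self._clock()
        current = self._items.get(key)
        # Models a conditional put: succeed only if no unexpired lease exists.
        if current is not None and current.expires_at > now:
            return None
        owner = uuid.uuid4().hex
        self._items[key] = LockItem(owner=owner, expires_at=now + lease_s)
        return owner

    def release(self, key: str, owner: str) -> bool:
        """Delete the lock only if the caller still owns it."""
        current = self._items.get(key)
        if current is None or current.owner != owner:
            return False
        del self._items[key]
        return True


if __name__ == "__main__":
    table = InMemoryLockTable()
    token = table.acquire("subscription#example", lease_s=30)
    print("second attempt blocked:", table.acquire("subscription#example", lease_s=30) is None)
    print("released:", table.release("subscription#example", token))
```

The owner check on release is what prevents a worker whose lease has expired from deleting a lock that another worker has since acquired.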
SQS Queues
| Queue | Visibility Timeout | Max Receive Count | DLQ | Purpose |
|---|---|---|---|---|
| | 60s | 5 | | Receives scheduled subscription collection jobs from collections-job. Consumed by collections-worker (batch size 10, ReportBatchItemFailures). |
| | 60s | 5 | | Receives retry collection jobs from collections-job. Consumed by collections-worker (batch size 10, ReportBatchItemFailures). |
| | 60s | 5 | | Receives pause-state collection jobs from collections-job. Consumed by collections-worker (batch size 10, ReportBatchItemFailures). |
| | 60s | None configured | None | Receives manual batch collection requests. Consumed by batch-worker (batch size 10). |
| | 900s | 1 | | Receives income detection events routed from EventBridge ( |
| | 900s | 1 | | Receives balance update events routed from EventBridge ( |
| | 900s | 5 | | Paging queue for notifier-scheduler. Used by notifier-scheduler to paginate through subscriptions requiring pre-subscription notifications (batch size 1). |
| | 900s | 5 | | Receives individual user notification jobs from notifier-scheduler. Consumed by notifier-worker (batch size 10, max concurrency 5, ReportBatchItemFailures). |
| | 900s | 5 | | Paging queue for collections-job scheduled run. Enables paginated DynamoDB scans for subscriptions due for scheduled collection (batch size 1). |
| | 900s | 5 | | Paging queue for collections-job retry run. Enables paginated DynamoDB scans for subscriptions eligible for retry collection (batch size 1). |
| | 900s | 5 | | Paging queue for collections-job pause run. Enables paginated DynamoDB scans for subscriptions in paused state eligible for collection (batch size 1). |
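Several consumers above are configured with ReportBatchItemFailures. With that setting, a handler returns the IDs of only the messages that failed, so the successfully processed messages in a batch are deleted and only the failures are retried and counted towards the max receive count. A minimal sketch of that response shape follows. The `process_batch` and `handle` names are hypothetical, and Python is used for illustration only; the service's actual handlers are not shown in this page.

```python
import json
from typing import Any, Callable, Dict, List


def process_batch(
    event: Dict[str, Any],
    handle: Callable[[Dict[str, Any]], None],
) -> Dict[str, List[Dict[str, str]]]:
    """Process an SQS batch and report per-message failures to Lambda.

    `handle` is a hypothetical callback that raises on failure.
    """
    failures: List[Dict[str, str]] = []
    for record in event.get("Records", []):
        try:
            handle(json.loads(record["body"]))
        except Exception:
            # Only this message is returned to the queue for retry.
            failures.append({"itemIdentifier": record["messageId"]})
    # An empty list tells Lambda that the whole batch succeeded.
    return {"batchItemFailures": failures}
```

Without this setting, a single failing message would cause the entire batch of up to 10 messages to be redelivered.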
Kinesis Streams
| Stream | Direction | Producers | Consumers | Purpose |
|---|---|---|---|---|
| | Inbound (external) | User Service | | Membership lifecycle events from the User Service. The memberships Lambda filters for 12 event types: |
| | Inbound (external) | Payments Service | | ACH settlement events from the Payments Service. The ach-handler filters for 4 event types: |
| | Outbound (internal) | | Downstream consumers (e.g., other FloatMe services) | Internal subscription event stream. The kinesis-feeder Lambda reads every change from the |
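The inbound consumers above process only a fixed set of event types and skip everything else on the stream. A minimal sketch of that filtering step is shown below. The event type names and the `type` payload field are hypothetical placeholders, since the actual event types are not listed here; the only documented fact relied on is that Lambda delivers Kinesis record data base64-encoded under `kinesis.data`.

```python
import base64
import json
from typing import Any, Dict, FrozenSet, List

# Hypothetical placeholder names; the real event types are not listed in this page.
WANTED_TYPES: FrozenSet[str] = frozenset({"example.event.created", "example.event.cancelled"})


def select_events(
    event: Dict[str, Any],
    wanted: FrozenSet[str] = WANTED_TYPES,
) -> List[Dict[str, Any]]:
    """Decode Kinesis records and keep only payloads whose type is wanted."""
    selected: List[Dict[str, Any]] = []
    for record in event.get("Records", []):
        # Lambda passes the Kinesis payload as a base64-encoded string.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        # The "type" field name is an assumption about the payload shape.
        if payload.get("type") in wanted:
            selected.append(payload)
    return selected
```

Filtering inside the handler keeps the stream subscription simple, at the cost of invoking the function for records it will discard.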
EventBridge
| Rule / Pattern | Bus | Purpose |
|---|---|---|
| | Default event bus | Fires at 08:00 UTC Mon–Fri. Triggers |
| | Default event bus | Fires at 07:00 UTC Mon–Fri. Triggers |
| | Default event bus | Fires at 22:00 UTC Mon–Fri (17:00 CDT; 16:00 CST). Triggers |
| | Default event bus | Routes income detection events from the Insight Service where transaction amount is less than -$75.00 (i.e. |
| | Default event bus | Routes balance update events from the TXN Service feeder for main accounts with a non-negative balance to the |
| | Default event bus | Fires daily at 12:00 UTC. Triggers |
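The fire times above can be expressed in EventBridge's six-field cron syntax (minutes, hours, day-of-month, month, day-of-week, year), which is evaluated in UTC. The helpers below are a sketch of how those strings are formed; the function names are hypothetical, and the actual expressions used in deploy/ are not reproduced in this page.

```python
def _check_time(hour_utc: int, minute: int) -> None:
    """Validate that the hour and minute are within clock range."""
    if not 0 <= hour_utc <= 23:
        raise ValueError("hour_utc must be in the range 0..23")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be in the range 0..59")


def weekday_cron(hour_utc: int, minute: int = 0) -> str:
    """EventBridge cron expression for Monday to Friday at the given UTC time."""
    _check_time(hour_utc, minute)
    # Day-of-month is "?" because day-of-week is specified.
    return f"cron({minute} {hour_utc} ? * MON-FRI *)"


def daily_cron(hour_utc: int, minute: int = 0) -> str:
    """EventBridge cron expression for every day at the given UTC time."""
    _check_time(hour_utc, minute)
    # Day-of-week is "?" because day-of-month is specified.
    return f"cron({minute} {hour_utc} * * ? *)"


if __name__ == "__main__":
    for hour in (7, 8, 22):
        print(weekday_cron(hour))
    print(daily_cron(12))
```

Because the schedule is fixed in UTC, the local time of the 22:00 UTC run shifts by an hour between standard and daylight saving time.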
Secrets Manager
All secrets are namespaced by environment (site/…).
| Secret Path | Purpose |
|---|---|
| | Segment write key for analytics events. Used by api, collections-worker, ach-handler, webhook-worker, webhook-balance-worker, and batch-worker Lambdas. |
| | Iterable API key for transactional email and push notifications. Used by ach-handler and notifier-worker Lambdas. |
| | GrowthBook SDK key for feature flag evaluation. Used by api, collections-worker, webhook-worker, webhook-balance-worker, and notifier-worker Lambdas. |
| | AppsFlyer API key for mobile attribution events. Used by api, ach-handler, and webhook-balance-worker Lambdas. |
| | Datadog API and app keys. Used by Terraform only (not injected into Lambda environments) to configure Datadog SLOs and the service catalog entry. |
API Gateway
| Gateway | Auth | Purpose |
|---|---|---|
| | AWS IAM (SigV4) | Internal API for all subscription management operations. All routes ( |
Scheduled Jobs
| Lambda | Schedule | Purpose |
|---|---|---|
| | | Paginates through |
| | | Paginates through |
| | | Paginates through |
| | | Paginates through |
Monitoring
Datadog SLOs are defined across three dimensions for the subscription-service Lambdas. All SLOs use service:subscription-service and env:site tags.
Error SLO
[AWS][site-subscription-service] Lambda Errors SLO — 99.9% target / 99.99% warning over 7-day and 30-day windows.
Covers: collections-worker, collections-job, memberships, ach-handler, api, kinesis-feeder, notifier-scheduler, notifier-worker, webhook-worker.
Throughput SLO
[AWS][site-subscription-service] Lambda Throughput SLO — 99.9% target / 99.99% warning over 7-day and 30-day windows.
Covers: collections-job, memberships, ach-handler, api, kinesis-feeder, notifier-scheduler, webhook-worker.
Latency SLO
[AWS][site-subscription-service] Lambda Latency SLO — 99.9% target / 99.99% warning over 7-day and 30-day windows.
Covers: ach-handler, api, kinesis-feeder, notifier-scheduler, webhook-worker.
The Datadog service catalog entry (datadog_service_definition_yaml) registers the service as tier 1, team devops, with links to the GitHub source repo and the Antora documentation site at https://docs.floatme.io/subscription-service.
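To give a feel for what the 99.9% target and 99.99% warning thresholds mean in practice, the sketch below computes how many failed invocations each threshold tolerates for a given volume. It assumes the SLOs are count-based (good events over total events), which this page does not state, and the function name and the invocation volume in the example are hypothetical.

```python
from fractions import Fraction


def error_budget(total_events: int, target_percent: str) -> Fraction:
    """Return the number of bad events allowed by an SLO target.

    The target is passed as a string (e.g. "99.9") so it is parsed exactly
    rather than through binary floating point.
    """
    if total_events < 0:
        raise ValueError("total_events must not be negative")
    target = Fraction(target_percent)
    if not 0 <= target <= 100:
        raise ValueError("target_percent must be in the range 0..100")
    return total_events * (100 - target) / 100


if __name__ == "__main__":
    volume = 1_000_000  # hypothetical invocations in the SLO window
    for threshold in ("99.9", "99.99"):
        print(f"{threshold}% allows {error_budget(volume, threshold)} failed invocations")
```

Under this assumption, the warning threshold is ten times stricter than the target, so a warning fires once a tenth of the error budget has been spent.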
Terraform Structure
All infrastructure is defined in deploy/:
| File | Contents |
|---|---|
| | Terraform version constraint ( |
| | AWS provider config with a |
| | All configurable parameters: |
| | |
| | All SQS queues and their DLQs: |
| | |
| | |
| | |
| | EventBridge rules for |
| | Secrets Manager data source references for |
| | Datadog provider configuration (reads credentials from the |
| | (empty; no outputs defined) |
Related Pages
- Architecture: System context diagram and Lambda component overview
- Event Flows: EventBridge events published and consumed, SQS queue flow details
- ACH Processing: Kinesis-based ACH settlement callbacks
- Collections Engine: How scheduled and webhook collection runs use these queues
- DynamoDB Tables: Full schemas and access patterns for billing-activity and billing-activity-history