Fargate
The Float Service runs scheduled, long-running, or oversized jobs on Amazon ECS Fargate alongside its Lambda functions. Fargate is used where the 15-minute Lambda timeout is insufficient or where container-based packaging is preferable to the provided.al2023 Lambda runtime.
Current Tasks
| Task | Trigger | CPU | Memory | Purpose |
|---|---|---|---|---|
| | EventBridge Scheduler ( | 512 | 1024 | Batch job that pages all |
All tasks run on the shared site-floats ECS cluster, in awsvpc network mode, on the FARGATE launch type with platform version LATEST. Tasks run in the FloatMe private subnets with the PrivateSG security group and assign_public_ip = false.
Deploy Flow
Container builds happen in GitHub Actions (.github/workflows/deploy.yaml), not inside the devkit — the devkit container has no docker-in-docker. The Makefile exposes two host targets:
- make container-build-collections-jobs: docker buildx build --load, used for local smoke testing.
- make container-push-collections-jobs: docker buildx build --push, used by CI.

CONTAINER_REGISTRY and CONTAINER_TAG are overridable; CI sets CONTAINER_REGISTRY to the ECR registry from aws-actions/amazon-ecr-login (with /floats suffix) and CONTAINER_TAG to the value of TF_VAR_collections_jobs_image_tag.
CI Sequence
Target ordering (steady-state, after first-time ECR bootstrap):
- make build: produces Lambda artifacts.
- (push to main / release only) aws-actions/amazon-ecr-login and make container-push-collections-jobs: buildx build with --platform=linux/amd64 and --push to ECR.
- terraform init / terraform plan (on PRs) or terraform apply (on push to main / release). On apply, the task definition references the image tag that was just pushed.
|
The current |
Image Tag Selection
| Environment | Trigger | TF_VAR_service_version |
|---|---|---|
| | PR or push to | |
| | Release published | Full semver tag (e.g., |
Tags in ECR are immutable (image_tag_mutability = "IMMUTABLE"), so every CI run must push under a new tag and cannot overwrite a prior build.
Build-Time Inputs
The collections-jobs Dockerfile (cmd/collections-jobs/Dockerfile) is a two-stage build:
- Builder: ghcr.io/floatme-corp/golang:1.26-alpine. Pulls module dependencies using a --mount=type=secret,id=github_token secret mount injected by Make (sourced from GITHUB_TOKEN), which is needed to clone private github.com/floatme-corp modules. Compiles the binary with CGO_ENABLED=0 GOOS=linux GOARCH=amd64. Build version metadata is injected via -ldflags from the GIT_VERSION, GIT_COMMIT, GIT_COMMIT_DATE, and GIT_COMMIT_TIMESTAMP build args.
- Runtime: gcr.io/distroless/static-debian12:nonroot. The compiled binary is copied to /usr/local/bin/collections-jobs and used as the image entrypoint.
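The -ldflags injection described above can be sketched as follows. This is a minimal illustration, not the service's actual source: the variable names (gitVersion, gitCommit, and so on) and the output format are hypothetical, and the only thing taken from this page is that four build args feed the linker flags.

```go
package main

import "fmt"

// Hypothetical package-level variables. The linker can overwrite string
// variables like these at build time, for example:
//
//	go build -ldflags "-X main.gitVersion=$GIT_VERSION -X main.gitCommit=$GIT_COMMIT \
//	  -X main.gitCommitDate=$GIT_COMMIT_DATE -X main.gitCommitTimestamp=$GIT_COMMIT_TIMESTAMP"
//
// The defaults below apply when the binary is built without those flags.
var (
	gitVersion         = "dev"
	gitCommit          = "unknown"
	gitCommitDate      = "unknown"
	gitCommitTimestamp = "0"
)

// versionString renders the injected metadata as a single line.
func versionString() string {
	return fmt.Sprintf("collections-jobs %s (commit %s, %s, ts %s)",
		gitVersion, gitCommit, gitCommitDate, gitCommitTimestamp)
}

func main() {
	fmt.Println(versionString())
}
```

The -X flag only works on string variables, which is why the timestamp is kept as a string here rather than an integer.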
ECR Repositories
One repository per binary. Repositories are owned by this service and never destroyed by Terraform (lifecycle.prevent_destroy = true) so historical builds remain available for rollback.
| Repository | Notes |
|---|---|
| | Container image for the collections-jobs binary. |
Lifecycle policy on each repository:
- Untagged images older than 7 days are expired (priority 1).
- The most recent 5 images of any tag status are retained; older images are expired (priority 2).
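For reference, the two rules above correspond to an ECR lifecycle policy document shaped roughly like the one this sketch prints. The real policy is defined in Terraform; the Go types and the description strings here are illustrative assumptions, while the field names follow the ECR lifecycle policy JSON format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// The types below mirror the ECR lifecycle policy JSON document.
type selection struct {
	TagStatus   string `json:"tagStatus"`
	CountType   string `json:"countType"`
	CountUnit   string `json:"countUnit,omitempty"`
	CountNumber int    `json:"countNumber"`
}

type action struct {
	Type string `json:"type"`
}

type rule struct {
	RulePriority int       `json:"rulePriority"`
	Description  string    `json:"description"`
	Selection    selection `json:"selection"`
	Action       action    `json:"action"`
}

type policy struct {
	Rules []rule `json:"rules"`
}

// lifecyclePolicy builds the two rules described on this page.
// Description strings are illustrative, not copied from the real policy.
func lifecyclePolicy() policy {
	return policy{Rules: []rule{
		{
			RulePriority: 1,
			Description:  "Expire untagged images older than 7 days",
			Selection: selection{
				TagStatus:   "untagged",
				CountType:   "sinceImagePushed",
				CountUnit:   "days",
				CountNumber: 7,
			},
			Action: action{Type: "expire"},
		},
		{
			RulePriority: 2,
			Description:  "Keep only the 5 most recent images",
			Selection: selection{
				TagStatus:   "any",
				CountType:   "imageCountMoreThan",
				CountNumber: 5,
			},
			Action: action{Type: "expire"},
		},
	}}
}

func main() {
	out, err := json.MarshalIndent(lifecyclePolicy(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```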
ECS Cluster
A single cluster per environment hosts all Fargate tasks for the service.
- Name: site-floats (e.g., prod-floats).
- No capacity providers configured; all tasks specify FARGATE at run time.
IAM Roles
| Role | Purpose |
|---|---|
| | Task execution role attached to every task definition. Trusts |
| | Task role assumed by the collections-jobs container itself. Trusts |
| | Role assumed by EventBridge Scheduler to launch Fargate tasks. Trusts |
Task Definitions
| Family | Notes |
|---|---|
| | Three containers: |
Logging
The app container uses the awsfirelens log driver, routed by the FireLens log-router sidecar (public.ecr.aws/aws-observability/aws-for-fluent-bit:stable) directly to the Datadog logs intake at http-intake.logs.datadoghq.com. No CloudWatch Logs group is created. The Datadog API key is read from the site/datadog Secrets Manager secret (api_key JSON field) by ECS at container start via secretOptions.
Datadog tagging applied to log events:
- dd_service = collections-jobs
- dd_source = go
- dd_tags = env:site,application:floats
- provider = ecs
APM and Metrics
The datadog-agent sidecar (public.ecr.aws/datadog/agent:7) provides APM trace ingestion (port 8126/tcp) and DogStatsD (port 8125/udp). Both ports listen on localhost only; in awsvpc mode all containers in the task share a network namespace, so the app container reaches the agent at localhost.
Sidecar configuration:
- ECS_FARGATE = true: required for the agent to discover task metadata via the ECS metadata endpoint instead of expecting a node-level agent.
- DD_APM_ENABLED = true, DD_APM_NON_LOCAL_TRAFFIC = true: accept trace submissions from other containers in the task.
- DD_DOGSTATSD_NON_LOCAL_TRAFFIC = true: accept StatsD from other containers.
- DD_API_KEY: sourced from the same site/datadog secret as the FireLens config.
- Marked essential = false so a flaky agent does not fail the task; the app container has dependsOn with condition HEALTHY, gating startup on the agent’s healthcheck (the agent health command).
App container env vars (set on collections-jobs):
- DD_AGENT_HOST = localhost, DD_TRACE_AGENT_PORT = 8126
- DD_SERVICE = collections-jobs, DD_ENV = site, DD_VERSION = {var.service_version}
- DD_TRACE_ENABLED = true
These let the Go tracer (gopkg.in/DataDog/dd-trace-go.v1) auto-configure traces and correlate them with the FireLens-shipped logs (same dd_service).
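To make the role of DD_AGENT_HOST and DD_TRACE_AGENT_PORT concrete, the sketch below derives the trace agent address from them using only the standard library. It is an illustration of what the variables mean, not how dd-trace-go resolves its agent address internally; the traceAgentAddr helper and its fallback defaults are assumptions introduced here.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// traceAgentAddr builds a host:port pair from the two environment variables
// set on the app container. getenv is injected so the lookup can be
// exercised without touching the real process environment.
// The fallbacks (localhost, 8126) are assumptions that match the values
// documented on this page.
func traceAgentAddr(getenv func(string) string) string {
	host := getenv("DD_AGENT_HOST")
	if host == "" {
		host = "localhost"
	}
	port := getenv("DD_TRACE_AGENT_PORT")
	if port == "" {
		port = "8126"
	}
	return net.JoinHostPort(host, port)
}

func main() {
	fmt.Println("trace agent:", traceAgentAddr(os.Getenv))
}
```

Because all containers in an awsvpc task share a network namespace, the resulting localhost address reaches the datadog-agent sidecar without any service discovery.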
The task is provisioned at 512 CPU units / 1024 MB total, split across the app, FireLens sidecar, and Datadog Agent. Revisit if worker count or page size is scaled up significantly.
Scheduled Jobs
EventBridge Scheduler (not classic CloudWatch Events / EventBridge rules) is used for Fargate task scheduling. Its native timezone support handles DST automatically, so we don’t need to recompute UTC offsets twice a year.
| Schedule | Cron | Purpose |
|---|---|---|
| | | Triggers the day-before-ach Fargate task at 19:00 ET on weekdays, approximately two hours before the Usio 21:00 ET ACH cutoff. |
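The DST point can be checked with a short program: the same 19:00 wall-clock time in Eastern Time lands on different UTC hours in winter and summer, which is the offset a UTC-only cron expression would have to be edited for twice a year. The America/New_York zone name is an assumption standing in for "ET"; the schedule's actual timezone setting is not shown on this page.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed the IANA database so LoadLocation works without system zoneinfo
)

// utcHourFor returns the UTC hour at which 19:00 in America/New_York falls
// on the given calendar date.
func utcHourFor(year int, month time.Month, day int) int {
	loc, err := time.LoadLocation("America/New_York")
	if err != nil {
		panic(err)
	}
	return time.Date(year, month, day, 19, 0, 0, 0, loc).UTC().Hour()
}

func main() {
	// Standard time (UTC-5): 19:00 ET is 00:00 UTC on the following day.
	fmt.Println("January 15:", utcHourFor(2025, time.January, 15))
	// Daylight time (UTC-4): 19:00 ET is 23:00 UTC on the same day.
	fmt.Println("July 15:", utcHourFor(2025, time.July, 15))
}
```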
Subcommands and Flags
The collections-jobs binary dispatches on the first positional argument. Currently only one subcommand is implemented:
day-before-ach
Runs the day-before-ACH batch. The Fargate task definition passes ["day-before-ach"] as the container command. All flags have defaults suitable for production; override them in the task definition environment or via CLI.
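Dispatching on the first positional argument could look like the sketch below. The run and runDayBeforeACH functions, the usage text, and the exit codes are hypothetical; only the day-before-ach subcommand name and the ["day-before-ach"] container command come from this page.

```go
package main

import (
	"fmt"
	"os"
)

// run dispatches on the first positional argument and returns an exit code.
func run(args []string) int {
	if len(args) == 0 {
		fmt.Fprintln(os.Stderr, "usage: collections-jobs <subcommand> [flags]")
		return 2
	}
	switch args[0] {
	case "day-before-ach":
		return runDayBeforeACH(args[1:])
	default:
		fmt.Fprintf(os.Stderr, "unknown subcommand %q\n", args[0])
		return 2
	}
}

// runDayBeforeACH is a stand-in for the real batch; it only reports how many
// extra arguments (flags) it received.
func runDayBeforeACH(args []string) int {
	fmt.Println("day-before-ach invoked with", len(args), "extra argument(s)")
	return 0
}

func main() {
	args := os.Args[1:]
	if len(args) == 0 {
		// For this sketch only: fall back to the command the task
		// definition passes, so the example runs without arguments.
		args = []string{"day-before-ach"}
	}
	if code := run(args); code != 0 {
		os.Exit(code)
	}
}
```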
| Flag | Default | Effect |
|---|---|---|
| | | Payment service ACH submissions per second (token-bucket rate limiter shared across workers). Matches the Lambda path’s |
| | | Number of goroutines consuming from the producer channel. Matches the Lambda path’s |
| | | Number of floats per RDS page query. |
| | | Producer→worker channel buffer depth. Defaults to the worker count. |
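The four settings in the table interact as in the sketch below: a producer reads IDs page by page, pushes them into a buffered channel, and a fixed number of workers drain it while sharing one token-bucket limiter. This is a hand-rolled, standard-library stand-in under stated assumptions, not the service's actual limiter or pipeline; process, tokenBucket, and the parameter names are hypothetical, and a counter replaces the real ACH submission.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// tokenBucket is a minimal token bucket with a burst of one: a refill
// goroutine adds a token every 1/perSecond seconds, dropping it if full.
type tokenBucket struct {
	tokens chan struct{}
	stop   chan struct{}
}

// newTokenBucket assumes perSecond is positive.
func newTokenBucket(perSecond int) *tokenBucket {
	tb := &tokenBucket{tokens: make(chan struct{}, 1), stop: make(chan struct{})}
	tb.tokens <- struct{}{} // start with one token available
	go func() {
		t := time.NewTicker(time.Second / time.Duration(perSecond))
		defer t.Stop()
		for {
			select {
			case <-t.C:
				select {
				case tb.tokens <- struct{}{}:
				default: // bucket already full
				}
			case <-tb.stop:
				return
			}
		}
	}()
	return tb
}

func (tb *tokenBucket) Wait()  { <-tb.tokens }
func (tb *tokenBucket) Close() { close(tb.stop) }

// process pages through ids, fans them out to workers over a buffered
// channel, and returns how many items were "submitted".
// pageSize and perSecond are assumed to be positive.
func process(ids []int, pageSize, workers, buffer, perSecond int) int64 {
	limiter := newTokenBucket(perSecond)
	defer limiter.Close()

	ch := make(chan int, buffer)
	var submitted int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range ch {
				limiter.Wait()                 // one shared limiter caps the total rate
				atomic.AddInt64(&submitted, 1) // stand-in for one ACH submission
			}
		}()
	}

	// Producer: one slice window per "page query".
	for start := 0; start < len(ids); start += pageSize {
		end := start + pageSize
		if end > len(ids) {
			end = len(ids)
		}
		for _, id := range ids[start:end] {
			ch <- id
		}
	}
	close(ch)
	wg.Wait()
	return atomic.LoadInt64(&submitted)
}

func main() {
	n := process(make([]int, 20), 7, 4, 4, 1000)
	fmt.Println("submitted:", n)
}
```

Because the limiter is shared rather than per worker, adding workers increases concurrency but not the overall submission rate.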
Terraform Files
| File | Contents |
|---|---|
| | ECS cluster, task execution IAM role, Datadog Secrets Manager data source, and execution-role policy attachments (AWS-managed plus the Datadog secret read). |
| | ECR repository and lifecycle policy for the collections-jobs binary. Image tag is |
| | Collections-jobs task role and the |
| | EventBridge Scheduler schedule for |
Related Pages
- Infrastructure: Lambda functions, DynamoDB tables, SQS queues, and other AWS resources
- Collections Engine: Scheduled and event-driven collection flows
- ACH Processing: ACH settlement callbacks and Usio cutoff context