
Architecture Decision Records (ADR)

Project: Forma3D.Connect
Version: 4.5
Last Updated: March 22, 2026

This document captures the significant architectural decisions made during the development of Forma3D.Connect.


Table of Contents

  1. ADR-001: Monorepo with Nx
  2. ADR-002: NestJS for Backend Framework
  3. ADR-003: React 19 with Vite for Frontend
  4. ADR-004: PostgreSQL with Prisma ORM
  5. ADR-005: TypeScript Strict Mode
  6. ADR-006: Azure DevOps for CI/CD with Digital Ocean Hosting
  7. ADR-007: Layered Architecture with Repository Pattern
  8. ADR-008: Event-Driven Internal Communication
  9. ADR-009: OpenAPI/Swagger for API Documentation
  10. ADR-010: HMAC Verification for Webhooks
  11. ADR-011: Idempotent Webhook Processing
  12. ADR-012: Assembly Parts Model for Product Mapping
  13. ADR-013: Shared Domain Library
  14. ADR-014: SimplyPrint as Unified Print Farm Controller
  15. ADR-015: Aikido Security Platform (Superseded)
  16. ADR-016: Sentry Observability with OpenTelemetry
  17. ADR-017: Docker + Traefik Deployment Strategy
  18. ADR-018: Nx Affected Conditional Deployment Strategy
  19. ADR-019: SimplyPrint Webhook Verification
  20. ADR-020: Hybrid Status Monitoring (Polling + Webhooks)
  21. ADR-021: Retry Queue with Exponential Backoff
  22. ADR-022: Event-Driven Fulfillment Architecture
  23. ADR-023: Email Notification Strategy
  24. ADR-024: API Key Authentication for Admin Endpoints
  25. ADR-025: Cosign Image Signing for Supply Chain Security
  26. ADR-026: CycloneDX SBOM Attestations
  27. ADR-027: TanStack Query for Server State Management
  28. ADR-028: Socket.IO for Real-Time Dashboard Updates
  29. ADR-029: API Key Authentication for Dashboard
  30. ADR-030: Sendcloud for Shipping Integration
  31. ADR-031: Automated Container Registry Cleanup
  32. ADR-032: Domain Boundary Separation with Interface Contracts
  33. ADR-033: Database-Backed Webhook Idempotency
  34. ADR-034: Docker Log Rotation & Resource Cleanup
  35. ADR-035: Progressive Web App (PWA) for Cross-Platform Access
  36. ADR-036: localStorage Fallback for PWA Install Detection
  37. ADR-037: Keep a Changelog for Release Documentation
  38. ADR-038: Zensical for Publishing Project Documentation
  39. ADR-039: Global API Key Authentication (Fail-Closed)
  40. ADR-040: Shopify Order Backfill for Downtime Recovery
  41. ADR-041: SimplyPrint Webhook Idempotency and Job Reconciliation
  42. ADR-042: SendCloud Webhook Integration for Shipment Status Updates
  43. ADR-043: PWA Version Mismatch Detection
  44. ADR-044: Role-Based Access Control and Tenant-Ready Architecture
  45. ADR-045: pgAdmin for Staging Database Administration
  46. ADR-046: PostgreSQL Session Store for Persistent Authentication
  47. ADR-047: Two-Tier Logging Strategy (Application + Business Events)
  48. ADR-048: Shopify OAuth 2.0 Authentication
  49. ADR-049: Optional SKU with Shopify Product/Variant ID Matching Priority
  50. ADR-050: Apache ECharts for Dashboard Analytics
  51. ADR-051: Decompose Monolithic API into Domain-Aligned Microservices
  52. ADR-052: BullMQ Event Queues for Inter-Service Async Communication
  53. ADR-053: Buffer-Based GridFlock Pipeline (No Local File Storage)
  54. ADR-054: SimplyPrint API Files for Gcode Upload
  55. ADR-055: BambuStudio CLI Slicer Container
  56. ADR-056: Redis for Sessions, Event Queues, and Socket.IO Adapter
  57. ADR-057: Self-Hosted Build Agent with Hybrid Pipeline Strategy
  58. ADR-058: Self-Hosted Log Infrastructure (ClickHouse + Grafana via OpenTelemetry)
  59. ADR-059: Nx Affected Resilience via Last-Successful-Deploy Tag
  60. ADR-060: Single Source of Truth for STL Preview Generation
  61. ADR-061: Plate-Level Preview Cache with Dynamic Border Assembly
  62. ADR-062: Inventory Tracking and Stock Replenishment
  63. ADR-063: ORDER-over-STOCK Print Queue Priority
  64. ADR-064: Stock Replenishment Event Subscriber for SimplyPrint Queue
  65. ADR-065: SonarCloud for Continuous Code Quality Analysis
  66. ADR-066: CodeCharta City Visualization for Codebase Insight
  67. ADR-067: Grype CVE Scanning with EPSS-Informed Risk Acceptance
  68. ADR-068: Dependency License Compliance Check
  69. ADR-069: Agent CLAUDE.md Governance — Repo as Source of Truth
  70. ADR-070: Per-Agent Claude Model Selection

ADR-001: Monorepo with Nx

Attribute Value
ID ADR-001
Status Accepted
Date 2026-01-09
Context Need to manage multiple applications (API, Web, Desktop, Mobile) and shared libraries in a single repository

Decision

Use Nx (v19.x) as the monorepo management tool with pnpm as the package manager.

Rationale

  • Unified tooling: Single command to build, test, lint all projects
  • Dependency graph: Nx understands project dependencies and can run only affected tests
  • Caching: Local and remote caching speeds up CI/CD pipelines
  • Code sharing: Shared libraries (@forma3d/domain, @forma3d/utils, etc.) are first-class citizens
  • Plugin ecosystem: Built-in support for NestJS, React, and other frameworks

Consequences

  • ✅ Fast CI through affected commands and caching
  • ✅ Consistent tooling across all projects
  • ✅ Easy code sharing via path aliases
  • ⚠️ Learning curve for developers unfamiliar with Nx
  • ⚠️ Initial setup complexity

Alternatives Considered

Alternative Reason for Rejection
Turborepo Less mature NestJS support
Lerna Deprecated in favor of Nx
Separate repositories Too much overhead for shared code

ADR-002: NestJS for Backend Framework

Attribute Value
ID ADR-002
Status Accepted
Date 2026-01-09
Context Need a robust, scalable backend framework for the integration API

Decision

Use NestJS (v10.x) as the backend framework.

Rationale

  • Enterprise-grade: Built-in support for dependency injection, modules, guards, interceptors
  • TypeScript-first: Native TypeScript support with decorators
  • Modular architecture: Easy to organize code by feature
  • Excellent documentation: Well-documented with active community
  • Testing support: Built-in testing utilities with Jest
  • OpenAPI support: First-class Swagger/OpenAPI integration via @nestjs/swagger

Consequences

  • ✅ Clean, maintainable code structure
  • ✅ Easy to add new features as modules
  • ✅ Built-in validation with class-validator
  • ✅ Excellent integration with Prisma
  • ⚠️ Verbose compared to Express.js
  • ⚠️ Decorator-heavy syntax

Alternatives Considered

Alternative Reason for Rejection
Express.js Too low-level, lacks structure
Fastify Less ecosystem support
Hono Too new, less enterprise features

ADR-003: React 19 with Vite for Frontend

Attribute Value
ID ADR-003
Status Accepted
Date 2026-01-09
Context Need a modern frontend framework for the admin dashboard

Decision

Use React 19 with Vite as the bundler and Tailwind CSS for styling.

Rationale

  • React 19: Latest version with improved performance and new features
  • Vite: Extremely fast development server and build times
  • Tailwind CSS: Utility-first CSS for rapid UI development
  • TanStack Query: Excellent server state management
  • React Router: Standard routing solution

Consequences

  • ✅ Fast development experience with HMR
  • ✅ Modern React features (Server Components ready)
  • ✅ Consistent styling with Tailwind
  • ⚠️ Tailwind learning curve for traditional CSS developers

Alternatives Considered

Alternative Reason for Rejection
Next.js Overkill for admin dashboard, SSR not needed
Angular Less flexibility, steeper learning curve
Vue.js Team expertise in React

ADR-004: PostgreSQL with Prisma ORM

Attribute Value
ID ADR-004
Status Accepted
Date 2026-01-09
Context Need a reliable database with type-safe access

Decision

Use PostgreSQL 16 as the database with Prisma 5 as the ORM.

Rationale

  • PostgreSQL: Robust, ACID-compliant, excellent JSON support
  • Prisma: Type-safe database access, auto-generated client
  • Schema-first: Prisma schema as single source of truth
  • Migrations: Built-in migration system
  • Studio: Visual database browser for development

Consequences

  • ✅ Full type safety from database to API
  • ✅ Easy schema changes with migrations
  • ✅ No raw SQL in application code
  • ⚠️ Prisma Client must be regenerated after schema changes
  • ⚠️ Some complex queries require raw SQL

Schema Design Decisions

  • UUIDs for primary keys (portability, no sequence conflicts)
  • JSON columns for flexible data (shipping address, print profiles)
  • Decimal type for monetary values (precision)
  • Timestamps with timezone (audit trail)

ADR-005: TypeScript Strict Mode

Attribute Value
ID ADR-005
Status Accepted
Date 2026-01-09
Context Need to ensure code quality and catch errors early

Decision

Enable TypeScript strict mode with additional strict checks:

{
  "strict": true,
  "noImplicitAny": true,
  "strictNullChecks": true,
  "noUnusedLocals": true,
  "noUnusedParameters": true
}

Rationale

  • Early error detection: Catch type errors at compile time
  • Self-documenting code: Types serve as documentation
  • Refactoring safety: IDE can safely refactor with full type information
  • No any type: Prevents type escape hatches

Consequences

  • ✅ Higher code quality
  • ✅ Better IDE support and autocomplete
  • ✅ Safer refactoring
  • ⚠️ More verbose code with explicit types
  • ⚠️ Stricter null checking requires careful handling
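
As a small illustration of what strictNullChecks enforces (a hedged sketch; formatEmail is a hypothetical helper, not project code):

```typescript
// With strictNullChecks, a possibly-null value must be narrowed before use.
function formatEmail(email: string | null): string {
  // Returning email.toLowerCase() directly would be a compile error:
  // "'email' is possibly 'null'."
  if (email === null) {
    return '(no email on file)';
  }
  return email.toLowerCase();
}

console.log(formatEmail('Jane@Example.com')); // jane@example.com
console.log(formatEmail(null)); // (no email on file)
```

The null branch is forced by the compiler, which is exactly the "careful handling" the consequence above refers to.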

ADR-006: Azure DevOps for CI/CD with Digital Ocean Hosting

Attribute Value
ID ADR-006
Status Accepted
Date 2026-01-09
Context Need a CI/CD pipeline for automated testing and deployment

Decision

Use Azure DevOps Pipelines for CI/CD and Digital Ocean for hosting.

Rationale

  • Azure DevOps: Existing team expertise with YAML pipelines
  • Digital Ocean: Cost-effective, simple infrastructure for small-scale deployment
  • Separation of concerns: CI/CD tooling separate from hosting
  • Docker-based: Consistent container deployment across environments
  • Managed Database: Digital Ocean managed PostgreSQL for reliability

Infrastructure

Component Service Purpose
CI/CD Azure DevOps Pipelines Build, test, deploy automation
Container Registry Digital Ocean Registry Docker image storage
Staging Digital Ocean Droplet Staging environment
Production Digital Ocean Droplet Production environment
Database Digital Ocean Managed PostgreSQL Data persistence

Pipeline Stages

  1. Validate & Test: Lint, type check, and unit tests (parallel across 3 agents)
  2. Build & Package: Detect affected, build projects, Docker images on self-hosted agents
  3. Deploy Staging: Auto-deploy affected services on main branch
  4. Acceptance Test: Playwright tests against staging
  5. Deploy Production: Manual approval gate

Updated Feb 2026: Stages merged and agents optimized per ADR-057.

Consequences

  • ✅ Automated quality gates
  • ✅ Fast feedback on PRs
  • ✅ Consistent deployments
  • ✅ Cost-effective hosting
  • ⚠️ Need to manage Docker deployments on Droplets

ADR-007: Layered Architecture with Repository Pattern

Attribute Value
ID ADR-007
Status Accepted
Date 2026-01-09
Context Need a clean separation of concerns in the backend

Decision

Implement a layered architecture with the Repository Pattern:

(UML diagram omitted)

Layer Responsibilities

Layer Responsibility Example
Controller HTTP handling, validation, routing OrdersController
Service Business logic, orchestration OrdersService
Repository Data access, Prisma queries OrdersRepository
DTO Data transfer, validation CreateOrderDto

Rationale

  • Testability: Each layer can be tested in isolation
  • Single responsibility: Clear separation of concerns
  • Flexibility: Easy to swap implementations (e.g., different databases)
  • Maintainability: Changes in one layer don't affect others

Consequences

  • ✅ Clean, maintainable code
  • ✅ Easy to unit test with mocks
  • ✅ Prisma isolated to repository layer
  • ⚠️ More files per feature
  • ⚠️ Some boilerplate code
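
The layering can be sketched in plain TypeScript (framework decorators and the actual Order/OrdersRepository shapes omitted; names here are illustrative, not the project's real code):

```typescript
// Minimal illustrative entity.
interface Order {
  id: string;
  status: string;
}

// Repository abstraction: the only layer that would touch Prisma in the real app.
interface OrdersRepository {
  findById(id: string): Promise<Order | null>;
  updateStatus(id: string, status: string): Promise<Order>;
}

// Service: business logic, depends only on the repository interface.
class OrdersService {
  constructor(private readonly repo: OrdersRepository) {}

  async cancelOrder(id: string): Promise<Order> {
    const order = await this.repo.findById(id);
    if (!order) throw new Error(`Order ${id} not found`);
    return this.repo.updateStatus(id, 'CANCELLED');
  }
}

// In unit tests, an in-memory repository replaces Prisma entirely.
class InMemoryOrdersRepository implements OrdersRepository {
  private orders = new Map<string, Order>([['o1', { id: 'o1', status: 'NEW' }]]);

  async findById(id: string): Promise<Order | null> {
    return this.orders.get(id) ?? null;
  }

  async updateStatus(id: string, status: string): Promise<Order> {
    const order = { ...this.orders.get(id)!, status };
    this.orders.set(id, order);
    return order;
  }
}
```

Swapping InMemoryOrdersRepository for the Prisma-backed implementation is the "easy to swap implementations" benefit listed above.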

ADR-008: Event-Driven Internal Communication

Attribute Value
ID ADR-008
Status ✅ Implemented
Date 2026-01-09 (Updated: 2026-01-25)
Context Need to decouple components and trigger actions on state changes

Decision

Use NestJS EventEmitter for internal event-driven communication.

Events Defined

Order Events (ORDER_EVENTS)

Event Trigger Listeners
order.created New order from Shopify webhook OrchestrationService, EventsGateway, PushService
order.status_changed Order status update EventsGateway, PushService
order.cancelled Order cancellation CancellationService, EventsGateway
order.ready-for-fulfillment All print jobs completed SendcloudService, FulfillmentService, EventsGateway
order.fulfilled Order shipped EventsGateway
order.failed Order processing failed EventsGateway
Print Job Events

Event Trigger Listeners
printjob.created Print job created in SimplyPrint EventsGateway
printjob.status-changed Print job status update OrchestrationService, EventsGateway, PushService
printjob.completed Print job finished successfully OrchestrationService, EventsGateway
printjob.failed Print job failed OrchestrationService, EventsGateway, NotificationsService
printjob.cancelled Print job cancelled EventsGateway
printjob.retry-requested Print job retry initiated (EventLogService)

Orchestration Events (ORCHESTRATION_EVENTS)

Event Trigger Listeners
order.ready-for-fulfillment All print jobs for order complete SendcloudService, FulfillmentService
order.partially-completed Some jobs complete, some pending (Logging)
order.all-jobs-failed All print jobs for order failed (Logging)

SimplyPrint Events (SIMPLYPRINT_EVENTS)

Event Trigger Listeners
simplyprint.job-status-changed SimplyPrint webhook/poll update PrintJobsService

Shipment Events (SHIPMENT_EVENTS)

Event Trigger Listeners
shipment.created Shipment created FulfillmentService, PushService
shipment.label-ready Shipping label downloaded PushService
shipment.failed Shipment creation failed (Logging)
shipment.updated Shipment status update (Logging)

SendCloud Webhook Events (SENDCLOUD_WEBHOOK_EVENTS)

Event Trigger Listeners
sendcloud.shipment.status_changed SendCloud webhook/reconciliation ShipmentsService

Fulfillment Events (FULFILLMENT_EVENTS)

Event Trigger Listeners
fulfillment.created Shopify fulfillment created (Logging)
fulfillment.failed Shopify fulfillment failed NotificationsService
fulfillment.retrying Fulfillment retry in progress (Logging)

Event Flow Diagram

Shopify Webhook → OrdersService → order.created
                                       ↓
                              OrchestrationService
                                       ↓
                              PrintJobsService → printjob.created
                                       ↓
                              SimplyPrint API

SimplyPrint Webhook → SimplyPrintService → simplyprint.job-status-changed
                                                  ↓
                                          PrintJobsService → printjob.status-changed
                                                  ↓                    ↓
                                          printjob.completed    printjob.failed
                                                  ↓                    ↓
                                          OrchestrationService ←───────┘
                                                  ↓
                                          order.ready-for-fulfillment
                                                  ↓
                                ┌─────────────────┴─────────────────┐
                                ↓                                   ↓
                        SendcloudService                   FulfillmentService
                                ↓                                   ↓
                        shipment.created ──────────────────→ Shopify Fulfillment
                                ↓
                        SendCloud Webhook → sendcloud.shipment.status_changed

Rationale

  • Decoupling: Services don't directly depend on each other
  • Extensibility: Easy to add new listeners
  • Async processing: Events can be processed asynchronously
  • Audit trail: Events naturally support logging
  • Orchestration: Clean separation between job creation and completion tracking
  • Real-time updates: EventsGateway broadcasts to dashboard via Socket.IO
  • Push notifications: PushService sends alerts to subscribed PWA clients

Consequences

  • ✅ Loose coupling between modules
  • ✅ Easy to add new functionality
  • ✅ Clear event flow
  • ✅ Enables reactive order completion tracking
  • ✅ Real-time dashboard updates via Socket.IO
  • ✅ Push notifications for mobile/desktop PWA
  • ⚠️ Harder to trace execution flow (mitigated by correlation IDs and logging)
  • ⚠️ Eventual consistency considerations
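
The decoupling above can be sketched with Node's built-in EventEmitter (the real app uses @nestjs/event-emitter with @OnEvent decorators; the event name matches the tables above, the listener bodies are illustrative):

```typescript
import { EventEmitter } from 'node:events';

const events = new EventEmitter();
const audit: string[] = [];

// Listeners register independently — the producer never knows who consumes.
events.on('printjob.completed', (jobId: string) => {
  audit.push(`orchestration: job ${jobId} complete`);
});
events.on('printjob.completed', (jobId: string) => {
  audit.push(`gateway: broadcast job ${jobId}`);
});

// The producer just emits; adding a new listener requires no change here.
events.emit('printjob.completed', 'job-42');
```

Adding, say, a notification listener later is a one-line registration, which is the extensibility benefit claimed above.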

ADR-009: OpenAPI/Swagger for API Documentation

Attribute Value
ID ADR-009
Status Accepted
Date 2026-01-10
Context Need interactive API documentation for developers

Decision

Use @nestjs/swagger for OpenAPI 3.0 documentation with Swagger UI.

Implementation

  • Swagger UI: Available at /api/docs
  • OpenAPI JSON: Available at /api/docs-json
  • Environment restriction: Only enabled in non-production
  • Decorator-based: All endpoints documented via decorators

Decorators Used

Decorator Purpose
@ApiTags Group endpoints by feature
@ApiOperation Describe endpoint purpose
@ApiResponse Document response schemas
@ApiProperty Document DTO properties
@ApiParam Document path parameters
@ApiQuery Document query parameters

Consequences

  • ✅ Interactive API testing
  • ✅ Auto-generated documentation
  • ✅ Type-safe documentation
  • ⚠️ Must keep decorators in sync with code

ADR-010: HMAC Verification for Webhooks

Attribute Value
ID ADR-010
Status Accepted
Date 2026-01-09
Context Need to verify webhook requests are genuinely from Shopify

Decision

Implement HMAC-SHA256 signature verification for all Shopify webhooks.

Implementation

// ShopifyWebhookGuard (excerpt)
const hash = crypto.createHmac('sha256', webhookSecret).update(rawBody, 'utf8').digest('base64');

// timingSafeEqual throws if the buffers differ in length, so compare lengths first
const expected = Buffer.from(hash);
const received = Buffer.from(hmacHeader);
return expected.length === received.length && crypto.timingSafeEqual(expected, received);

Rationale

  • Security: Prevents forged webhook requests
  • Shopify standard: Required by Shopify webhook specification
  • Timing-safe comparison: Prevents timing attacks
  • Raw body access: NestJS configured to preserve raw body

Consequences

  • ✅ Secure webhook endpoint
  • ✅ Compliant with Shopify requirements
  • ⚠️ Requires raw body access (special NestJS configuration)

ADR-011: Idempotent Webhook Processing

Attribute Value
ID ADR-011
Status Accepted
Date 2026-01-09
Context Shopify may send duplicate webhooks; need to handle gracefully

Decision

Implement idempotent webhook processing using:

  1. Webhook ID tracking (in-memory Set)
  2. Database unique constraints (shopifyOrderId)

Implementation

// ShopifyService
private readonly processedWebhooks = new Set<string>();

if (this.processedWebhooks.has(webhookId)) {
  return; // Skip duplicate
}
this.processedWebhooks.add(webhookId);

// OrdersService
const existing = await this.ordersRepository.findByShopifyOrderId(id);
if (existing) {
  return existing; // Return existing, don't create duplicate
}

Consequences

  • ✅ No duplicate orders created
  • ✅ Safe to retry failed webhooks
  • ⚠️ In-memory Set resets on restart (database constraint is primary guard)

ADR-012: Assembly Parts Model for Product Mapping

Attribute Value
ID ADR-012
Status Accepted
Date 2026-01-09
Context A single Shopify product may require multiple 3D printed parts

Decision

Implement ProductMapping → AssemblyPart one-to-many relationship.

Data Model

(UML diagram omitted)

Fields

  • ProductMapping: shopifyProductId, SKU (optional), defaultPrintProfile
  • AssemblyPart: partName, partNumber, simplyPrintFileId, quantityPerProduct

Rationale

  • Flexibility: Supports both single-part and multi-part products
  • Quantity support: quantityPerProduct for parts needed multiple times (e.g., 4 wheels)
  • Print profiles: Override default profile per part if needed

Consequences

  • ✅ Supports complex assemblies
  • ✅ Clear part ordering via partNumber
  • ✅ Flexible print settings per part
  • ⚠️ More complex order processing logic
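
The quantity logic can be made concrete with a short sketch (types mirror the fields listed above; the names and sample data are illustrative):

```typescript
// Illustrative shapes based on the fields listed above.
interface AssemblyPart {
  partName: string;
  partNumber: number;
  simplyPrintFileId: string;
  quantityPerProduct: number;
}

interface ProductMapping {
  shopifyProductId: string;
  sku?: string;
  parts: AssemblyPart[];
}

// Total printed parts for a line item:
// sum over parts of (quantityPerProduct × ordered quantity).
function printJobCount(mapping: ProductMapping, orderedQty: number): number {
  return mapping.parts.reduce((sum, p) => sum + p.quantityPerProduct * orderedQty, 0);
}

const toyCar: ProductMapping = {
  shopifyProductId: 'gid://shopify/Product/1',
  parts: [
    { partName: 'chassis', partNumber: 1, simplyPrintFileId: 'f1', quantityPerProduct: 1 },
    { partName: 'wheel', partNumber: 2, simplyPrintFileId: 'f2', quantityPerProduct: 4 },
  ],
};
// Ordering 2 cars: 2 × (1 chassis + 4 wheels) = 10 printed parts.
```

This is the "more complex order processing logic" the consequence refers to: one line item fans out into many print jobs.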

ADR-013: Shared Domain Library

Attribute Value
ID ADR-013
Status Accepted
Date 2026-01-09
Context Need to share types between frontend, backend, and external integrations

Decision

Create a shared @forma3d/domain library containing:

  • Entity types
  • Enums
  • Shopify types
  • Common interfaces

Structure

libs/domain/src/
├── entities/
│   ├── order.ts
│   ├── line-item.ts
│   ├── print-job.ts
│   └── product-mapping.ts
├── enums/
│   ├── order-status.ts
│   ├── line-item-status.ts
│   └── print-job-status.ts
├── shopify/
│   ├── shopify-order.entity.ts
│   └── shopify-product.entity.ts
└── index.ts

Rationale

  • Single source of truth: Types defined once, used everywhere
  • Type safety: Frontend and backend share exact same types
  • Nx integration: Clean imports via path aliases

Consequences

  • ✅ Consistent types across codebase
  • ✅ No type drift between frontend/backend
  • ✅ Easy to update types in one place
  • ⚠️ Must rebuild library on changes
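
A shared enum illustrates the pattern (the values and helper below are hypothetical, not the library's actual contents; both apps would import via the @forma3d/domain path alias):

```typescript
// Illustrative shape of libs/domain/src/enums/order-status.ts —
// values are invented for this example.
enum OrderStatus {
  RECEIVED = 'RECEIVED',
  PRINTING = 'PRINTING',
  READY_FOR_FULFILLMENT = 'READY_FOR_FULFILLMENT',
  FULFILLED = 'FULFILLED',
}

// A helper defined once and reused by API and Web alike, e.g.
// import { OrderStatus, isTerminal } from '@forma3d/domain';
function isTerminal(status: OrderStatus): boolean {
  return status === OrderStatus.FULFILLED;
}
```

Because both sides compile against the same declaration, a renamed enum member breaks the build everywhere at once instead of drifting silently.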

ADR-014: SimplyPrint as Unified Print Farm Controller

Attribute Value
ID ADR-014
Status ✅ Implemented (Phase 2)
Date 2026-01-10 (Updated: 2026-01-13)
Context Need to control multiple 3D printer brands (Prusa, Bambu Lab) from one API

Decision

Use SimplyPrint as the unified print farm management solution with an edge device connecting to all printers via LAN.

Architecture

(UML diagram omitted)

Rationale

  • Unified API: Single integration point for all printer brands
  • LAN mode: Direct communication with printers, no cloud dependency for print control
  • Edge device: Handles printer communication, buffering, and monitoring
  • Multi-brand support: Prusa and Bambu Lab printers managed together
  • No Bambu Cloud dependency: Avoids Bambu Lab Cloud API limitations

Printer Support

Brand Models Connection
Prusa MK3S+, XL, Mini LAN via SimplyPrint edge device
Bambu Lab X1 Carbon, P1S LAN via SimplyPrint edge device

Implementation Details (Phase 2)

API Client (apps/api/src/simplyprint/simplyprint-api.client.ts):

  • HTTP Basic Authentication with Company ID and API Key
  • Typed methods for files, jobs, printers, and queue operations
  • Automatic connection verification on startup
  • Sentry integration for 5xx error tracking

Webhook Controller (apps/api/src/simplyprint/simplyprint-webhook.controller.ts):

  • Endpoint: POST /webhooks/simplyprint
  • X-SP-Token verification via guard
  • Event-driven status updates

Print Jobs Service (apps/api/src/print-jobs/print-jobs.service.ts):

  • Creates print jobs in SimplyPrint when orders arrive
  • Updates local status based on SimplyPrint events
  • Supports cancel and retry operations

API Endpoints Used:

Endpoint Method Purpose
/{companyId}/files/GetFiles GET List available print files
/{companyId}/printers/Get GET Get printer statuses
/{companyId}/printers/actions/CreateJob POST Create new print job
/{companyId}/printers/actions/Cancel POST Cancel active job
/{companyId}/queue/GetItems GET Get queue items
/{companyId}/queue/AddItem POST Add item to queue
/{companyId}/queue/RemoveItem POST Remove from queue
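
Request construction for the endpoints above might look like the following sketch. The base URL and the companyId:apiKey Basic-auth layout are assumptions for illustration; the authoritative scheme lives in simplyprint-api.client.ts and the SimplyPrint API docs:

```typescript
// Hypothetical request builder for the endpoint table above.
// ASSUMPTIONS: base URL and user:password layout are illustrative only.
function buildRequest(
  companyId: number,
  apiKey: string,
  endpoint: string,
): { url: string; headers: Record<string, string> } {
  // Doc states HTTP Basic auth with Company ID and API Key;
  // encoding them as "companyId:apiKey" here is an assumption.
  const credentials = Buffer.from(`${companyId}:${apiKey}`).toString('base64');
  return {
    url: `https://api.simplyprint.io/${companyId}${endpoint}`,
    headers: {
      Authorization: `Basic ${credentials}`,
      'Content-Type': 'application/json',
    },
  };
}
```

A typed client would wrap this in one method per endpoint (GetFiles, CreateJob, etc.), which is what the API client described above does.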

Consequences

  • ✅ Single API for all printers
  • ✅ No dependency on Bambu Lab Cloud
  • ✅ Local network resilience
  • ✅ Real-time printer status via edge device
  • ✅ Typed API client with full error handling
  • ✅ Webhook and polling support for status updates
  • ⚠️ Requires edge device on print farm network
  • ⚠️ SimplyPrint subscription required

ADR-015: Aikido Security Platform for Continuous Security Monitoring

Attribute Value
ID ADR-015
Status Superseded by ADR-067 (Grype CVE Scanning)
Date 2026-01-10
Context Need continuous security monitoring, vulnerability scanning, and SBOM generation

Decision

Use Aikido Security Platform as the centralized security monitoring and compliance solution integrated into the CI/CD pipeline.

Security Checks Implemented

Check Status Description
Open Source Dependency Monitoring Active Monitors 3rd party dependencies for vulnerabilities
Exposed Secrets Monitoring Compliant Detects accidentally exposed secrets in source code
License Management Compliant Validates dependency licenses for legal compliance
SAST Compliant Static Application Security Testing
IaC Testing Compliant Infrastructure as Code security analysis
Malware Detection Compliant Detects malware in dependencies
Mobile Issues Compliant Mobile manifest file monitoring
SBOM Generation Active Software Bill of Materials for supply chain security

Rationale

  • Comprehensive coverage: Single platform covers multiple security domains
  • CI/CD integration: Automated scanning on every code change
  • SBOM generation: Critical for supply chain security and compliance
  • License compliance: Automated license validation prevents legal issues
  • Developer-friendly: Clear dashboards and actionable remediation guidance
  • Proactive detection: Continuous monitoring catches issues before production

Future Enhancements

  • Code Quality Analysis: Will be enabled in a subsequent phase to complement security scanning

Consequences

  • ✅ Continuous security visibility across the codebase
  • ✅ Automated vulnerability detection in dependencies
  • ✅ SBOM generation for supply chain transparency
  • ✅ License compliance validation
  • ✅ Secrets exposure prevention
  • ⚠️ Requires Aikido platform subscription
  • ⚠️ May flag false positives requiring triage

Alternatives Considered

Alternative Reason for Rejection
Snyk More expensive, less comprehensive for our needs
GitHub Advanced Security Limited to GitHub, not as comprehensive
Manual audits Not scalable, too slow for continuous delivery
Dependabot only Only covers dependency vulnerabilities, not comprehensive

ADR-016: Sentry Observability with OpenTelemetry

Attribute Value
ID ADR-016
Status ✅ Implemented (Updated by ADR-058: structured logging moved to ClickHouse)
Date 2026-01-10
Context Need comprehensive observability: error tracking, performance monitoring, distributed tracing

Decision

Use Sentry as the observability platform with an OpenTelemetry-first architecture for vendor neutrality.

Architecture

(UML diagram omitted)

Implementation Details

Backend (NestJS):

  • @sentry/nestjs for error tracking and performance
  • @sentry/profiling-node for profiling
  • nestjs-pino for structured JSON logging
  • OpenTelemetry auto-instrumentation for Prisma queries
  • Global exception filter with Sentry capture
  • Logging interceptor with correlation IDs

Frontend (React):

  • @sentry/react for error tracking
  • Custom ErrorBoundary component with Sentry integration
  • Browser tracing for page navigation
  • User-friendly error fallback UI

Sampling Configuration (Free Tier Compatible):

Environment Traces Profiles Errors
Development 100% 100% 100%
Production 10% 10% 100%
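
The table maps onto the SDK's tracesSampleRate/profilesSampleRate options roughly as follows (a sketch; the helper function is ours, not the shared sentry.config.ts — errors are always captured, so only traces and profiles are sampled down):

```typescript
interface SamplingConfig {
  tracesSampleRate: number;
  profilesSampleRate: number;
}

// Derive sampling rates from the environment per the table above.
function samplingFor(env: string): SamplingConfig {
  const isProd = env === 'production';
  return {
    tracesSampleRate: isProd ? 0.1 : 1.0,
    profilesSampleRate: isProd ? 0.1 : 1.0,
  };
}

// These values would be spread into Sentry.init({ dsn, ...samplingFor(env) }).
```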

Rationale

  • Sentry: Industry-leading error tracking with excellent stack trace support
  • OpenTelemetry: Vendor-neutral instrumentation standard, future-proof
  • Structured Logging: JSON logs enable log aggregation and searching
  • Correlation IDs: End-to-end request tracing across frontend and backend
  • Free Tier: Sufficient for small-scale production (10K errors/month)

Data Privacy

Sensitive data is automatically scrubbed:

  • Authorization headers
  • Cookies
  • API tokens
  • Passwords
  • Shopify access tokens

Implementation Details (Phase 1b)

Backend (apps/api):

  • instrument.ts - Sentry initialization with profiling (imported first in main.ts)
  • ObservabilityModule - Global module with Pino logger and Sentry integration
  • SentryExceptionFilter - Captures all exceptions with request context
  • LoggingInterceptor - Request/response logging with correlation IDs
  • ObservabilityController - Test endpoints for verifying observability (non-prod only)
  • Prisma service enhanced with Sentry breadcrumbs for query tracing

Frontend (apps/web):

  • sentry.ts - Sentry initialization with browser tracing and session replay
  • ErrorBoundary.tsx - React error boundary with Sentry integration

Shared Library (libs/observability):

  • sentry.config.ts - Shared Sentry configuration with 100% sampling
  • otel.config.ts - OpenTelemetry configuration
  • constants.ts - Trace/request ID header constants

Sampling Decision:

  • 100% sampling for all environments (traces and profiles)
  • Rationale: Full visibility needed during early development
  • Can be reduced when traffic increases and limits are reached

Consequences

  • ✅ Comprehensive error visibility with stack traces and context
  • ✅ Performance monitoring for API endpoints and database queries
  • ✅ Distributed tracing across frontend and backend
  • ✅ Structured logs with correlation IDs for debugging
  • ✅ Vendor-neutral instrumentation via OpenTelemetry
  • ✅ Test endpoints for verifying observability in development
  • ⚠️ Requires Sentry account (free tier available)
  • ⚠️ Must initialize Sentry before other imports in main.ts
  • ⚠️ 100% sampling may hit free tier limits with high traffic

Alternatives Considered

Alternative Reason for Rejection
Datadog Expensive for small-scale, overkill for current needs
New Relic Expensive, complex pricing model
Grafana + Loki Requires self-hosting, more operational overhead
ELK Stack Complex to set up and maintain, expensive at scale
Console.log only No centralized visibility, hard to debug production issues

ADR-017: Docker + Traefik Deployment Strategy

Attribute Value
ID ADR-017
Status ⏳ In Progress
Date 2026-01-10
Context Need a deployment strategy for staging/production on DigitalOcean with TLS and zero-downtime

Decision

Use Docker Compose with Traefik reverse proxy for deploying to DigitalOcean Droplets.

Architecture

(UML diagram omitted)

Deployment Components

Component Technology Purpose
Reverse Proxy Traefik v3 TLS termination, routing, load balancing
TLS Certificates Let's Encrypt Automatic certificate issuance/renewal
Container Orchestration Docker Compose Service definition and networking
Image Registry DigitalOcean Registry Private Docker image storage
Database DO Managed PostgreSQL Persistent data storage with TLS

Traefik Configuration

Feature Implementation
Entry Points HTTP (:80) with redirect to HTTPS (:443)
Certificate Resolver Let's Encrypt with HTTP challenge
Service Discovery Docker labels on containers
Health Checks HTTP health endpoints (/health, /health/live, /health/ready)
Logging JSON format for log aggregation

Staging URLs

Service URL
API https://staging-connect-api.forma3d.be
Web https://staging-connect.forma3d.be

Pipeline Integration

Stage Trigger Action
Package develop branch Build Docker images, push to DO Registry
Deploy Staging develop branch SSH + docker compose up
Deploy Production main branch Manual approval + SSH deploy

Image Tagging Strategy

Tag Format Example Purpose
Pipeline Instance 20260110143709 Immutable deployment reference
Latest latest Convenience for development

Database Migration Strategy

Prisma migrations run before container deployment:

# Executed in pipeline before docker compose up
docker compose run --rm api npx prisma migrate deploy

Rationale

  • Traefik: Automatic TLS, Docker-native, label-based configuration
  • Docker Compose: Simple, declarative, easy to understand
  • SSH deployment: Direct control, no additional orchestration overhead
  • Managed PostgreSQL: Reliability, automated backups, TLS built-in
  • Let's Encrypt: Free, automated TLS certificates

Zero-Downtime Deployment

# Pull new images
docker compose pull

# Run migrations (idempotent)
docker compose run --rm api npx prisma migrate deploy

# Start new containers (Compose handles replacement)
docker compose up -d --remove-orphans

# Clean up old images
docker image prune -f

Consequences

  • ✅ Automatic TLS certificate management
  • ✅ Simple deployment via SSH + Docker Compose
  • ✅ Zero-downtime container replacement
  • ✅ Docker labels for routing configuration
  • ✅ Consistent image tagging with pipeline ID
  • ⚠️ Single droplet = single point of failure (acceptable for staging)
  • ⚠️ Requires manual SSH key management in Azure DevOps

Alternatives Considered

Alternative Reason for Rejection
Kubernetes Overkill for current scale, operational complexity
Docker Swarm Less ecosystem support, not needed for single-node
Nginx Manual certificate management, less dynamic
Caddy Less mature Docker integration than Traefik
DigitalOcean App Platform Less control, higher cost

ADR-018: Nx Affected Conditional Deployment Strategy

Attribute Value
ID ADR-018
Status ✅ Implemented
Date 2026-01-11
Context Need to avoid unnecessary Docker builds and deployments when only part of the codebase changes

Decision

Use Nx affected to detect which applications have changed and conditionally run package/deploy stages only for affected apps.

Architecture

(UML diagram of the affected-detection pipeline flow — not reproduced in this export.)

Pipeline Parameters

Parameter Type Default Purpose
ForceFullVersioningAndDeployment boolean true Bypass affected detection, deploy all apps
breakingMigration boolean false Stop API before migrations

How Affected Detection Works

The pipeline runs pnpm nx show projects --affected --type=app to identify which applications have changed compared to the base branch (origin/main).

Scenarios:

Change Location API Affected Web Affected Reason
apps/api/** ✅ ❌ Only API code changed
apps/web/** ❌ ✅ Only Web code changed
libs/domain/** ✅ ✅ Shared library affects both apps
libs/api-client/** ❌ ✅ API client only used by Web
prisma/** ✅ ❌ Database schema affects API
docs/**, *.md ❌ ❌ Docs are published as a separate static site (Zensical)
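
The scenarios above can be expressed as a small decision helper. The sketch below assumes the Nx project names are `api` and `web` and simply parses the command's newline-separated output; the actual pipeline script may differ.

```typescript
// Sketch: deciding which apps to deploy from `pnpm nx show projects --affected --type=app` output.
// Project names ("api", "web") and the parsing are illustrative, not the real pipeline script.

function parseAffectedApps(nxOutput: string): { api: boolean; web: boolean } {
  const affected = new Set(
    nxOutput
      .split('\n')
      .map((line) => line.trim())
      .filter((line) => line.length > 0),
  );
  return { api: affected.has('api'), web: affected.has('web') };
}

// e.g. a change under libs/domain/** affects both apps:
const flags = parseAffectedApps('api\nweb\n');
console.log(flags); // { api: true, web: true }
```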

Migration Safety

The deployment follows a specific order to ensure database safety:

  1. Pull new images (uses latest code with new Prisma schema)
  2. Stop API (only if breakingMigration=true)
  3. Run migrations (using new image via docker compose run --rm)
  4. Start API (after migrations complete)

Migration Types:

Migration Type Safe During Old API? Recommended Action
Add nullable column ✅ Safe Normal deployment
Add column with default ✅ Safe Normal deployment
Add new table ✅ Safe Normal deployment
Drop column ❌ Dangerous Use breakingMigration=true
Rename column ❌ Dangerous Use breakingMigration=true
Add non-nullable column ❌ Dangerous Use breakingMigration=true

Rationale

  • Efficiency: Avoid building/pushing Docker images when code hasn't changed
  • Cost reduction: Fewer container registry pushes, less storage used
  • Faster deployments: Only affected services are restarted
  • Cleaner versioning: New version tags only when actual code changes
  • Nx integration: Leverages existing monorepo tooling for dependency detection

Consequences

  • ✅ Significantly faster CI/CD for partial changes
  • ✅ Reduced container registry costs
  • ✅ Cleaner deployment history (versions reflect actual changes)
  • ✅ Safe migration order (migrations before restart)
  • ✅ Support for breaking migrations with explicit parameter
  • ✅ Override available via ForceFullVersioningAndDeployment parameter
  • ⚠️ First pipeline run on new branch may show all apps affected
  • ⚠️ Shared library changes trigger both app deployments (by design)
  • ⚠️ breakingMigration requires manual assessment of migration type

Alternatives Considered

Alternative Reason for Rejection
Always build both apps Wasteful, slow, unnecessary version proliferation
Manual selection of apps Error-prone, requires human decision each time
Git diff on Dockerfiles only Misses shared library changes
Separate pipelines per app Loses monorepo benefits, harder to maintain

ADR-019: SimplyPrint Webhook Verification

Attribute Value
ID ADR-019
Status ✅ Implemented
Date 2026-01-13
Context Need to verify webhook requests are genuinely from SimplyPrint

Decision

Implement X-SP-Token header verification with timing-safe comparison for all SimplyPrint webhooks.

Implementation

// SimplyPrintWebhookGuard
import {
  CanActivate,
  ExecutionContext,
  Injectable,
  Logger,
  UnauthorizedException,
} from '@nestjs/common';
import * as crypto from 'node:crypto';

@Injectable()
export class SimplyPrintWebhookGuard implements CanActivate {
  private readonly logger = new Logger(SimplyPrintWebhookGuard.name);
  private readonly webhookSecret?: string; // loaded from SIMPLYPRINT_WEBHOOK_SECRET (wiring elided)

  canActivate(context: ExecutionContext): boolean {
    const request = context.switchToHttp().getRequest();
    const token = request.headers['x-sp-token'];

    if (!this.webhookSecret) {
      this.logger.warn('SimplyPrint webhook secret not configured, skipping verification');
      return true;
    }

    if (!token) {
      throw new UnauthorizedException('Missing X-SP-Token header');
    }

    // Timing-safe comparison to prevent timing attacks.
    // timingSafeEqual throws if buffer lengths differ, so check lengths first.
    const tokenBuffer = Buffer.from(token);
    const secretBuffer = Buffer.from(this.webhookSecret);

    if (tokenBuffer.length !== secretBuffer.length) {
      throw new UnauthorizedException('Invalid SimplyPrint webhook signature');
    }

    if (!crypto.timingSafeEqual(tokenBuffer, secretBuffer)) {
      throw new UnauthorizedException('Invalid SimplyPrint webhook signature');
    }

    return true;
  }
}

Rationale

  • Security: Prevents forged webhook requests
  • SimplyPrint standard: Uses the X-SP-Token header as per SimplyPrint documentation
  • Timing-safe comparison: Prevents timing attacks on secret comparison
  • Graceful degradation: Allows bypassing verification in development when secret not configured

Webhook Endpoint

Endpoint Method Purpose
/webhooks/simplyprint POST Receive SimplyPrint events

Supported Events

Event Action
job.started Update job status to PRINTING
job.done Update job status to COMPLETED
job.failed Update job status to FAILED
job.cancelled Update job status to CANCELLED
job.paused Keep as PRINTING (temporary state)
job.resumed Keep as PRINTING
printer.* Ignored (no job status change)
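
The event table above translates directly into a mapping function. This is a standalone sketch mirroring the mapWebhookEventToStatus helper referenced later in this document; the enum is a local stand-in for the real PrintJobStatus.

```typescript
// Sketch of the event-to-status mapping; PrintJobStatus here is illustrative.
enum PrintJobStatus {
  PRINTING = 'PRINTING',
  COMPLETED = 'COMPLETED',
  FAILED = 'FAILED',
  CANCELLED = 'CANCELLED',
}

function mapWebhookEventToStatus(event: string): PrintJobStatus | null {
  switch (event) {
    case 'job.started':
      return PrintJobStatus.PRINTING;
    case 'job.done':
      return PrintJobStatus.COMPLETED;
    case 'job.failed':
      return PrintJobStatus.FAILED;
    case 'job.cancelled':
      return PrintJobStatus.CANCELLED;
    case 'job.paused':
    case 'job.resumed':
      return PrintJobStatus.PRINTING; // temporary states keep the job as PRINTING
    default:
      return null; // printer.* and unknown events cause no job status change
  }
}
```

Returning PRINTING for pause/resume is harmless because status updates are idempotent (see the deduplication logic in ADR-020).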

Consequences

  • ✅ Secure webhook endpoint
  • ✅ Protection against timing attacks
  • ✅ Clear event-to-status mapping
  • ✅ Development-friendly (optional verification)
  • ⚠️ Requires SIMPLYPRINT_WEBHOOK_SECRET environment variable

ADR-020: Hybrid Status Monitoring (Polling + Webhooks)

Attribute Value
ID ADR-020
Status ✅ Implemented
Date 2026-01-13
Context Need reliable print job status updates even if webhooks fail or are delayed

Decision

Implement a hybrid approach using both SimplyPrint webhooks (primary) and periodic polling (fallback) for job status monitoring.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                     Status Update Sources                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   SimplyPrint Cloud                                              │
│         │                                                        │
│         ├─── Webhooks (Primary, Real-time) ───┐                  │
│         │    • Immediate notification         │                  │
│         │    • Event: job.started/done/failed │                  │
│         │                                     ▼                  │
│         │                            SimplyPrintService          │
│         │                                     │                  │
│         └─── Polling (Fallback, 30s) ────────►│                  │
│              • @Cron every 30 seconds         │                  │
│              • Checks queue and printers      │                  │
│              • Catches missed webhooks        ▼                  │
│                                    simplyprint.job-status-changed│
│                                               │                  │
│                                               ▼                  │
│                                       PrintJobsService           │
│                                               │                  │
│                                               ▼                  │
│                                        Database Update           │
└─────────────────────────────────────────────────────────────────┘

Implementation

Webhook Handler (Primary):

async handleWebhook(payload: SimplyPrintWebhookPayload): Promise<void> {
  const jobData = payload.data.job;
  if (!jobData) return;

  const newStatus = this.mapWebhookEventToStatus(payload.event);
  if (!newStatus) return;

  this.eventEmitter.emit(SIMPLYPRINT_EVENTS.JOB_STATUS_CHANGED, {
    simplyPrintJobId: jobData.uid,
    newStatus,
    printerId: payload.data.printer?.id,
    printerName: payload.data.printer?.name,
    timestamp: new Date(payload.timestamp * 1000),
  });
}

Polling Fallback:

@Cron(CronExpression.EVERY_30_SECONDS)
async pollJobStatuses(): Promise<void> {
  if (!this.pollingEnabled || this.isPolling) return;

  this.isPolling = true;
  try {
    const printers = await this.simplyPrintClient.getPrinters();

    for (const printer of printers) {
      if (printer.currentJobId && printer.status === 'printing') {
        this.eventEmitter.emit(SIMPLYPRINT_EVENTS.JOB_STATUS_CHANGED, {
          simplyPrintJobId: printer.currentJobId,
          newStatus: PrintJobStatus.PRINTING,
          printerId: printer.id,
          printerName: printer.name,
          timestamp: new Date(),
        });
      }
    }
  } finally {
    this.isPolling = false;
  }
}

Configuration

Environment Variable Default Description
SIMPLYPRINT_POLLING_ENABLED true Enable/disable polling fallback
SIMPLYPRINT_POLLING_INTERVAL_MS 30000 Polling interval in milliseconds
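
A minimal sketch of how these settings might be read with the documented defaults; the helper name and env-parsing details are illustrative, not the service's actual code.

```typescript
// Sketch: reading the polling configuration with the defaults from the table above.
function readPollingConfig(env: Record<string, string | undefined>) {
  return {
    // Any value other than the literal string "false" keeps polling enabled.
    enabled: env.SIMPLYPRINT_POLLING_ENABLED !== 'false',
    intervalMs: Number(env.SIMPLYPRINT_POLLING_INTERVAL_MS ?? 30_000),
  };
}

const cfg = readPollingConfig({});
console.log(cfg); // { enabled: true, intervalMs: 30000 }
```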

Rationale

  • Reliability: Webhooks can fail due to network issues, SimplyPrint outages, or configuration problems
  • Real-time updates: Webhooks provide immediate notification when status changes
  • Consistency: Polling catches any status changes that webhooks might miss
  • Idempotency: Status updates check current status before updating, preventing duplicate updates
  • Configurable: Polling can be disabled in environments where webhooks are reliable

Status Deduplication

The system handles duplicate status updates gracefully:

async updateJobStatus(simplyPrintJobId: string, newStatus: PrintJobStatus): Promise<PrintJob> {
  const printJob = await this.findBySimplyPrintJobId(simplyPrintJobId);

  // Skip if status unchanged (idempotent)
  if (printJob.status === newStatus) {
    return printJob;
  }

  // Update and emit events
  // ...
}

Consequences

  • ✅ High reliability for status updates
  • ✅ Real-time updates via webhooks
  • ✅ Catches missed webhooks via polling
  • ✅ Configurable polling interval
  • ✅ Idempotent status updates
  • ⚠️ Polling adds API calls every 30 seconds (minimal overhead)
  • ⚠️ Potential for slight delay if only relying on polling

Alternatives Considered

Alternative Reason for Rejection
Webhooks only Single point of failure, missed events cause stale status
Polling only Higher latency, unnecessary API calls when webhooks work
WebSocket connection SimplyPrint doesn't offer a WebSocket API
Manual refresh button Poor UX, requires operator intervention

ADR-021: Retry Queue with Exponential Backoff

Attribute Value
ID ADR-021
Status ✅ Implemented
Date 2026-01-14
Context Need to handle transient failures in external API calls (Shopify, SimplyPrint) gracefully

Decision

Implement a database-backed retry queue with exponential backoff and jitter for all retryable operations.

Configuration

Setting Value Description
Max Retries 5 Maximum retry attempts
Initial Delay 1 second First retry delay
Max Delay 1 hour Maximum retry delay
Backoff Multiplier 2 Exponential growth factor
Jitter ±10% Randomization to prevent thundering herd
Cleanup 7 days Old completed jobs deleted

Implementation

// attempt is 1-based; settings from the table above:
// initialDelayMs = 1000, backoffMultiplier = 2, maxDelayMs = 3_600_000
calculateDelay(attempt: number): number {
  let delay = this.initialDelayMs * Math.pow(this.backoffMultiplier, attempt - 1);
  delay = Math.min(delay, this.maxDelayMs);
  const jitter = delay * 0.1 * (Math.random() * 2 - 1); // ±10% jitter
  return Math.round(delay + jitter);
}
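
With the configured settings, successive retries land near 1 s, 2 s, 4 s, 8 s, 16 s (each ±10%), capped at one hour. A standalone sketch of that schedule:

```typescript
// Standalone sketch of the retry schedule (initial 1 s, multiplier 2, 1 h cap, ±10% jitter).
function retryDelayMs(attempt: number): number {
  const initialDelayMs = 1_000;
  const backoffMultiplier = 2;
  const maxDelayMs = 3_600_000; // 1 hour cap
  let delay = initialDelayMs * Math.pow(backoffMultiplier, attempt - 1);
  delay = Math.min(delay, maxDelayMs);
  const jitter = delay * 0.1 * (Math.random() * 2 - 1); // ±10%
  return Math.round(delay + jitter);
}

for (let attempt = 1; attempt <= 5; attempt++) {
  console.log(`attempt ${attempt}: ~${retryDelayMs(attempt)} ms`);
}
```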

Supported Job Types

Job Type Description
FULFILLMENT Shopify fulfillment creation
PRINT_JOB_CREATION SimplyPrint job creation
CANCELLATION Job cancellation operations
NOTIFICATION Email notification sending

Consequences

  • ✅ Automatic recovery from transient failures
  • ✅ Prevents thundering herd with jitter
  • ✅ Persistent queue survives application restarts
  • ✅ Failed jobs trigger operator alerts
  • ⚠️ Adds database table for queue persistence

ADR-022: Event-Driven Fulfillment Architecture

Attribute Value
ID ADR-022
Status ✅ Implemented
Date 2026-01-14
Context Need to automatically create Shopify fulfillments when all print jobs complete

Decision

Use NestJS Event Emitter to trigger fulfillment creation when the orchestration service determines all print jobs for an order are complete.

Event Flow

PrintJob.COMPLETED → OrchestrationService checks all jobs
                   → If all complete: emit order.ready-for-fulfillment
                   → FulfillmentService listens and creates Shopify fulfillment

Key Events

Event Producer Consumer
order.ready-for-fulfillment OrchestrationService FulfillmentService
fulfillment.created FulfillmentService NotificationsService
fulfillment.failed FulfillmentService NotificationsService
order.cancelled OrdersService CancellationService

Consequences

  • ✅ Loose coupling between order management and fulfillment
  • ✅ Easy to add additional listeners (logging, analytics)
  • ✅ Failure in fulfillment doesn't block order completion
  • ⚠️ Event ordering not guaranteed (acceptable for this use case)

ADR-023: Email Notification Strategy

Attribute Value
ID ADR-023
Status ✅ Implemented
Date 2026-01-14
Context Need to alert operators when automated processes fail and require attention

Decision

Implement email notifications via SMTP using Nodemailer with Handlebars templates for operator alerts.

Notification Triggers

Trigger Severity Description
Fulfillment failed (final) ERROR Fulfillment failed after max retries
Print job failed (final) ERROR Print job failed after max retries
Cancellation needs review WARNING Order cancelled with in-progress prints
Retry exhausted ERROR Any retry job exceeded max attempts

Configuration

SMTP_HOST=smtp.example.com
SMTP_PORT=587
SMTP_USER=notifications@forma3d.be
SMTP_PASS=***
SMTP_FROM=noreply@forma3d.be
OPERATOR_EMAIL=operator@forma3d.be
NOTIFICATIONS_ENABLED=true
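
The "graceful degradation" behavior can be sketched as a send-or-skip guard: notifications are skipped, never thrown, when disabled or when SMTP is not configured. shouldSend and notifyOperator are illustrative names, and the Nodemailer call is elided.

```typescript
// Sketch: skip sending when notifications are disabled or SMTP is unconfigured.
type Alert = { severity: 'ERROR' | 'WARNING'; subject: string; body: string };

function shouldSend(env: Record<string, string | undefined>): boolean {
  return env.NOTIFICATIONS_ENABLED !== 'false' && Boolean(env.SMTP_HOST);
}

async function notifyOperator(
  alert: Alert,
  env: Record<string, string | undefined> = process.env,
): Promise<boolean> {
  if (!shouldSend(env)) return false; // degrade gracefully, never throw
  // await transporter.sendMail({ to: env.OPERATOR_EMAIL, subject: alert.subject, ... });
  return true;
}
```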

Consequences

  • ✅ Operators notified of issues requiring attention
  • ✅ Email templates are maintainable and customizable
  • ✅ Graceful degradation if email unavailable
  • ✅ Can be disabled in development
  • ⚠️ Requires SMTP configuration for each environment

ADR-024: API Key Authentication for Admin Endpoints

Attribute Value
ID ADR-024
Status ✅ Implemented
Date 2026-01-14
Context Admin endpoints (fulfillment, cancellation) need protection from unauthorized access

Decision

Implement API key authentication using a custom NestJS guard for all admin endpoints that modify order state.

Implementation

// ApiKeyGuard
import {
  CanActivate,
  ExecutionContext,
  Injectable,
  UnauthorizedException,
} from '@nestjs/common';
import * as crypto from 'node:crypto';

@Injectable()
export class ApiKeyGuard implements CanActivate {
  private readonly isEnabled: boolean; // false when INTERNAL_API_KEY is unset (development mode)
  private readonly apiKey: string; // loaded from INTERNAL_API_KEY (wiring elided)

  canActivate(context: ExecutionContext): boolean {
    if (!this.isEnabled) return true; // Development mode

    const request = context.switchToHttp().getRequest();
    const providedKey = request.headers['x-api-key'];

    if (!providedKey) {
      throw new UnauthorizedException('API key required');
    }

    // Timing-safe comparison to prevent timing attacks.
    // timingSafeEqual throws if buffer lengths differ, so compare lengths first.
    const providedBuffer = Buffer.from(providedKey);
    const expectedBuffer = Buffer.from(this.apiKey);

    if (
      providedBuffer.length !== expectedBuffer.length ||
      !crypto.timingSafeEqual(providedBuffer, expectedBuffer)
    ) {
      throw new UnauthorizedException('Invalid API key');
    }

    return true;
  }
}

Protected Endpoints

Endpoint Method Purpose
/api/v1/fulfillments/order/:orderId POST Create fulfillment
/api/v1/fulfillments/order/:orderId/force POST Force fulfill order
/api/v1/fulfillments/order/:orderId/status GET Get fulfillment status
/api/v1/cancellations/order/:orderId POST Cancel order
/api/v1/cancellations/print-job/:jobId POST Cancel single print job

Authentication Methods Summary

Endpoint Type Method Header Verification
Shopify Webhooks HMAC-SHA256 Signature X-Shopify-Hmac-Sha256 Timing-safe comparison
SimplyPrint Webhooks Token Verification X-SP-Token Timing-safe comparison
Admin Endpoints API Key X-API-Key Timing-safe comparison
Public Endpoints None - -

Configuration

# Generate secure API key
openssl rand -hex 32

# Environment variable
INTERNAL_API_KEY="your-secure-api-key"

Security Considerations

  1. Timing-safe comparison: Prevents timing attacks on key validation
  2. Generic error messages: Returns "API key required" or "Invalid API key" to prevent information leakage
  3. Audit logging: Access attempts are logged for security monitoring
  4. Development mode: If INTERNAL_API_KEY not set, endpoints are accessible (development only)

Rationale

  • IDOR Prevention: Addresses Insecure Direct Object Reference (IDOR) vulnerabilities flagged by security scanners
  • Defense in Depth: Additional layer of protection for sensitive operations
  • Simple Implementation: API keys are stateless and easy to rotate
  • Swagger Integration: API key documented in OpenAPI spec for easy testing

Consequences

  • ✅ Protection against unauthorized access to admin functions
  • ✅ IDOR vulnerability mitigated
  • ✅ Timing-safe implementation prevents timing attacks
  • ✅ Development-friendly (optional in dev mode)
  • ✅ Documented in Swagger UI
  • ⚠️ Requires secure key management in production
  • ⚠️ Key must be rotated if compromised

Alternatives Considered

Alternative Reason for Rejection
OAuth 2.0 / JWT Overkill for internal B2B system with no user accounts
IP Whitelisting Too inflexible, requires network configuration
mTLS Complex certificate management for simple use case
No authentication Unacceptable security risk (IDOR vulnerability)

ADR-025: Cosign Image Signing for Supply Chain Security

Attribute Value
ID ADR-025
Status ✅ Implemented
Date 2026-01-14
Context Need to cryptographically sign container images and create attestations for promotion tracking

Decision

Implement key-based container image signing using cosign from the Sigstore project, with attestations to track image promotions through environments.

Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                        Azure DevOps Pipeline                             │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  Build & Package          Acceptance Test           Production           │
│  ┌─────────────┐          ┌─────────────┐          ┌─────────────┐      │
│  │ Build Docker│          │ Deploy to   │          │ Verify      │      │
│  │ Images      │          │ Staging     │          │ Staging     │      │
│  └──────┬──────┘          └──────┬──────┘          │ Attestation │      │
│         │                        │                 └──────┬──────┘      │
│         ▼                        ▼                        │             │
│  ┌─────────────┐          ┌─────────────┐                 ▼             │
│  │ Sign with   │          │ Run Tests   │          ┌─────────────┐      │
│  │ cosign.key  │          └──────┬──────┘          │ Deploy to   │      │
│  └──────┬──────┘                 │                 │ Production  │      │
│         │                        ▼                 └──────┬──────┘      │
│         │                 ┌─────────────┐                 │             │
│         │                 │ Create      │                 ▼             │
│         │                 │ Staging     │          ┌─────────────┐      │
│         │                 │ Attestation │          │ Create Prod │      │
│         │                 └─────────────┘          │ Attestation │      │
│         │                                          └─────────────┘      │
└─────────┼──────────────────────────────────────────────────────────────┘
          │
          ▼
┌─────────────────────────────────────┐     ┌─────────────────────┐
│   DigitalOcean Container Registry   │     │     Repository      │
│   ─────────────────────────────────│     │  ─────────────────  │
│   • Image:tag                       │     │  • cosign.pub       │
│   • Image.sig (signature)           │◄────│    (public key)     │
│   • Image.att (attestation)         │     │                     │
└─────────────────────────────────────┘     └─────────────────────┘

Implementation

Key-Based Signing (chosen approach):

# Azure DevOps Pipeline
- task: DownloadSecureFile@1
  name: cosignKey
  inputs:
    secureFile: 'cosign.key'

- script: |
    cosign sign \
      --key $(cosignKey.secureFilePath) \
      --annotations "build.number=$(imageTag)" \
      $(dockerRegistry)/$(imageName)@$(digest)
  env:
    COSIGN_PASSWORD: $(COSIGN_PASSWORD)

Attestation for Promotion Tracking:

{
  "_type": "https://forma3d.com/attestations/promotion/v1",
  "environment": "staging",
  "promotedAt": "2026-01-14T16:00:00+00:00",
  "build": {
    "number": "20260114160000",
    "pipeline": "forma-3d-connect",
    "commit": "abc123..."
  },
  "verification": {
    "healthCheckPassed": true,
    "acceptanceTestsPassed": true
  }
}

Key Management

File Location Purpose
cosign.key Azure DevOps Secure Files Sign images (private)
COSIGN_PASSWORD Azure DevOps Variable Group Decrypt private key
cosign.pub Repository root (/cosign.pub) Verify signatures (public)

Signing Workflow

Stage Action Artifact Created
Build & Package Sign image after push Image signature (.sig)
Staging Deploy Create staging attestation Staging attestation (.att)
Production Deploy Verify staging attestation, then sign Production attestation
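
The production stage's "verify staging attestation" step can be sketched as a gate over the attestation predicate shown earlier (after cosign has already verified its signature). The predicate shape matches the example JSON above; the function itself is illustrative.

```typescript
// Sketch: refuse production deployment unless the staging attestation's checks passed.
type PromotionPredicate = {
  environment: string;
  verification: { healthCheckPassed: boolean; acceptanceTestsPassed: boolean };
};

function canPromoteToProduction(predicate: PromotionPredicate): boolean {
  return (
    predicate.environment === 'staging' &&
    predicate.verification.healthCheckPassed &&
    predicate.verification.acceptanceTestsPassed
  );
}
```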

Rationale

  • Supply chain security: Cryptographic proof that images were built by the CI/CD pipeline
  • Promotion tracking: Attestations provide audit trail without modifying image tags
  • Tamper detection: Modifications to signed images are detectable
  • Key-based over keyless: Keyless (OIDC) signing requires workload identity federation, which adds complexity; key-based signing is simpler and fully functional in Azure DevOps

Why Key-Based Instead of Keyless

Sigstore's "keyless" signing uses OIDC tokens from identity providers (GitHub Actions, Google Cloud, etc.). While elegant, it has challenges in Azure DevOps:

Approach Pros Cons
Keyless (OIDC) No key management, identity-based Requires Azure Workload Identity Federation, falls back to device flow in CI (fails)
Key-Based Works immediately in any CI Requires secure key storage and rotation

We chose key-based because:

  1. Azure DevOps doesn't have native OIDC integration with Sigstore
  2. Device flow authentication cannot work in non-interactive CI
  3. Key-based signing is well-supported and reliable

Security Considerations

  1. Private key protection: Stored in Azure DevOps Secure Files (encrypted at rest)
  2. Password protection: Private key is encrypted, password in secret variable
  3. Timing-safe verification: Public key verification uses constant-time comparison
  4. Key rotation: Documented procedure for rotating keys periodically (see Cosign Setup Guide)

Pipeline Parameters

Parameter Type Default Description
enableSigning boolean true Enable/disable image signing and attestations

Verification Commands

# Verify image signature
cosign verify --key cosign.pub \
  registry.digitalocean.com/forma-3d/forma3d-connect-api:20260114160000

# View attestations attached to image
cosign tree registry.digitalocean.com/forma-3d/forma3d-connect-api:20260114160000

# Verify and decode attestation
cosign verify-attestation --key cosign.pub --type custom \
  registry.digitalocean.com/forma-3d/forma3d-connect-api@sha256:... \
  | jq '.payload | @base64d | fromjson | .predicate'

Local Tooling

A script is provided to view image promotion status:

# List all images with their promotion status
./scripts/list-image-promotions.sh

# Output shows signed status and promotion level
  TAG                PROMOTION    SIGNED   UPDATED
  20260114160000     STAGING              2026-01-14
  20260114120000     none                 2026-01-14

Consequences

  • ✅ Cryptographic proof of image provenance
  • ✅ Tamper detection for container images
  • ✅ Audit trail for environment promotions
  • ✅ Works reliably in Azure DevOps without OIDC setup
  • ✅ Can verify images locally with public key
  • ⚠️ Requires secure key management
  • ⚠️ Keys must be rotated periodically (recommended: 6-12 months)
  • ⚠️ Pipeline requires secure files and variables to be configured

Alternatives Considered

Alternative Reason for Rejection
No signing No supply chain security, no tamper detection
Keyless signing (OIDC) Falls back to device flow in Azure DevOps, requires manual auth
Docker Content Trust (DCT) Less flexible, no custom attestations, vendor lock-in
Image tags for promotion Tags can be overwritten, no cryptographic verification
External attestation store Additional infrastructure, attestations separate from images

ADR-026: CycloneDX SBOM Attestations

Attribute Value
ID ADR-026
Status ✅ Implemented
Date 2026-01-16
Context Need to generate and attach Software Bill of Materials (SBOM) to container images for supply chain transparency

Decision

Generate CycloneDX SBOMs using Syft and attach them as signed attestations using cosign.

Architecture

Each container image in the registry will have multiple attestations stored as separate OCI artifacts:

Container Image (e.g., forma3d-connect-api:20260116120000)
├── Image signature (.sig) ─────────────── cosign sign
├── SBOM attestation (.att) ────────────── cosign attest --type cyclonedx
├── Staging promotion attestation (.att) ─ cosign attest --type custom
└── Production promotion attestation (.att) cosign attest --type custom

Why CycloneDX over SPDX

Criteria CycloneDX SPDX
Primary Focus Security & DevSecOps License compliance
VEX Support Native Separate spec
Tool Ecosystem Excellent (Grype, Syft) Good
Format Complexity Simpler More complex
OWASP Alignment Yes (OWASP project) No

CycloneDX was chosen because:

  1. Better integration with vulnerability scanners (Grype, Trivy)
  2. Native support for VEX (Vulnerability Exploitability eXchange)
  3. Simpler format for debugging
  4. Aligns with OWASP security practices
  5. Growing adoption in DevSecOps pipelines

Implementation

Pipeline Step (after image signing):

- script: |
    set -e

    # Install Syft
    curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin

    # Generate CycloneDX SBOM
    syft $(dockerRegistry)/$(imageName)@$(digest) \
      --output cyclonedx-json=sbom.cdx.json

    # Attach as signed attestation
    cosign attest \
      --yes \
      --key $(cosignKey.secureFilePath) \
      --predicate sbom.cdx.json \
      --type cyclonedx \
      $(dockerRegistry)/$(imageName)@$(digest)
  displayName: 'Generate and Attach SBOM'
  env:
    COSIGN_PASSWORD: $(COSIGN_PASSWORD)

Attestation Types in Registry

After deployment, each image has multiple separate attestations:

Attestation Type Purpose Created By
Signature Proves image was built by CI/CD cosign sign
CycloneDX SBOM Lists all components/packages cosign attest --type cyclonedx
Staging Proves image passed staging cosign attest --type custom
Production Proves image deployed to prod cosign attest --type custom

Verification Commands

# View all attestations attached to an image
cosign tree registry.digitalocean.com/forma-3d/forma3d-connect-api:latest

# Verify and extract SBOM
cosign verify-attestation \
  --key cosign.pub \
  --type cyclonedx \
  registry.digitalocean.com/forma-3d/forma3d-connect-api@sha256:... \
  | jq -r '.payload' | base64 -d | jq '.predicate'

# Count components in SBOM
cosign verify-attestation --key cosign.pub --type cyclonedx \
  registry.digitalocean.com/forma-3d/forma3d-connect-api@sha256:... \
  | jq -r '.payload' | base64 -d | jq '.predicate.components | length'

Scanning for Vulnerabilities

With the SBOM attached, you can scan for vulnerabilities without pulling the full image:

# Extract SBOM and scan with Grype
cosign verify-attestation --key cosign.pub --type cyclonedx \
  registry.digitalocean.com/forma-3d/forma3d-connect-api@sha256:... \
  | jq -r '.payload' | base64 -d | jq '.predicate' > sbom.cdx.json

grype sbom:sbom.cdx.json

Rationale

  • Supply chain transparency: SBOM provides complete visibility into image contents
  • Vulnerability management: Enables scanning without pulling full images
  • Compliance: Meets requirements for software transparency (US Executive Order 14028)
  • Signed attestation: SBOM itself is cryptographically signed, preventing tampering
  • Tool-agnostic: CycloneDX is an open standard supported by many tools

Consequences

  • ✅ Complete visibility into image dependencies
  • ✅ Enables vulnerability scanning from SBOM
  • ✅ Signed attestation prevents SBOM tampering
  • ✅ Supports compliance requirements
  • ✅ Works with existing cosign infrastructure
  • ⚠️ Adds ~10-15 seconds to pipeline per image
  • ⚠️ SBOM attestation adds ~2KB manifest to registry

Alternatives Considered

Alternative Reason for Rejection
SPDX format More focused on licensing, less security tooling
Syft native format Not an industry standard, limited tool support
Docker Buildx --sbom Requires buildx, less control over format
No SBOM Missing supply chain transparency
SBOM in image labels Not cryptographically signed, can be tampered with

Tools Used

Tool License Purpose
Syft Apache 2.0 Generate CycloneDX SBOM
Cosign Apache 2.0 Sign and attach as attestation
Grype Apache 2.0 Vulnerability scanning (optional)

ADR-027: TanStack Query for Server State Management

Attribute Value
ID ADR-027
Status Accepted
Date 2026-01-14
Context Need to manage server state in the React dashboard with caching, refetching, and loading states

Decision

Use TanStack Query (v5.x, formerly React Query) for server state management in the dashboard.

Rationale

  • Automatic caching: Query results are cached and deduplicated automatically
  • Background refetching: Data stays fresh with configurable stale times and refetch intervals
  • Loading/error states: Built-in loading, error, and success states reduce boilerplate
  • Optimistic updates: Supports optimistic updates for better UX on mutations
  • DevTools: React Query DevTools for debugging cache state
  • TypeScript support: Excellent TypeScript integration with inferred types

Implementation

// Query client configuration (apps/web/src/lib/query-client.ts)
const queryClient = new QueryClient({
  defaultOptions: {
    queries: {
      staleTime: 30 * 1000, // 30 seconds
      gcTime: 5 * 60 * 1000, // 5 minutes cache
      retry: 1,
      refetchOnWindowFocus: false,
    },
  },
});

// Example hook (apps/web/src/hooks/use-orders.ts)
export function useOrders(query: OrdersQuery = {}) {
  return useQuery({
    queryKey: ['orders', query],
    queryFn: () => apiClient.orders.list(query),
  });
}
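
The "query keys" caveat below comes from TanStack Query's default fuzzy matching: invalidating `['orders']` also refetches `['orders', { status: 'OPEN' }]`, because invalidation matches by key prefix. This standalone helper only mimics that prefix match for illustration; it is not TanStack Query's actual implementation.

```typescript
// Sketch: prefix matching in the spirit of TanStack Query's invalidateQueries.
function keyMatchesPrefix(queryKey: unknown[], prefix: unknown[]): boolean {
  return prefix.every(
    (part, i) => JSON.stringify(queryKey[i]) === JSON.stringify(part),
  );
}

console.log(keyMatchesPrefix(['orders', { status: 'OPEN' }], ['orders'])); // true
console.log(keyMatchesPrefix(['printers'], ['orders'])); // false
```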

Consequences

  • ✅ Eliminates manual loading/error state management
  • ✅ Automatic cache invalidation on mutations
  • ✅ Integrates well with Socket.IO for real-time updates
  • ✅ Reduces API calls through intelligent caching
  • ⚠️ Requires understanding of query keys for proper cache invalidation

Alternatives Considered

Alternative Reason for Rejection
Redux Too much boilerplate for server state
SWR Fewer features than TanStack Query
Apollo Client GraphQL-focused, overkill for REST API
Manual fetch Requires implementing caching/loading states manually

ADR-028: Socket.IO for Real-Time Dashboard Updates

Attribute Value
ID ADR-028
Status Accepted
Date 2026-01-14
Context Dashboard needs real-time updates when orders and print jobs change status

Decision

Use Socket.IO for real-time WebSocket communication between backend and dashboard.

Architecture

Backend Events          WebSocket Gateway         React Dashboard
     │                        │                        │
     │  order.created         │                        │
     ├───────────────────────►│                        │
     │                        │  order:created         │
     │                        ├───────────────────────►│
     │                        │                        │ invalidateQueries()
     │                        │                        │ toast.success()

Implementation

Backend (NestJS WebSocket Gateway):

// apps/api/src/gateway/events.gateway.ts
@WebSocketGateway({ namespace: '/events' })
export class EventsGateway {
  @WebSocketServer()
  server!: Server;

  @OnEvent(ORDER_EVENTS.CREATED)
  handleOrderCreated(event: OrderEventPayload): void {
    this.server.emit('order:created', { ... });
  }
}

Frontend (React Context):

// apps/web/src/contexts/socket-context.tsx
socketInstance.on('order:created', (data) => {
  toast.success(`New order: #${data.orderNumber}`);
  queryClient.invalidateQueries({ queryKey: ['orders'] });
});
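
Note the naming convention bridging the two sides: internal events are dot-separated (`order.created`, per ADR-008) while socket channels are colon-separated (`order:created`). A tiny sketch of that convention (illustrative only; the real gateway maps each event explicitly):

```typescript
// Illustrative helper showing the internal-to-socket naming convention
// observed above ('order.created' -> 'order:created'). The actual gateway
// hard-codes each mapping per @OnEvent handler rather than using a helper.
const toSocketEvent = (internalEvent: string): string =>
  internalEvent.replace(/\./g, ':');
```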

Rationale

  • Already installed: Socket.IO server was already in dependencies for Phase 3
  • Bidirectional: Supports future features like notifications and chat
  • Automatic reconnection: Handles network interruptions gracefully
  • Namespace support: Can separate different event channels
  • Browser compatibility: Works across all modern browsers

Consequences

  • ✅ Real-time updates without polling
  • ✅ Toast notifications on important events
  • ✅ Automatic TanStack Query cache invalidation
  • ✅ Connection status visible in UI
  • ⚠️ Requires WebSocket support in infrastructure

Alternatives Considered

| Alternative | Reason for Rejection |
| --- | --- |
| Polling | Higher latency; more server load |
| Server-Sent Events | One-directional only |
| Raw WebSockets | Fewer features than Socket.IO (rooms, reconnection) |
| Pusher/Ably | External dependency; cost |

ADR-029: API Key Authentication for Dashboard

| Attribute | Value |
| --- | --- |
| ID | ADR-029 |
| Status | Accepted |
| Date | 2026-01-14 |
| Context | Dashboard needs authentication to protect admin operations |

Decision

Use API key authentication stored in browser localStorage for dashboard authentication.

Implementation

// apps/web/src/contexts/auth-context.tsx
import { useState, type ReactNode } from 'react';
import { Navigate } from 'react-router-dom';

const AUTH_STORAGE_KEY = 'forma3d_api_key';

export function AuthProvider({ children }: { children: ReactNode }) {
  const [apiKey, setApiKey] = useState<string | null>(() => {
    return localStorage.getItem(AUTH_STORAGE_KEY);
  });

  const login = (key: string) => {
    localStorage.setItem(AUTH_STORAGE_KEY, key);
    setApiKey(key);
  };
  // ...
}

// Protected routes redirect to /login if not authenticated
function ProtectedRoute({ children }: { children: ReactNode }) {
  const { isAuthenticated } = useAuth();
  if (!isAuthenticated) return <Navigate to="/login" replace />;
  return <>{children}</>;
}

Rationale

  • Simplicity: No session management, token refresh, or OAuth complexity
  • Consistent with API: Uses same API key authentication as backend (ADR-024)
  • Offline-capable: Works without server validation on page load
  • Single operator: System is used by single operator, not public users

Security Considerations

  • API key stored in localStorage (acceptable for internal admin tool)
  • Key sent via X-API-Key header for mutations
  • HTTPS required in production
  • Key should be rotated periodically
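
As a sketch of how the stored key accompanies requests (the helper name and shape are assumptions; the project's actual API client may differ):

```typescript
// Hypothetical helper: attach the stored API key as an X-API-Key header.
// Names are illustrative, not the project's actual client code.
type Init = { method?: string; headers?: Record<string, string> };

function withApiKey(init: Init, key: string | null): Init {
  if (!key) return init; // unauthenticated requests pass through unchanged
  return { ...init, headers: { ...init.headers, 'X-API-Key': key } };
}

// Browser usage sketch:
//   fetch('/api/orders', withApiKey({ method: 'POST' },
//     localStorage.getItem('forma3d_api_key')));
```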

Consequences

  • ✅ Simple implementation and user experience
  • ✅ Consistent with existing API key guard on backend
  • ✅ No additional authentication infrastructure needed
  • ⚠️ API key visible in localStorage (acceptable for admin tool)
  • ⚠️ No role-based access control (single admin role)

Alternatives Considered

| Alternative | Reason for Rejection |
| --- | --- |
| OAuth/OIDC | Overkill for single-operator system |
| JWT tokens | Adds complexity without benefit for this use case |
| Session cookies | Requires server-side session management |
| No auth | Admin operations must be protected |

ADR-030: Sendcloud for Shipping Integration

| Attribute | Value |
| --- | --- |
| ID | ADR-030 |
| Status | Accepted |
| Date | 2026-01-16 |
| Context | Need to generate shipping labels and sync tracking information to Shopify |

Decision

Use Sendcloud API (custom integration) rather than the native Sendcloud-Shopify app for shipping label generation and tracking.

Rationale

Why Sendcloud as a Platform

  • Multi-carrier support: Single API for PostNL, DPD, DHL, UPS, and 80+ other carriers
  • European focus: Strong presence in Belgium/Netherlands matching Forma3D's primary market
  • Simple API: REST API with Basic Auth, parcel creation returns label PDF immediately
  • Automatic tracking: Tracking numbers and URLs provided on parcel creation
  • Webhook support: Status updates available via webhooks (for future enhancement)
  • Competitive pricing: Pay-per-label pricing suitable for small business volumes
  • Label formats: Supports A4, A6, and thermal printer formats

Why Custom API Integration vs Native Shopify-Sendcloud App

Sendcloud offers a native Shopify integration that automatically syncs orders. However, we chose a custom API integration for the following reasons:

| Aspect | Native Sendcloud-Shopify App | Our Custom API Integration |
| --- | --- | --- |
| Trigger | Manual — operator must create label in Sendcloud dashboard | Automatic — triggered when all print jobs complete |
| Print awareness | None — doesn't know about 3D printing workflow | Full — waits for SimplyPrint jobs to finish |
| Unified dashboard | Split across Shopify + Sendcloud panels | Single dashboard — orders, prints, shipments in one place |
| Audit trail | Separate logs in each system | Integrated event log with full traceability |
| Custom workflow | Generic e-commerce flow | Custom print-to-ship automation |
| Tracking sync timing | After manual label creation | Immediate — included in Shopify fulfillment |

Key insight: The native integration doesn't know when 3D printing is complete. An operator would need to:

  1. Monitor SimplyPrint for job completion
  2. Switch to Sendcloud dashboard
  3. Find the order and create a label
  4. Wait for tracking to sync back to Shopify

Our custom integration automates this entire workflow:

Print Jobs Complete → Auto-Generate Label → Auto-Fulfill with Tracking → Customer Notified

This reduces manual intervention from ~5 minutes per order to zero, which is critical for scaling order volumes.

Implementation

apps/api/src/
├── sendcloud/
│   ├── sendcloud-api.client.ts    # HTTP client with Basic Auth
│   ├── sendcloud.service.ts       # Business logic, event listener
│   ├── sendcloud.controller.ts    # REST endpoints
│   └── sendcloud.module.ts
├── shipments/
│   ├── shipments.repository.ts    # Prisma queries for Shipment
│   ├── shipments.controller.ts    # REST endpoints
│   └── shipments.module.ts

libs/api-client/src/
└── sendcloud/
    └── sendcloud.types.ts         # Typed DTOs for Sendcloud API

Event Flow

  1. All print jobs complete → OrchestrationService emits order.ready-for-fulfillment
  2. SendcloudService listens → creates parcel via Sendcloud API
  3. Sendcloud returns label URL + tracking number
  4. Shipment record stored in database
  5. SendcloudService emits shipment.created event
  6. FulfillmentService listens → creates Shopify fulfillment with tracking info
  7. Customer receives email notification with tracking link
┌─────────────┐    ┌──────────────┐    ┌─────────────┐    ┌─────────────┐
│ SimplyPrint │───▶│ Orchestration│───▶│  Sendcloud  │───▶│ Fulfillment │
│  (prints)   │    │   Service    │    │   Service   │    │   Service   │
└─────────────┘    └──────────────┘    └─────────────┘    └─────────────┘
                         │                    │                  │
                         │ order.ready-       │ shipment.        │ Shopify
                         │ for-fulfillment    │ created          │ Fulfillment
                         ▼                    ▼                  ▼
                   [All jobs done]      [Label + tracking]  [Customer notified]
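
The listener at the center of this flow can be sketched as follows (method and dependency names are assumptions; the real logic lives in sendcloud.service.ts and uses NestJS event listeners):

```typescript
// Hedged sketch of the fulfillment listener (names assumed). Dependencies
// are injected so the flow is testable without NestJS or a live Sendcloud
// account.
interface ParcelResult { labelUrl: string; trackingNumber: string; }

async function handleReadyForFulfillment(
  orderId: string,
  deps: {
    createParcel: (orderId: string) => Promise<ParcelResult>;          // Sendcloud API (steps 2-3)
    saveShipment: (orderId: string, p: ParcelResult) => Promise<void>; // database (step 4)
    emit: (event: string, payload: unknown) => void;                   // event bus (step 5)
  }
): Promise<void> {
  const parcel = await deps.createParcel(orderId);
  await deps.saveShipment(orderId, parcel);
  deps.emit('shipment.created', { orderId, ...parcel });
}
```

FulfillmentService then picks up `shipment.created` and completes steps 6-7 against Shopify.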

Consequences

  • ✅ Single integration for multiple carriers
  • ✅ Automatic label PDF generation
  • ✅ Tracking information synced to Shopify fulfillments
  • ✅ Dashboard displays shipment status and label download
  • ⚠️ Dependent on Sendcloud uptime and API availability
  • ⚠️ Limited to carriers supported by Sendcloud
  • ⚠️ Requires Sendcloud account and sender address configuration

Environment Variables

SENDCLOUD_PUBLIC_KEY=xxx
SENDCLOUD_SECRET_KEY=xxx
SENDCLOUD_API_URL=https://panel.sendcloud.sc/api/v2
DEFAULT_SHIPPING_METHOD_ID=8
DEFAULT_SENDER_ADDRESS_ID=12345
SHIPPING_ENABLED=true

Alternatives Considered

| Alternative | Reason for Rejection |
| --- | --- |
| Native Sendcloud-Shopify app | Requires manual label creation; no print workflow awareness |
| Direct carrier APIs | Too many integrations to maintain, each with different APIs |
| ShipStation | US-focused; less European carrier support |
| EasyPost | Less European carrier coverage than Sendcloud |
| Manual labels | Does not meet automation requirements; ~5 min overhead per order |

ADR-031: Automated Container Registry Cleanup

| Attribute | Value |
| --- | --- |
| ID | ADR-031 |
| Status | Accepted |
| Date | 2026-01-16 |
| Context | Container registries accumulate old images over time, increasing storage costs and clutter |

Decision

Implement automated container registry cleanup that runs after each successful staging deployment and attestation. The cleanup uses attestation-based policies to determine which images to keep or delete.

Rationale

The Problem

Without automated cleanup, the DigitalOcean Container Registry accumulates images indefinitely:

  • Each CI build creates new images with timestamped tags (e.g., 20260116120000)
  • Signature and attestation artifacts add ~2KB per image
  • Storage costs grow linearly with deployment frequency
  • Old images provide no value after newer versions are verified in production

Attestation-Based Cleanup Policy

The cleanup leverages the cosign attestation system (ADR-025) to make intelligent retention decisions:

| Image Status | Action | Rationale |
| --- | --- | --- |
| PRODUCTION attestation | Keep | May be needed for rollback |
| Currently deployed | Keep | Active in production/staging |
| Recent (last 5) | Keep | Recent builds for debugging |
| STAGING-only attestation | Delete | Superseded by newer staging builds |
| No attestation | Delete | Never passed acceptance tests |

This policy ensures:

  1. Rollback capability: Production-attested images are always available
  2. Debugging support: Recent images preserved for investigation
  3. Automatic garbage collection: Old staging/unsigned images removed
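
The retention table can be encoded as a small decision function (illustrative TypeScript; the shipped logic lives in scripts/cleanup-registry.sh, and the field names below are assumptions):

```typescript
// Illustrative encoding of the retention policy (names assumed).
type Attestation = 'PRODUCTION' | 'STAGING' | null;

function shouldKeep(img: {
  attestation: Attestation;
  deployed: boolean;   // reported by the /health endpoints
  recentRank: number;  // 1 = newest build in the repository
}): boolean {
  if (img.attestation === 'PRODUCTION') return true; // rollback candidates
  if (img.deployed) return true;                     // never delete running images
  if (img.recentRank <= 5) return true;              // keep last 5 for debugging
  return false;                                      // STAGING-only or unattested
}
```

Note that the keep rules are checked first, so a currently deployed but unattested image is still retained.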

Integration with Health Endpoints

The cleanup script queries the /health endpoints to determine which images are currently deployed:

# API health endpoint returns current build number
curl https://staging-connect-api.forma3d.be/health
# Response: { "build": { "number": "20260116120000" }, ... }

This prevents accidental deletion of running containers.

Implementation

scripts/
└── cleanup-registry.sh    # Cleanup script with attestation checking

azure-pipelines.yml
└── RegistryMaintenance stage  # Runs on every main branch pipeline
    └── CleanupRegistry job    # Cleans manifests + triggers GC

Cleanup Script

The scripts/cleanup-registry.sh script:

  1. Authenticates to DigitalOcean Container Registry via doctl
  2. Queries health endpoints to find currently deployed image tags
  3. Lists all images in the registry for each repository
  4. Checks attestations using cosign verify-attestation with the public key
  5. Applies retention policy based on attestation status
  6. Deletes eligible images via doctl registry repository delete-manifest
  7. Triggers garbage collection to reclaim storage space

Pipeline Integration

The cleanup runs in a dedicated RegistryMaintenance stage that executes on every main branch pipeline, even when no apps are affected (DeployStaging skipped):

- stage: RegistryMaintenance
  dependsOn: [Build, DeployStaging]
  condition: and(not(canceled()), eq(variables.isMain, true))

Cleanup Flow

┌─────────────────────────────────────────────────────────────────────┐
│                    RegistryMaintenance Stage                          │
│            (runs on every main branch pipeline)                      │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│                          CleanupRegistry                             │
│                               │                                      │
│                               ▼                                      │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │ 1. Query /health endpoints for deployed versions             │   │
│  │ 2. List all images in registry                               │   │
│  │ 3. For each image:                                           │   │
│  │    - Check if PRODUCTION attested → KEEP                     │   │
│  │    - Check if currently deployed → KEEP                      │   │
│  │    - Check if in top 5 recent → KEEP                         │   │
│  │    - Check if STAGING-only attested → DELETE                 │   │
│  │    - Check if no attestation → DELETE                        │   │
│  │ 4. Wait for any active GC, start new GC, verify completion   │   │
│  │ 5. EXIT trap ensures GC runs even if script crashes          │   │
│  └──────────────────────────────────────────────────────────────┘   │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

Usage

Local Testing (Dry Run)

# Preview what would be deleted
./scripts/cleanup-registry.sh \
  --key cosign.pub \
  --api-url https://staging-connect-api.forma3d.be \
  --web-url https://staging-connect.forma3d.be \
  --dry-run \
  --verbose

Manual Cleanup

# Perform actual cleanup
./scripts/cleanup-registry.sh \
  --key cosign.pub \
  --api-url https://staging-connect-api.forma3d.be \
  --web-url https://staging-connect.forma3d.be \
  --verbose

Script Options

| Option | Description |
| --- | --- |
| -k, --key FILE | Public key for attestation verification (required) |
| --api-url URL | API health endpoint URL (required) |
| --web-url URL | Web health endpoint URL (required) |
| --keep-recent N | Keep N most recent images (default: 5) |
| --dry-run | Preview deletions without executing |
| -v, --verbose | Show detailed output |

Consequences

  • ✅ Automatic storage management reduces costs
  • ✅ Attestation-based policy ensures production rollback capability
  • ✅ Health endpoint check prevents deletion of running containers
  • ✅ Dry-run mode enables safe testing
  • ✅ Garbage collection reclaims space after deletion
  • ⚠️ Requires health endpoints to return build information
  • ⚠️ Dependent on cosign/doctl availability in pipeline

Alternatives Considered

| Alternative | Reason for Rejection |
| --- | --- |
| Time-based retention (e.g., 30 days) | Doesn't account for promotion status; may delete production-ready images |
| Tag-based retention (e.g., keep latest) | latest tag is mutable; doesn't guarantee correct image |
| Manual cleanup | Error-prone, inconsistent, doesn't scale |
| Registry auto-purge policies | DigitalOcean doesn't support attestation-aware policies |

ADR-032: Domain Boundary Separation with Interface Contracts

| Attribute | Value |
| --- | --- |
| ID | ADR-032 |
| Title | Domain Boundary Separation with Interface Contracts |
| Status | Implemented |
| Context | Prepare the modular monolith for potential future microservices extraction by establishing clean domain boundaries |
| Date | 2026-01-17 |

Context

As the application grows, we need to ensure domain boundaries are well-defined to:

  1. Enable future microservices extraction without major refactoring
  2. Reduce coupling between modules
  3. Enable independent testing of domain logic
  4. Provide distributed tracing capabilities

Decision

We implement domain boundary separation with the following patterns:

1. Domain Contracts Library (libs/domain-contracts)

Create a dedicated library containing:

  • Interface definitions (IOrdersService, IPrintJobsService, etc.)
  • DTOs for cross-domain communication (OrderDto, PrintJobDto, etc.)
  • Symbol injection tokens (ORDERS_SERVICE, PRINT_JOBS_SERVICE, etc.)

2. Correlation ID Infrastructure

Add correlation ID propagation for distributed tracing:

  • CorrelationMiddleware extracts/generates x-correlation-id headers
  • CorrelationService uses AsyncLocalStorage for context propagation
  • All domain events include correlationId, timestamp, and source fields
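
A minimal sketch of the CorrelationService built on Node's AsyncLocalStorage (method names are assumptions; the real implementation lives in apps/api/src/common/correlation/):

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';
import { randomUUID } from 'node:crypto';

// Hedged sketch (method names assumed). The middleware calls run() once per
// request; any code on the same async path can then read the id without
// passing it through every function signature.
class CorrelationService {
  private readonly als = new AsyncLocalStorage<{ correlationId: string }>();

  run<T>(incomingId: string | undefined, fn: () => T): T {
    // reuse an incoming x-correlation-id header, or mint a new one
    return this.als.run({ correlationId: incomingId ?? randomUUID() }, fn);
  }

  getCorrelationId(): string | undefined {
    return this.als.getStore()?.correlationId;
  }
}
```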

3. Repository Encapsulation

Repositories are internal implementation details:

  • Modules stop exporting repositories
  • Only interface tokens are exported for cross-domain communication
  • Services implement domain interfaces

4. Event-Based Base Interfaces

Define base event interfaces that all domain events extend:

interface BaseEvent {
  correlationId: string;
  timestamp: Date;
  source: string;
}
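
A concrete event then extends the base; a hedged example (the event and factory names below are illustrative, repeating BaseEvent for self-containment):

```typescript
interface BaseEvent {
  correlationId: string;
  timestamp: Date;
  source: string;
}

// Hypothetical concrete event (field names assumed for illustration)
interface OrderReadyForFulfillmentEvent extends BaseEvent {
  orderId: string;
}

function makeEvent(orderId: string, correlationId: string): OrderReadyForFulfillmentEvent {
  // source identifies the emitting domain module
  return { orderId, correlationId, timestamp: new Date(), source: 'order-service' };
}
```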

Implementation

| Component | Path | Description |
| --- | --- | --- |
| Domain Contracts | libs/domain-contracts/ | Interface definitions and DTOs |
| Correlation Service | apps/api/src/common/correlation/ | Request context propagation |
| Base Events | libs/domain/src/events/ | Base event interfaces |

Interface Tokens Pattern

// In domain-contracts library
export const ORDERS_SERVICE = Symbol('IOrdersService');

export interface IOrdersService {
  findById(id: string): Promise<OrderDto | null>;
  updateStatus(id: string, status: OrderStatus): Promise<OrderDto>;
  // ... other methods
}

// In module
@Module({
  providers: [OrdersService, { provide: ORDERS_SERVICE, useExisting: OrdersService }],
  exports: [ORDERS_SERVICE], // No longer exports repository
})
export class OrdersModule {}

// In consumer service
@Injectable()
export class FulfillmentService {
  constructor(
    @Inject(ORDERS_SERVICE)
    private readonly ordersService: IOrdersService
  ) {}
}

Scope

Interface tokens (@Inject(ORDERS_SERVICE), etc.) enforce boundaries between domains. Services that live within the same domain module should inject the concrete class directly rather than going through the token indirection. For example, OrchestrationService injects PrintJobsService directly because both live inside the order-service; it injects IOrdersService via ORDERS_SERVICE because orders are a separate domain boundary.

Consequences

Positive:

  • Clear domain boundaries enable future microservices extraction
  • Reduced coupling between modules
  • Better testability with interface-based mocking
  • Distributed tracing via correlation IDs
  • Repository details are now private implementation

Negative:

  • Slight increase in boilerplate (interface definitions, DTOs)
  • Need to maintain DTO mapping logic
  • Some forwardRef() usages remain for circular retry patterns

Related Decisions

  • ADR-007: Layered Architecture with Repository Pattern
  • ADR-008: Event-Driven Internal Communication
  • ADR-013: Shared Domain Library

ADR-033: Database-Backed Webhook Idempotency

| Attribute | Value |
| --- | --- |
| ID | ADR-033 |
| Title | Database-Backed Webhook Idempotency |
| Status | Implemented |
| Context | In-memory webhook idempotency cache doesn't work in multi-instance deployments |
| Date | 2026-01-17 |

Context

The original implementation used an in-memory Set<string> for webhook idempotency tracking:

private readonly processedWebhooks = new Set<string>();

This approach had critical problems:

  1. Horizontal Scaling Failure: In a multi-instance deployment, each API instance has its own cache. Webhooks may be processed multiple times across instances.
  2. Memory Leak: The Set grows unbounded as webhooks are processed, causing memory pressure in long-running instances.
  3. Restart Data Loss: All idempotency data is lost on application restart, allowing duplicate processing during restarts.

Decision

Use a PostgreSQL table (ProcessedWebhook) for webhook idempotency instead of Redis or in-memory caching.

Rationale

  • No additional infrastructure: Uses existing PostgreSQL database
  • Transactional safety: Database unique constraint ensures race-condition-safe idempotency
  • Simple cleanup: Scheduled job removes expired records hourly
  • Debugging support: Records include metadata (webhook type, order ID, timestamps)
  • Horizontal scaling: Works correctly across multiple API instances

Implementation

// Atomic check-and-mark using the unique constraint on webhookId
async isProcessedOrMark(webhookId: string, type: string): Promise<boolean> {
  const expiresAt = new Date(Date.now() + 24 * 60 * 60 * 1000); // retention window (e.g. 24h)
  try {
    await this.prisma.processedWebhook.create({
      data: { webhookId, webhookType: type, expiresAt },
    });
    return false; // first time processing
  } catch (error) {
    // P2002 = Prisma's unique-constraint violation code
    if (error instanceof Prisma.PrismaClientKnownRequestError && error.code === 'P2002') {
      return true; // already processed by this or another instance
    }
    throw error;
  }
}
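
The hourly cleanup then reduces to a single deleteMany over expired rows; a hedged sketch (names are assumptions, and the scheduling decorator would come from @nestjs/schedule):

```typescript
// Hedged sketch of the hourly cleanup job (names assumed). Injecting the
// delete operation keeps the logic testable without Prisma.
async function cleanupExpiredWebhooks(
  deleteExpiredBefore: (cutoff: Date) => Promise<number>,
  now: () => Date = () => new Date()
): Promise<number> {
  // delete every ProcessedWebhook whose expiresAt has passed
  return deleteExpiredBefore(now());
}

// Real wiring would look roughly like:
//   @Cron(CronExpression.EVERY_HOUR)
//   async cleanup() {
//     await this.prisma.processedWebhook.deleteMany({
//       where: { expiresAt: { lte: new Date() } },
//     });
//   }
```

The `@@index([expiresAt])` in the schema below keeps this query cheap.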

Database Schema

model ProcessedWebhook {
  id          String   @id @default(uuid())
  webhookId   String   @unique  // The Shopify webhook ID
  webhookType String            // e.g., "orders/create"
  processedAt DateTime @default(now())
  expiresAt   DateTime          // When this record can be cleaned up
  orderId     String?           // Associated order for debugging

  @@index([expiresAt])          // For cleanup job queries
  @@index([processedAt])        // For monitoring
}

Alternatives Considered

| Alternative | Pros | Cons | Decision |
| --- | --- | --- | --- |
| Redis | TTL support, fast | Additional infrastructure | Rejected |
| Distributed Lock | Works with DB | Complex, race conditions | Rejected |
| Database Table | Simple, no new infra | Needs cleanup job | Selected |

Consequences

Positive:

  • ✅ Works correctly in multi-instance deployments
  • ✅ Survives application restarts
  • ✅ No memory leaks
  • ✅ Auditable (can query processed webhooks)
  • ✅ Race-condition safe via unique constraint

Negative:

  • ⚠️ Slightly higher latency than in-memory (< 10ms)
  • ⚠️ Requires cleanup job (runs hourly)

Related Decisions

  • ADR-007: Layered Architecture with Repository Pattern
  • ADR-021: Retry Queue with Exponential Backoff

ADR-034: Docker Infrastructure Hardening (Log Rotation & Resource Cleanup)

| Status | Date | Context |
| --- | --- | --- |
| Accepted | 2026-01-19 | Prevent disk exhaustion from Docker logs and images |

Context

During staging operations, the server disk filled to 100% due to:

  1. Unbounded Docker logs: The default json-file log driver has no size limits, causing container logs to grow indefinitely
  2. Accumulated old images: Each deployment pulls new images but old versions remained on disk
  3. Health check failures: When disk was full, Docker couldn't execute health checks, causing containers to be marked unhealthy and Traefik to stop routing traffic

Decision

Implement automated infrastructure hardening in the deployment pipeline:

  1. Docker Log Rotation: Configure daemon-level log rotation with size limits
  2. Aggressive Resource Cleanup: Remove unused images, volumes, and networks after each deployment
  3. Separate Image Tags: Use independent version tags for API and Web to support partial deployments

Implementation

1. Docker Log Rotation Configuration

The pipeline automatically creates /etc/docker/daemon.json if missing:

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}

This limits each container to:

  • Maximum 10MB per log file
  • Maximum 3 rotated files
  • Total: 30MB per container (90MB for all 3 containers)

2. Deployment Cleanup Steps

After container restart, the pipeline runs:

# Remove dangling images
docker image prune -f

# Remove unused images older than 24h
docker image prune -a -f --filter "until=24h"

# Clean up unused volumes and networks
docker volume prune -f
docker network prune -f

3. Separate Image Tags

docker-compose.yml now uses independent tags:

api:
  image: ${REGISTRY_URL}/forma3d-connect-api:${API_IMAGE_TAG:-latest}

web:
  image: ${REGISTRY_URL}/forma3d-connect-web:${WEB_IMAGE_TAG:-latest}

This allows:

  • Deploying only API without changing Web version
  • Deploying only Web without changing API version
  • Independent rollbacks for each service

Consequences

Positive:

  • ✅ Prevents disk exhaustion from unbounded log growth
  • ✅ Reduces disk usage by cleaning old images after deployment
  • ✅ Supports independent versioning for API and Web
  • ✅ Self-healing: Pipeline automatically configures log rotation if missing
  • ✅ No manual intervention required

Negative:

  • ⚠️ Docker daemon restart required if log rotation config is missing (brief container interruption)
  • ⚠️ Log history limited to ~30MB per container (may need external log aggregation for production)

Configuration Summary

| Setting | Value | Rationale |
| --- | --- | --- |
| max-size | 10m | Balance between history and disk usage |
| max-file | 3 | Keeps ~30MB per container |
| Image cleanup filter | 24h | Keeps recent images for quick rollback |

Related Decisions

  • ADR-017: Docker + Traefik Deployment Strategy
  • ADR-031: Automated Container Registry Cleanup

ADR-035: Progressive Web App (PWA) for Cross-Platform Access

| Attribute | Value |
| --- | --- |
| ID | ADR-035 |
| Status | Accepted |
| Date | 2026-01-19 |
| Context | Need to provide mobile and desktop access for operators monitoring print jobs and managing orders while away from desk |

Decision

Adopt Progressive Web App (PWA) technology for the existing React web application, replacing the planned Tauri (desktop) and Capacitor (mobile) native shell applications.

The web application will be enhanced with:

  1. Web App Manifest for installability
  2. Service Worker for offline caching and push notifications
  3. Web Push API for real-time alerts on print job status

Rationale

PWA Suitability for Admin Dashboards

Research conducted in January 2026 confirms PWA is an ideal fit for Forma3D.Connect:

  • Application type: Admin dashboards and SaaS tools are PWA's primary use case
  • Feature requirements: Order management, real-time updates, and push notifications are fully supported
  • Device features: No deep hardware integration (Bluetooth, NFC, sensors) required

iOS/Safari PWA Support (2026)

Apple has significantly improved PWA support:

| Feature | iOS Version | Status |
| --- | --- | --- |
| Web Push Notifications | iOS 16.4+ | ✅ Supported (Home Screen install required) |
| Badging API | iOS 16.4+ | ✅ Supported |
| Declarative Web Push | iOS 18.4+ | ✅ Improved reliability |
| Standalone Display Mode | iOS 16.4+ | ✅ Supported |

Cost-Benefit Analysis

| Aspect | Tauri + Capacitor | PWA |
| --- | --- | --- |
| Initial development | 40-80 hours | 8-16 hours |
| CI/CD pipelines | Additional complexity | None |
| Code signing | Required (Apple, Windows) | None |
| App store submissions | Required | None |
| Update cycle | Days (app store review) | Instant |
| Maintenance | Ongoing | Minimal |

Estimated savings: 80-150 hours initial + ongoing maintenance reduction

Tauri/Capacitor Provided No Real Advantage

Both planned native apps were WebView wrappers:

  • Container(desktop, "Tauri, Rust", "Native desktop shell wrapping the web application")
  • Container(mobile, "Capacitor", "Mobile shell for on-the-go monitoring")

PWA provides the same experience (installable, app-like, offline capable) without:

  • Separate build pipelines
  • Platform-specific debugging
  • App store management
  • Code signing certificates

Implementation

Phase 1: PWA Foundation

  1. Add vite-plugin-pwa to the web application
  2. Create manifest.json with app metadata and icons
  3. Configure service worker for asset caching
  4. Enable HTTPS (already implemented)

Example manifest.json:

{
  "name": "Forma3D.Connect",
  "short_name": "Forma3D",
  "start_url": "/",
  "display": "standalone",
  "background_color": "#ffffff",
  "theme_color": "#0066cc"
}
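
Wiring this up with vite-plugin-pwa might look as follows (a sketch: the manifest fields mirror the example above, but the plugin options shown are assumptions to be checked against the plugin docs):

```typescript
// vite.config.ts sketch (option values assumed; consult vite-plugin-pwa docs)
import { defineConfig } from 'vite';
import { VitePWA } from 'vite-plugin-pwa';

export default defineConfig({
  plugins: [
    VitePWA({
      registerType: 'autoUpdate', // service worker updates without a prompt
      manifest: {
        name: 'Forma3D.Connect',
        short_name: 'Forma3D',
        start_url: '/',
        display: 'standalone',
        background_color: '#ffffff',
        theme_color: '#0066cc',
      },
    }),
  ],
});
```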

Phase 2: Push Notifications

  1. Implement Web Push API in frontend
  2. Add VAPID key configuration to API
  3. Create notification service (integrate with existing email notifications)
  4. User permission flow in dashboard settings
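
Subscribing the browser for Web Push requires converting the base64url-encoded VAPID public key into a Uint8Array; the standard helper for that (generic Web Push boilerplate, not project code):

```typescript
// Standard Web Push boilerplate: decode a base64url VAPID public key for
// pushManager.subscribe({ applicationServerKey }). Not project-specific.
function urlBase64ToUint8Array(base64Url: string): Uint8Array {
  const padding = '='.repeat((4 - (base64Url.length % 4)) % 4);
  const base64 = (base64Url + padding).replace(/-/g, '+').replace(/_/g, '/');
  const g = globalThis as any; // atob in browsers, Buffer fallback in Node
  const raw: string =
    typeof g.atob === 'function'
      ? g.atob(base64)
      : g.Buffer.from(base64, 'base64').toString('binary');
  return Uint8Array.from(raw, (c) => c.charCodeAt(0));
}

// Browser usage sketch (VAPID_PUBLIC_KEY served by the API, per step 2):
//   const reg = await navigator.serviceWorker.ready;
//   await reg.pushManager.subscribe({
//     userVisibleOnly: true,
//     applicationServerKey: urlBase64ToUint8Array(VAPID_PUBLIC_KEY),
//   });
```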

Phase 3: Enhanced Offline Support

  1. IndexedDB for offline data caching
  2. Background sync for queued actions
  3. Optimistic UI updates

Consequences

Positive:

  • ✅ Significant reduction in development and maintenance effort
  • ✅ Single codebase, single deployment target
  • ✅ Instant updates for all users (no app store delays)
  • ✅ No platform-specific bugs or WebView inconsistencies
  • ✅ No code signing or app store management
  • ✅ Works on any device with a modern browser

Negative:

  • ⚠️ iOS requires Home Screen install for full PWA features
  • ⚠️ No notification sounds on iOS PWA (visual only)
  • ⚠️ Limited system tray integration on desktop

Removed from Project:

  • apps/desktop (Tauri) - removed from roadmap
  • apps/mobile (Capacitor) - removed from roadmap

Updated Architecture

The C4 Container diagram has been updated to reflect the PWA-only architecture:

Before:
├── Web Application (React 19)
├── Desktop App (Tauri) [future]
├── Mobile App (Capacitor) [future]
└── API Server (NestJS)

After:
├── Progressive Web App (React 19 + PWA)
└── API Server (NestJS)

Alternatives Considered

| Alternative | Reason for Rejection |
| --- | --- |
| Keep Tauri + Capacitor plan | Unnecessary complexity; WebView wrappers provide no advantage over PWA |
| React Native for mobile | Requires separate codebase; overkill for admin dashboard |
| Electron for desktop | Large bundle size; same WebView approach as Tauri but less efficient |
| Flutter | Requires separate codebase; not justified for simple dashboard |

ADR-036: localStorage Fallback for PWA Install Detection

| Attribute | Value |
| --- | --- |
| ID | ADR-036 |
| Status | Accepted |
| Date | 2026-01-20 |
| Context | Need to detect if PWA is installed when user views site in browser, to show appropriate messaging and avoid duplicate install prompts |

Decision

Use a dual detection strategy combining the getInstalledRelatedApps() API with localStorage persistence as a fallback for PWA installation detection.

Rationale

The Problem

When a user installs a PWA and later visits the same site in a regular browser:

  • The browser doesn't know the PWA is installed
  • The site shows "Install App" even though it's already installed
  • This creates a confusing user experience

API Limitations

The navigator.getInstalledRelatedApps() API can detect installed PWAs, but has limitations:

| Platform | Chrome Version | Support |
| --- | --- | --- |
| Android | 80+ | ✅ Full support |
| Windows | 85+ | ✅ Supported |
| macOS | 140+ | ✅ Same-scope only |
| iOS/Safari | - | ❌ Not supported |

Even where supported, the API can be unreliable due to:

  • Scope restrictions (must be same origin/scope)
  • Timing issues during page load
  • Browser implementation quirks

Dual Detection Strategy

  1. Primary: getInstalledRelatedApps() API
     • Query the browser for installed related apps
     • Works when supported and correctly configured
  2. Fallback: localStorage persistence
     • Store pwa-installed: true when:
       • User installs via the appinstalled event
       • App is opened in standalone mode
       • API successfully detects installation
     • Check localStorage on page load

Implementation

// Detection flow
useEffect(() => {
  // 1. Check standalone mode (running inside PWA)
  const isStandalone = window.matchMedia('(display-mode: standalone)').matches;
  if (isStandalone) {
    setIsInstalled(true);
    localStorage.setItem('pwa-installed', 'true');
    return;
  }

  // 2. Check localStorage fallback
  if (localStorage.getItem('pwa-installed') === 'true') {
    setIsInstalled(true);
  }

  // 3. Try getInstalledRelatedApps API
  if (navigator.getInstalledRelatedApps) {
    navigator.getInstalledRelatedApps().then((apps) => {
      if (apps.some((app) => app.platform === 'webapp')) {
        setIsInstalled(true);
        localStorage.setItem('pwa-installed', 'true');
      }
    });
  }
}, []);

// Persist on install
window.addEventListener('appinstalled', () => {
  localStorage.setItem('pwa-installed', 'true');
});

Consequences

Positive:

  • ✅ Works across all browsers and platforms
  • ✅ Provides consistent UX when switching between PWA and browser
  • ✅ No false "Install App" prompts when already installed
  • ✅ Gracefully degrades when API not supported

Negative:

  • ⚠️ localStorage can become stale if user uninstalls PWA externally
  • ⚠️ No automatic cleanup mechanism for uninstalled apps
  • ⚠️ Per-browser storage (installing in Chrome won't reflect in Firefox)

Trade-off Accepted:

The risk of showing "Installed" for an uninstalled app is acceptable because:

  • Users rarely uninstall and then want to reinstall immediately
  • Clearing site data will reset the state
  • Better UX than constantly prompting to install an already-installed app

Alternatives Considered

| Alternative | Reason for Rejection |
| --- | --- |
| API only | Too unreliable; doesn't work on Safari/iOS |
| localStorage only | Misses installations from other sessions |
| Server-side tracking | Requires authentication; overcomplicated |
| Cookie-based | Cleared more frequently than localStorage |

ADR-037: Keep a Changelog for Release Documentation

| Attribute | Value |
| --- | --- |
| ID | ADR-037 |
| Status | Accepted |
| Date | 2026-01-20 |
| Context | Need a standardized way to document changes between releases for developers, operators, and stakeholders |

Decision

Adopt the Keep a Changelog format for documenting all notable changes to the project, combined with Semantic Versioning for version numbers.

Rationale

Why Keep a Changelog?

  1. Human-readable: Written for humans, not machines - focuses on what matters to users
  2. Standardized format: Well-known convention reduces cognitive load
  3. Categorized changes: Clear sections (Added, Changed, Deprecated, Removed, Fixed, Security)
  4. Release-oriented: Groups changes by version, making it easy to see what's in each release
  5. Unreleased section: Accumulates changes before a release, making release notes easy

Why Semantic Versioning?

  • MAJOR.MINOR.PATCH format communicates impact:
    • MAJOR: Breaking changes
    • MINOR: New features (backward compatible)
    • PATCH: Bug fixes (backward compatible)
  • Industry standard, well understood by developers
  • Enables automated tooling and dependency management

Benefits for AI-Generated Codebase

This project is primarily AI-generated, making structured documentation critical:

  1. Context for AI: Changelog provides history context for future AI sessions
  2. Audit trail: Documents what was added/changed in each phase
  3. Stakeholder communication: Non-technical stakeholders can understand progress
  4. Debugging aid: When issues arise, changelog helps identify when changes were introduced

Implementation

File location: CHANGELOG.md in repository root

Format:

# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [0.7.0] - 2026-01-19

### Added

- Feature description

### Changed

- Change description

### Fixed

- Bug fix description

### Security

- Security fix description

Change categories (use only those that apply):

  • Added: New features
  • Changed: Changes to existing functionality
  • Deprecated: Features marked for removal
  • Removed: Features removed
  • Fixed: Bug fixes
  • Security: Vulnerability fixes

Guidelines

  1. Update with every PR: Add changelog entry as part of the PR
  2. Write for humans: Describe the user impact, not implementation details
  3. Link to issues/PRs: Reference related issues where helpful
  4. Keep Unreleased current: Move entries to versioned section on release
  5. One entry per change: Don't combine unrelated changes

Consequences

Positive:

  • ✅ Clear release history for all stakeholders
  • ✅ Standardized format reduces documentation overhead
  • ✅ Supports both manual reading and automated parsing
  • ✅ Integrates well with CI/CD release workflows
  • ✅ Provides context for AI-assisted development sessions

Negative:

  • ⚠️ Requires discipline to update with each change
  • ⚠️ Can become verbose if too granular

Alternatives Considered

| Alternative | Reason for Rejection |
| --- | --- |
| Git commit history only | Too granular; hard to see high-level changes |
| GitHub Releases only | Tied to GitHub; not in repository |
| Auto-generated from commits | Requires strict commit conventions; often too noisy |
| Wiki-based changelog | Separate from code; easy to forget to update |

ADR-038: Zensical for Publishing Project Documentation

| Attribute | Value |
| --- | --- |
| ID | ADR-038 |
| Status | Accepted |
| Date | 2026-01-21 |
| Context | Need a maintainable, deployable documentation website built from the repository docs/ |

Decision

Publish the repository documentation in docs/ as a static website built with Zensical.

The docs site is:

  • Built from docs/ with configuration in zensical.toml
  • Rendered with PlantUML pre-rendering (SVG/PNG) for existing diagrams
  • Packaged as a container image forma3d-connect-docs and published to the existing container registry
  • Deployed to staging behind Traefik at https://staging-connect-docs.forma3d.be
  • Managed by the existing Azure DevOps pipeline using docsAffected detection

Rationale

  • Single source of truth: docs live next to the code they describe (docs/)
  • Static output: simple, fast, cacheable; no backend runtime required
  • Pipeline parity: follows the same build/sign/SBOM/deploy controls as api and web
  • Diagram support: preserves existing PlantUML investment via deterministic CI rendering

Implementation

  • Config: zensical.toml (sets site name, logo, PlantUML markdown extension)
  • Container build: deployment/docs/Dockerfile (builds site + serves via Nginx)
  • Staging service: deployment/staging/docker-compose.yml (docs service + Traefik labels)
  • CI/CD: azure-pipelines.yml
    • Detect changes to docs/** or zensical.toml via docsAffected
    • Build/push/sign/SBOM the forma3d-connect-docs image
    • Deploy conditionally to staging

Consequences

Positive:

  • ✅ Documentation changes can be delivered independently of API/Web
  • ✅ Consistent hosting model (Traefik + container) across services
  • ✅ PlantUML diagrams render in the published docs site

Negative:

  • ⚠️ Docs builds can be slower due to diagram rendering (mitigated by caching)
  • ⚠️ Local preview requires Zensical + Java/Graphviz (documented in developer workflow)

Alternatives Considered

| Alternative | Reason for Rejection |
| --- | --- |
| Host Markdown in repo UI | Not a branded, searchable documentation site |
| MkDocs Material | Zensical provides a modern, batteries-included path with similar ecosystem compatibility |
| Convert all diagrams to Mermaid | High migration effort; risk of losing diagram fidelity |

ADR-039: Global API Key Authentication (Fail-Closed)

| Attribute | Value |
| --- | --- |
| ID | ADR-039 |
| Status | Accepted |
| Date | 2026-01-21 |
| Context | The API exposed non-health endpoints when INTERNAL_API_KEY was missing, risking data access |

Decision

Enforce API key authentication globally for the API application, with explicit public exceptions.

  • All HTTP routes require X-API-Key by default
  • Only the following are public:
    • /health/** (orchestration/monitoring probes)
    • External webhook receivers (secured by their own verification guards)
  • Authentication is fail-closed:
    • If INTERNAL_API_KEY is not configured, non-public endpoints return an error (no “development bypass”)
  • Real-time channel is also secured:
    • Socket.IO /events requires the same internal API key during handshake

Rationale

  • Default-secure posture: avoids accidental exposure in development/staging due to missing env vars
  • Consistency: one policy applied across all controllers (no “forgot to add @UseGuards” drift)
  • Clear separation: health/webhooks remain reachable for infrastructure and external platforms
  • Parity with dashboard: matches the operator dashboard’s expectation that API access is gated

Implementation

  • Global guard: register API key guard as an APP_GUARD in apps/api
  • Public routes: introduce @Public() decorator to opt out for /health/** and webhook controllers
  • Fail-closed config: if INTERNAL_API_KEY is missing, non-public HTTP routes are rejected
  • WebSocket guard: add WsApiKeyGuard to EventsGateway for the /events namespace
  • Webhook verification: require SIMPLYPRINT_WEBHOOK_SECRET for SimplyPrint inbound verification (fail-closed)
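The fail-closed decision can be condensed into one predicate. The following is an illustrative, framework-free sketch (the actual guard is a NestJS APP_GUARD; the names here are hypothetical, not the real code):

```typescript
interface RouteContext {
  isPublic: boolean;        // set by the @Public() decorator
  providedApiKey?: string;  // value of the X-API-Key header
}

function isRequestAllowed(route: RouteContext, configuredKey: string | undefined): boolean {
  if (route.isPublic) return true;  // /health/**, webhook receivers
  if (!configuredKey) return false; // fail-closed: missing INTERNAL_API_KEY rejects everything non-public
  // Production code should use a constant-time comparison here.
  return route.providedApiKey === configuredKey;
}
```

Note that the "dev mode" branch rejected in the alternatives table is exactly the inverse of the second line: an unconfigured key denies access rather than granting it.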

Consequences

Positive:

  • ✅ Eliminates unauthenticated access to operational/admin API endpoints
  • ✅ Prevents misconfiguration from silently reducing security
  • ✅ Makes “secure endpoints” the default, with explicit public exceptions
  • ✅ Secures both REST and realtime update channels consistently

Negative:

  • ⚠️ Local development now requires configuring INTERNAL_API_KEY to use non-health endpoints
  • ⚠️ Clients (dashboard, tools) must always send X-API-Key for non-public routes

Alternatives Considered

| Alternative | Reason for Rejection |
| --- | --- |
| Per-controller @UseGuards(ApiKeyGuard) only | Easy to miss a controller; inconsistent over time |
| Allow all when key missing (“dev mode”) | Unsafe default; makes staging/prod exposure more likely |
| Network-only restrictions (IP allowlist) | Harder operationally; not sufficient on its own |
References

  • ADR-024 / ADR-029 (previous API key authentication decisions)
  • apps/api/src/common/guards/api-key.guard.ts
  • apps/api/src/common/decorators/public.decorator.ts

ADR-040: Shopify Order Backfill for Downtime Recovery

| Attribute | Value |
| --- | --- |
| ID | ADR-040 |
| Status | Accepted |
| Date | 2026-01-22 |
| Context | Shopify webhooks retry for only ~4 hours; extended downtime can permanently lose order events |

Decision

Implement a scheduled backfill service that periodically polls Shopify's Orders API to catch any orders missed during webhook delivery failures.

Strategy:

  • Store a durable since_id watermark in the SystemConfig table
  • Every 5 minutes (configurable), fetch orders from Shopify with since_id pagination
  • For each order not in our database, create it using the same mapping logic as webhooks
  • Advance watermark only after successful processing (not before, unlike webhook path)
  • Provide admin endpoints for manual backfill trigger, status check, and watermark reset
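The strategy above can be sketched as a single backfill pass. This is a condensed, synchronous illustration (the real ShopifyBackfillService is async and Prisma-backed; all names here are illustrative). The key property is that the watermark advances only after the whole batch has been processed:

```typescript
interface ShopifyOrder { id: number }

interface BackfillDeps {
  getWatermark(): number;                       // since_id from SystemConfig
  fetchOrders(sinceId: number, limit: number): ShopifyOrder[];
  orderExists(shopifyOrderId: number): boolean; // dedupe by shopifyOrderId
  createOrder(order: ShopifyOrder): void;       // same mapping logic as webhooks
  setWatermark(sinceId: number): void;
}

function runBackfillOnce(deps: BackfillDeps, batchSize = 50): number {
  const orders = deps.fetchOrders(deps.getWatermark(), batchSize);
  let created = 0;
  for (const order of orders) {
    if (!deps.orderExists(order.id)) {
      deps.createOrder(order);
      created++;
    }
  }
  // Advance only after successful processing: a crash mid-batch re-fetches
  // the same orders next run, which is safe because creation is idempotent.
  if (orders.length > 0) {
    deps.setWatermark(Math.max(...orders.map((o) => o.id)));
  }
  return created;
}
```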

Rationale

  • Shopify retry window is limited: Webhooks are retried only 8 times over ~4 hours (as of September 2024)
  • Downtime recovery: If service is down longer than 4 hours, orders would be permanently lost without backfill
  • Idempotent by design: Order creation is already deduplicated by shopifyOrderId, so re-processing is safe
  • Operational visibility: Admin endpoints allow operators to trigger backfill after incidents
  • Consistent mapping: Reuses the same ShopifyService.buildCreateOrderInput() method as webhooks

Implementation

  • SystemConfigService: New service for persisting key-value configuration (watermarks, etc.)
  • ShopifyBackfillService: Scheduled job with @Cron(EVERY_5_MINUTES) plus startup run
  • ShopifyAdminController: Admin endpoints at /api/v1/admin/shopify/backfill/*
  • Shared mapping: Extracted buildCreateOrderInput() and checkUnmappedSkus() in ShopifyService (uses findUnmappedLineItems() with product/variant ID + SKU matching)

Configuration

| Environment Variable | Default | Description |
| --- | --- | --- |
| SHOPIFY_BACKFILL_ENABLED | true | Enable/disable scheduled backfill |
| SHOPIFY_BACKFILL_BATCH_SIZE | 50 | Orders to fetch per API call |

Consequences

Positive:

  • ✅ Guarantees order recovery after extended downtime (not dependent on webhook retry window)
  • ✅ Uses existing idempotency (no duplicates even with aggressive backfill)
  • ✅ Operators can manually trigger backfill after incidents
  • ✅ Observable via event logs and admin status endpoint

Negative:

  • ⚠️ Adds Shopify API calls even during normal operation (rate limit aware)
  • ⚠️ Does not reconstruct intermediate webhook events (e.g., multiple orders/updated during downtime)
  • ⚠️ Initial backfill on existing system may take time to paginate through history

Alternatives Considered

| Alternative | Reason for Rejection |
| --- | --- |
| Only rely on Shopify retries | 4-hour window insufficient for extended outages |
| Event sourcing / webhook queue | Over-engineered for current scale; adds infrastructure |
| Manual import after incidents | Error-prone, delays recovery, requires operator intervention |
| Time-based polling (updated_at_min) | Harder to paginate reliably; since_id is simpler and robust |
References

  • ADR-011 (Idempotent Webhook Processing)
  • ADR-033 (Database-Backed Webhook Idempotency)
  • apps/api/src/shopify/shopify-backfill.service.ts
  • apps/api/src/config/system-config.service.ts

ADR-041: SimplyPrint Webhook Idempotency and Job Reconciliation

| Attribute | Value |
| --- | --- |
| ID | ADR-041 |
| Status | Accepted |
| Date | 2026-01-22 |
| Context | SimplyPrint webhooks lacked idempotency; polling only detected PRINTING status, not completed/failed jobs |

Decision

Add database-backed webhook idempotency to SimplyPrint webhook handling and implement a job reconciliation service that periodically syncs print job statuses with SimplyPrint's API.

Webhook Idempotency:

  • Reuse the existing WebhookIdempotencyRepository (same as Shopify)
  • Deduplicate by webhook_id from SimplyPrint payload
  • Key format: simplyprint/{event} (e.g., simplyprint/job.started)
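A minimal sketch of the deduplication check (the key prefix follows the ADR; how the key composes with webhook_id, and the in-memory store standing in for the database-backed WebhookIdempotencyRepository, are illustrative assumptions):

```typescript
function idempotencyKey(event: string): string {
  return `simplyprint/${event}`; // e.g. simplyprint/job.started
}

// In-memory stand-in for the database-backed idempotency store.
const seen = new Set<string>();

function shouldProcess(webhookId: string, event: string): boolean {
  const entry = `${idempotencyKey(event)}#${webhookId}`;
  if (seen.has(entry)) return false; // duplicate delivery: acknowledge but skip
  seen.add(entry);
  return true;
}
```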

Job Reconciliation:

  • Scheduled job runs every minute to check active print jobs
  • Query all print jobs with simplyPrintJobId in non-terminal states (QUEUED, ASSIGNED, PRINTING)
  • Compare local status with SimplyPrint's queue and printer states
  • Emit JOB_STATUS_CHANGED events for discrepancies

Rationale

  • Webhook idempotency: SimplyPrint may retry webhooks on timeout; duplicate events could cause issues
  • Existing polling was limited: Only detected PRINTING status via printer polling
  • History-based terminal state detection: If COMPLETED/FAILED/CANCELLED webhooks are missed, the reconciliation service queries SimplyPrint's print history API (GET /{id}/jobs/Get) after a 5-minute grace period to detect terminal states automatically
  • Hybrid approach: Webhooks for real-time + reconciliation for reliability (belt and suspenders)

Implementation

  • SimplyPrintService.handleWebhook(): Added idempotency check using WebhookIdempotencyRepository
  • SimplyPrintReconciliationService: New service with @Cron(EVERY_MINUTE) that reconciles job statuses
  • SimplyPrintReconciliationService.handleMissingJob(): Two-step lookup for missing jobs — getJob() (GetDetails) then getJobHistory() (history list) — with grace period (5 min), rate limiting (max 10/cycle), and escalation logging (30 min)
  • SimplyPrintApiClient.getJobHistory(): Queries the print history endpoint to find completed/failed/cancelled jobs no longer in the queue or on a printer
  • Direct Prisma access: Reconciliation uses PrismaService directly to avoid circular dependency with PrintJobsModule

SimplyPrint Job ID Resolution

SimplyPrint uses three different identifiers for the same logical job:

| Identifier | Source | Format | When available |
| --- | --- | --- | --- |
| Queue-item created_id | AddItem response | Integer (e.g. 385029) | At queue time |
| Job uid | Webhooks, GetDetails | UUID (e.g. da69d2a4-...) | After job starts |
| Job numeric id | Webhooks | Integer (e.g. 552252) | After job starts |

When a job is queued, we store created_id as both simplyPrintJobId (mutable) and simplyPrintQueueItemId (persistent). When the first job.started webhook arrives, simplyPrintJobId is updated to the job UID for fast future lookups, but simplyPrintQueueItemId is never overwritten.

Lookup chain in PrintJobsService.handleSimplyPrintStatusChange():

  1. Primary: Find by simplyPrintJobId = webhook job UID
  2. Fallback 1: Find by simplyPrintJobId = webhook numeric job ID
  3. Fallback 2: Call GetDetails API for the job's queued.id, find by simplyPrintJobId = queued.id
  4. Fallback 3: Find by simplyPrintQueueItemId = queued.id (handles re-queued jobs where simplyPrintJobId was already overwritten with the first job's UID)

Fallback 3 includes a safety check: before adopting the matched print job, it verifies that the new job UID is not already linked to another print job in the database (prevents accidentally hijacking a webhook for a different order's job).

Re-queue scenario: When SimplyPrint cancels a job and the operator clears the bed, SimplyPrint revives the same queue item and creates a new job with a different UID but the same queued.id. Fallback 3 matches via simplyPrintQueueItemId and adopts the cancelled print job, updating its simplyPrintJobId to the new UID.
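The lookup chain above can be condensed into one resolver. This is an illustrative sketch only — the real logic lives in PrintJobsService.handleSimplyPrintStatusChange(), fallback 2 involves a GetDetails API call, and the record shapes here are hypothetical:

```typescript
interface PrintJobRow {
  id: string;
  simplyPrintJobId: string | null;       // mutable: overwritten with the job UID
  simplyPrintQueueItemId: string | null; // persistent: never overwritten
}

function resolvePrintJob(
  jobs: PrintJobRow[],
  webhook: { jobUid: string; jobNumericId: string; queuedId: string | null },
): PrintJobRow | undefined {
  const byJobId = (v: string) => jobs.find((j) => j.simplyPrintJobId === v);

  // 1. Primary: job UID
  let match = byJobId(webhook.jobUid);
  // 2. Fallback 1: numeric job id
  if (!match) match = byJobId(webhook.jobNumericId);
  if (!match && webhook.queuedId) {
    // 3. Fallback 2: queue-item id still stored as simplyPrintJobId
    match = byJobId(webhook.queuedId);
    // 4. Fallback 3: persistent queue-item id (re-queued job whose
    //    simplyPrintJobId already holds the first job's UID). The real code
    //    also verifies the new UID isn't linked to another print job.
    if (!match) {
      match = jobs.find((j) => j.simplyPrintQueueItemId === webhook.queuedId);
    }
  }
  return match;
}
```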

Configuration

| Environment Variable | Default | Description |
| --- | --- | --- |
| SIMPLYPRINT_RECONCILIATION_ENABLED | true | Enable/disable scheduled reconciliation |

Consequences

Positive:

  • ✅ Prevents duplicate event processing from webhook retries
  • ✅ Catches missed PRINTING status changes via reconciliation
  • ✅ Uses existing idempotency infrastructure (no new tables)
  • ✅ Observable via event logs
  • ✅ Re-queued jobs are automatically matched back to their original print job via persistent simplyPrintQueueItemId

Negative:

  • ⚠️ Terminal states missed by webhooks are detected only after the 5-minute reconciliation grace period, so COMPLETED/FAILED updates can lag
  • ⚠️ Adds API calls to SimplyPrint every minute (rate limit aware)
  • ⚠️ Jobs "missing" from SimplyPrint are logged but not auto-updated (avoids incorrect state changes)

Alternatives Considered

| Alternative | Reason for Rejection |
| --- | --- |
| Extend existing polling for all states | Printer polling only surfaces in-progress jobs; terminal states require separate history queries, handled by reconciliation |
| Store last-seen status for comparison | Over-complicated; event emission on change is sufficient |
| Skip idempotency (rely on status check) | Status check is partial protection; true idempotency is safer |
References

  • ADR-033 (Database-Backed Webhook Idempotency)
  • ADR-040 (Shopify Order Backfill)
  • apps/api/src/simplyprint/simplyprint.service.ts
  • apps/api/src/simplyprint/simplyprint-reconciliation.service.ts

ADR-042: SendCloud Webhook Integration for Shipment Status Updates

| Attribute | Value |
| --- | --- |
| ID | ADR-042 |
| Status | Accepted |
| Date | 2026-01-22 |
| Context | Shipment statuses only updated at label creation; no visibility into transit/delivery state |

Decision

Implement SendCloud webhook receiver for real-time shipment status updates with HMAC-SHA256 signature verification and a reconciliation service for backfill.

Webhook Handling:

  • New endpoint: POST /webhooks/sendcloud
  • Verify Sendcloud-Signature header using HMAC-SHA256
  • Process parcel_status_changed events
  • Database-backed idempotency using existing infrastructure
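A minimal sketch of the signature check performed by the webhook guard, assuming the Sendcloud-Signature header carries a hex-encoded HMAC-SHA256 of the raw request body (verify the exact encoding against SendCloud's documentation before relying on this):

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

function verifySendcloudSignature(rawBody: string, signatureHeader: string, secret: string): boolean {
  const expected = createHmac('sha256', secret).update(rawBody, 'utf8').digest('hex');
  const a = Buffer.from(expected, 'utf8');
  const b = Buffer.from(signatureHeader, 'utf8');
  // timingSafeEqual throws on length mismatch, so guard first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

Verification must run against the raw body bytes, before JSON parsing, since any re-serialization would change the digest.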

Status Mapping:

| SendCloud Status ID | ShipmentStatus |
| --- | --- |
| 1-10 | LABEL_CREATED |
| 11-99 | ANNOUNCED |
| 1000-1098 | IN_TRANSIT |
| 1100-1199 | CANCELLED |
| 1999, 2001+ | FAILED |
| 2000 | DELIVERED |
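The mapping table translates directly into range checks. A sketch (the function name is illustrative; the real mapping lives in SendcloudWebhookService):

```typescript
type ShipmentStatus =
  | 'LABEL_CREATED' | 'ANNOUNCED' | 'IN_TRANSIT'
  | 'CANCELLED' | 'FAILED' | 'DELIVERED' | 'UNKNOWN';

function mapSendcloudStatus(statusId: number): ShipmentStatus {
  if (statusId >= 1 && statusId <= 10) return 'LABEL_CREATED';
  if (statusId >= 11 && statusId <= 99) return 'ANNOUNCED';
  if (statusId >= 1000 && statusId <= 1098) return 'IN_TRANSIT';
  if (statusId >= 1100 && statusId <= 1199) return 'CANCELLED';
  if (statusId === 2000) return 'DELIVERED';          // checked before the FAILED range
  if (statusId === 1999 || statusId >= 2001) return 'FAILED';
  return 'UNKNOWN'; // IDs outside the documented ranges
}
```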

Reconciliation:

  • Scheduled job runs every 5 minutes
  • Polls SendCloud API (getParcel) for active shipments
  • Updates status for any discrepancies found

Rationale

  • Customer visibility: Users need to see when shipments are in transit, delivered, or failed
  • Operational awareness: Operators need to know if shipments encounter problems. The shipmentStatus field on every order API response enables shipping status badges and dedicated shipping filters (Ready to Ship, In Transit, Delivered, Shipping Issues) on the orders list page.
  • Existing UI ready: ShippingInfo component already displays all statuses with color-coded badges. The orders list now shows a shipping badge (with truck icon) alongside the order status badge.
  • Webhook reliability: SendCloud may retry webhooks; idempotency prevents duplicate processing

Implementation

  • SendcloudWebhookGuard: Verifies HMAC-SHA256 signature
  • SendcloudWebhookService: Processes status changes, maps statuses, updates shipments
  • SendcloudReconciliationService: Polls SendCloud API every 5 minutes for active shipments
  • IShipmentsService.findBySendcloudParcelId(): Added to interface for parcel ID lookups
  • OrderResponseDto.shipmentStatus: Every order API response includes the associated shipment status (null if no shipment exists). The OrderQueryDto supports shipmentStatus filter (by exact status) and readyToShip boolean filter (completed orders with PENDING/LABEL_CREATED/ANNOUNCED shipments).
  • ShipmentStatus enum: Shared via @forma3d/domain (libs/domain/src/enums/shipment-status.ts) for use across backend and frontend.

Configuration

| Environment Variable | Default | Description |
| --- | --- | --- |
| SENDCLOUD_WEBHOOK_SECRET | - | HMAC secret for signature verification (same as API Secret Key) |
| SENDCLOUD_RECONCILIATION_ENABLED | true | Enable/disable scheduled reconciliation |

Consequences

Positive:

  • ✅ Real-time shipment status updates in UI
  • ✅ Automatic detection of delivered/failed shipments
  • ✅ Backfill for missed webhooks via reconciliation
  • ✅ Uses existing idempotency infrastructure
  • ✅ Shipping status surfaced on orders list via shipmentStatus field — operators can filter by shipping status (Ready to Ship, In Transit, Delivered, Shipping Issues) and see badges on each order row

Negative:

  • ⚠️ Requires webhook configuration in SendCloud panel
  • ⚠️ Additional API calls for reconciliation (rate limit aware)
  • ⚠️ Webhook secret must be configured for production security
References

  • ADR-033 (Database-Backed Webhook Idempotency)
  • ADR-041 (SimplyPrint Webhook Idempotency)
  • apps/api/src/sendcloud/sendcloud-webhook.service.ts
  • apps/api/src/sendcloud/sendcloud-reconciliation.service.ts

ADR-043: PWA Version Mismatch Detection

| Attribute | Value |
| --- | --- |
| ID | ADR-043 |
| Status | Accepted |
| Date | 2026-01-23 |
| Context | Users may run outdated PWA versions if they dismiss the update prompt or if the service worker hasn't yet detected updates |

Decision

Implement automatic version mismatch detection on the Settings page that compares the cached PWA version against the server version and triggers the service worker update prompt when they differ.

Rationale

Problem Statement

The PWA displays the frontend version in two places:

  1. Settings page - Shows version from cached /build-info.json
  2. Sidebar footer - Shows version from cached /build-info.json

When a new version is deployed:

  • The service worker checks for updates hourly
  • Users may have dismissed the "Update now" prompt
  • The cached version can become stale

Users visiting the Settings page to check version information should be prompted to update if running an outdated version.

Solution

When the user navigates to the Settings page:

  1. Fetch /build-info.json from the server with cache-busting headers
  2. Compare the server version against the cached PWA version
  3. If versions differ, call registration.update() on the service worker
  4. This triggers the "New version available!" prompt

Implementation

New Components

| Component | Path | Description |
| --- | --- | --- |
| ServiceWorkerContext | apps/web/src/contexts/service-worker-context.tsx | Centralized SW state management, exposes checkForUpdates() |
| useServerVersion | apps/web/src/hooks/use-server-version.ts | Fetches fresh version with cache-busting |
| useVersionMismatchCheck | apps/web/src/hooks/use-version-mismatch-check.ts | Compares versions, triggers update on mismatch |

Architecture

User visits Settings page
        │
        ▼
useVersionMismatchCheck({ checkOnMount: true })
        │
        ▼
fetch('/build-info.json?_=timestamp', { cache: 'no-store' })
        │
        ▼
Compare serverVersion vs cachedVersion
        │
        ▼ (if different)
serviceWorkerContext.checkForUpdates()
        │
        ▼
registration.update() detects new SW
        │
        ▼
needRefresh = true → Update prompt shown

Cache-Busting Strategy

const response = await fetch(`/build-info.json?_=${Date.now()}`, {
  cache: 'no-store',
  headers: {
    'Cache-Control': 'no-cache, no-store, must-revalidate',
    Pragma: 'no-cache',
  },
});

Consequences

Positive:

  • ✅ Users are reliably prompted to update when viewing version info
  • ✅ Works even if previous update prompt was dismissed
  • ✅ No polling overhead - only checks when user visits Settings
  • ✅ Centralized service worker state via React Context
  • ✅ Reusable hooks for future version-aware features

Negative:

  • ⚠️ Extra network request on Settings page load
  • ⚠️ Relies on service worker being registered
References

  • ADR-035 (Progressive Web App for Cross-Platform Access)
  • ADR-036 (localStorage Fallback for PWA Install Detection)
  • apps/web/src/pwa/sw-update-prompt.tsx
  • apps/web/src/pages/settings/index.tsx

ADR-044: Role-Based Access Control and Tenant-Ready Architecture

Attribute Value
ID ADR-044
Status Accepted
Date 2026-01-24
Context Need to implement multi-user authentication with role-based access control, while preparing for future multi-tenancy

Decision

Implement in-app RBAC and tenant-ready data isolation without external identity providers (no Keycloak/OpenID Connect yet).

Key Decisions

  1. Session-Based Authentication
     • HTTP-only cookies with express-session and PostgreSQL session store
     • Argon2id password hashing with automatic rehashing
     • Legacy API key authentication preserved for backward compatibility
  2. Permission-Based Authorization
     • Permissions are string constants (e.g., orders.read, orders.write)
     • Roles are named bundles of permissions (e.g., admin, operator, viewer)
     • Users can have multiple roles; effective permissions = union of all role permissions
     • Server-side enforcement via NestJS guards (SessionGuard, PermissionsGuard)
  3. Tenant-Ready Data Model
     • All tenant-owned entities include tenantId foreign key
     • Repositories enforce tenant scoping in all queries
     • Single default tenant (00000000-0000-0000-0000-000000000001) for current operations
     • Architecture supports future multi-tenant expansion
  4. Security Auditing
     • AuditLog table captures security-relevant actions
     • Actor identity and tenant context attached to Sentry error reports
     • No logging of sensitive data (passwords, tokens, API keys)
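The effective-permission rule (union of all role permissions) can be sketched as follows. Role names and permission strings below are illustrative examples, not the seeded defaults:

```typescript
const rolePermissions: Record<string, string[]> = {
  operator: ['orders.read', 'orders.write', 'printjobs.read'],
  viewer: ['orders.read', 'printjobs.read'],
};

// Effective permissions = union of all role permission sets.
function effectivePermissions(roles: string[]): Set<string> {
  return new Set(roles.flatMap((role) => rolePermissions[role] ?? []));
}

// The server-side check a guard like PermissionsGuard performs per route.
function hasPermissions(userRoles: string[], required: string[]): boolean {
  const granted = effectivePermissions(userRoles);
  return required.every((p) => granted.has(p));
}
```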

Database Schema Additions

-- Core RBAC tables
Tenant, User, Role, Permission, UserRole, RolePermission, Session, AuditLog

-- Tenant scoping on all existing tables
Order, LineItem, PrintJob, ProductMapping, AssemblyPart, Shipment, EventLog, etc.

Implementation Details

Backend (NestJS)

  • SessionGuard: Global guard that validates sessions or falls back to API key
  • PermissionsGuard: Route-level guard that checks required permissions
  • @CurrentUser(): Decorator to inject authenticated user into controllers
  • @RequirePermissions(): Decorator to specify required permissions
  • @Public(): Decorator to mark routes as public (bypass authentication)
  • TenantContextService: Request-scoped service providing tenant context
  • AuditService: Centralized security audit logging

Frontend (React)

  • AuthContext: Provides user state, login/logout, permission checks
  • usePermissions(): Hook for permission-based UI rendering
  • ProtectedRoute: Redirects unauthenticated users to login
  • PermissionGatedRoute: Hides routes based on permissions

User Management UI

  • Location: Settings page → Administration section (visible to users with users.read permission)
  • Route: /admin/users (requires users.read permission to access)
  • Components:
    • UserFormModal: Create/edit users with email, password, and role selection
    • ChangePasswordModal: Change password for existing users
    • UsersPage: User list with search, filtering, and CRUD operations
  • Features:
    • Create new users with email, password, and role assignment
    • Edit existing user email and roles
    • Change user passwords (separate modal for security)
    • Deactivate/reactivate users (soft delete pattern)
    • Role selection with visual indicators for selected roles
    • Permission-gated UI (actions hidden if user lacks users.write)

Default Roles

| Role | Description | Permissions |
| --- | --- | --- |
| admin | Full system access | All permissions |
| operator | Day-to-day operations | Orders, print jobs, mappings, shipments, logs (read/write) |
| viewer | Read-only access | View-only access to operational data |
| legacy-admin | API key compatibility | All permissions (deprecated) |

Consequences

Positive:

  • ✅ Multiple users can sign in with different access levels
  • ✅ Server-side permission enforcement (not UI-only security)
  • ✅ Audit trail for security-relevant actions
  • ✅ Architecture ready for future multi-tenancy
  • ✅ Backward compatibility with existing API key integrations
  • ✅ Sentry error reports enriched with user/tenant context

Negative:

  • ⚠️ Session management adds infrastructure complexity
  • ⚠️ All repositories needed updates for tenant scoping
  • ⚠️ Coverage thresholds temporarily lowered for new modules

Migration Path

  1. Run Prisma migration to add RBAC and tenant tables
  2. Run seed script to create default tenant, roles, permissions, and admin user
  3. Existing data migrated to default tenant
  4. Legacy API key authentication continues to work during transition
References

  • ADR-024 (API Key Authentication for Admin Endpoints)
  • ADR-029 (API Key Authentication for Dashboard)
  • apps/api/src/auth/ module
  • apps/api/src/audit/ module
  • apps/api/src/tenancy/ module
  • apps/api/src/users/ module

ADR-045: pgAdmin for Staging Database Administration

| Attribute | Value |
| --- | --- |
| ID | ADR-045 |
| Status | Accepted |
| Date | 2026-01-24 |
| Context | Need a web-based interface to inspect, query, and manage the PostgreSQL staging database |

Decision

Deploy pgAdmin 4 as a Docker container in the staging environment, exposed via Traefik with TLS.

Rationale

  • Official PostgreSQL tool: pgAdmin is the official GUI administration tool for PostgreSQL
  • Web-based access: No need to install desktop software or configure VPN/SSH tunnels
  • Full SQL capabilities: Execute queries, view data, manage schemas, backup/restore
  • Secure access: TLS via Let's Encrypt, separate credentials from database credentials
  • No database exposure: Database remains inaccessible from the internet; pgAdmin connects internally via Docker network

Implementation

| Component | Value |
| --- | --- |
| Container Image | dpage/pgadmin4:latest |
| Subdomain | staging-connect-db.forma3d.be |
| Docker Network | forma3d-network (internal) |
| Data Persistence | pgadmin-data Docker volume |
| TLS Certificate | Auto-provisioned via Let's Encrypt |

Environment Variables

| Variable | Description | Secret? |
| --- | --- | --- |
| PGADMIN_DEFAULT_EMAIL | Login email for pgAdmin | No |
| PGADMIN_DEFAULT_PASSWORD | Login password for pgAdmin | Yes |

Usage

  1. Navigate to https://staging-connect-db.forma3d.be
  2. Log in with PGADMIN_DEFAULT_EMAIL and PGADMIN_DEFAULT_PASSWORD
  3. Add a new server connection:
     • Name: Forma3D Staging
     • Host: Database hostname from DATABASE_URL (DigitalOcean managed PostgreSQL hostname)
     • Port: 25060 (DigitalOcean managed PostgreSQL port)
     • Database: defaultdb (or your database name)
     • Username/Password: From DATABASE_URL
     • SSL Mode: Require (set in Connection > SSL tab)

Security Considerations

  • pgAdmin credentials are separate from database credentials
  • Database credentials are entered manually in pgAdmin (not stored in environment)
  • Enhanced cookie protection enabled
  • Access is restricted to those who know the pgAdmin login credentials
  • TLS encrypts all traffic

Consequences

Positive:

  • ✅ Easy database inspection without SSH access
  • ✅ Web-based access from any device
  • ✅ Full SQL query capabilities
  • ✅ Visual schema exploration
  • ✅ Data export/import capabilities

Negative:

  • ⚠️ Additional attack surface (mitigated by strong password + TLS)
  • ⚠️ Resource overhead (minimal - pgAdmin is lightweight)
  • ⚠️ Users must manually configure the database server connection
References

  • deployment/staging/docker-compose.yml
  • deployment/staging/env.staging.template
  • docs/05-deployment/staging-deployment-guide.md

ADR-046: PostgreSQL Session Store for Persistent Authentication

| Attribute | Value |
| --- | --- |
| ID | ADR-046 |
| Status | Accepted |
| Date | 2026-01-26 |
| Context | User sessions were lost on server restarts, causing frequent re-authentication during deployments |

Decision

Replace the default in-memory session store with PostgreSQL-backed sessions using connect-pg-simple, and extend session duration from 24 hours to 7 days.

Rationale

The default express-session in-memory store has critical limitations:

| Problem | Impact | Solution |
| --- | --- | --- |
| Sessions lost on restart | Users logged out during every deployment | PostgreSQL persistence |
| No session sharing | Cannot scale to multiple API instances | Shared database store |
| Short session duration | Users had to re-login frequently | Extended to 7 days |
| Memory consumption | Sessions consume server RAM | Offloaded to database |

Implementation

Package: connect-pg-simple with @types/connect-pg-simple

Migration: prisma/migrations/20260126000000_add_session_store/migration.sql

CREATE TABLE "session" (
  "sid" varchar NOT NULL COLLATE "default",
  "sess" json NOT NULL,
  "expire" timestamp(6) NOT NULL
);
ALTER TABLE "session" ADD CONSTRAINT "session_pkey" PRIMARY KEY ("sid");
CREATE INDEX "IDX_session_expire" ON "session" ("expire");

Configuration: apps/api/src/main.ts

| Setting | Value | Description |
| --- | --- | --- |
| store | PgSession | PostgreSQL-backed session store |
| tableName | session | Database table for sessions |
| pruneSessionInterval | 3600 (1 hour) | Expired session cleanup interval |
| maxAge | 7 days (configurable) | Session cookie lifetime |
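The settings above wire together roughly as follows. This is a hedged fragment of the bootstrap (the real code is in apps/api/src/main.ts; `app` here stands for the Nest application's underlying Express instance, and the exact option values are illustrative):

```typescript
import session from 'express-session';
import connectPgSimple from 'connect-pg-simple';

const PgSession = connectPgSimple(session);
const maxAgeDays = Number(process.env.SESSION_MAX_AGE_DAYS ?? '7');

app.use(
  session({
    store: new PgSession({
      conString: process.env.DATABASE_URL, // reuses the existing PostgreSQL instance
      tableName: 'session',                // table created by the migration above
      pruneSessionInterval: 3600,          // seconds between expired-session sweeps
    }),
    secret: process.env.SESSION_SECRET!,   // required: signing key for session cookies
    resave: false,
    saveUninitialized: false,
    cookie: {
      httpOnly: true,
      secure: process.env.NODE_ENV === 'production',
      maxAge: maxAgeDays * 24 * 60 * 60 * 1000, // 7 days by default
    },
  }),
);
```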

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| SESSION_SECRET | (required) | Secret key for signing session cookies |
| SESSION_MAX_AGE_DAYS | 7 | Session duration in days |

Session Lifecycle

User Login → Session created in PostgreSQL → Cookie sent to browser
            ↓
Browser Request → Cookie validated → Session loaded from PostgreSQL
            ↓
Session Expires → Pruned by hourly cleanup job

Consequences

Positive:

  • ✅ Sessions survive server restarts and deployments
  • ✅ Sessions shared across multiple API instances (horizontal scaling ready)
  • ✅ 7-day sessions reduce login friction for users
  • ✅ Automatic cleanup of expired sessions (no manual maintenance)
  • ✅ No additional infrastructure (uses existing PostgreSQL)

Negative:

  • ⚠️ Slight latency increase for session lookups (negligible with connection pooling)
  • ⚠️ Database storage for sessions (minimal - each session ~1-2KB)
  • ⚠️ Migration required on existing deployments

References

  • apps/api/src/main.ts - Session configuration
  • prisma/migrations/20260126000000_add_session_store/migration.sql - Database schema
  • .env.example - Environment variable documentation
  • deployment/staging/docker-compose.yml - Container configuration

ADR-047: Three-Tier Logging Strategy (Application + Business Events + Sentry Logs)

Attribute Value
ID ADR-047
Status Superseded by ADR-058 (Sentry Logs tier replaced by ClickHouse + Grafana)
Date 2026-01-27
Context Need comprehensive observability with different log types for debugging vs. compliance/audit

Decision

Implement a three-tier logging strategy that separates application logs, business event logs, and security audit logs, with Sentry Logs layered on top for centralized visibility:

Tier Storage Purpose Examples
Application Logs Pino (stdout) Debugging, performance HTTP requests, service calls
Business Events PostgreSQL (EventLog) + Sentry Logs Business audit trail Order created, shipment status changed
Security Audit PostgreSQL (AuditLog) + Sentry Logs Compliance, security Login success/failure, permission denied
Sentry Logs Sentry (cloud) Centralized visibility All business + audit events in one place

Rationale

Different log types serve different purposes:

Concern Application Logs Business Events Audit Logs Sentry Logs
Retention Short (days/weeks) Long (months/years) Regulatory (years) 30 days (configurable)
Query needs Full-text search Structured filtering Compliance reporting Real-time search
Access control DevOps/Developers Business users Administrators only DevOps team
Storage cost High volume, low cost Moderate volume Low volume, high value Included in Sentry plan

Why Sentry Logs?

  • Single pane of glass: View errors, traces, and logs in one place
  • No additional tooling: Already using Sentry for error tracking
  • Structured attributes: Filter by orderId, eventType, userId, etc.
  • Real-time: Logs appear immediately for debugging
  • Cost-effective: Included in existing Sentry subscription (with limits)

Implementation

Application Logging (Pino via nestjs-pino):

  • Configured in apps/api/src/observability/observability.module.ts
  • Environment-based formatting (pretty dev, JSON prod)
  • Automatic request/response logging via interceptors
  • Redacts sensitive fields (passwords, tokens, cookies)

Business Event Logging (EventLogService):

  • Stored in EventLog PostgreSQL table
  • Structured metadata with orderId, printJobId associations
  • Severity levels: INFO, WARNING, ERROR
  • Triple output: Database + Application logger + Sentry Logs

Security Audit Logging (AuditService):

  • Stored in AuditLog PostgreSQL table
  • Captures actor, action, target, IP address, user agent
  • Tenant-scoped for multi-tenancy support
  • Admin-only access via audit.read permission
  • Also sent to Sentry Logs for real-time visibility

Sentry Logs Integration (SentryLoggerService):

  • Wrapper around Sentry.logger API
  • Centralized service in apps/api/src/observability/services/sentry-logger.service.ts
  • Structured attributes for filtering (eventType, orderId, userId, etc.)
  • Automatic integration with EventLogService and AuditService
  • View in Sentry: Explore > Logs

Event Types

Business Events (EventLog):

Event Type Severity Trigger
order.created INFO Shopify webhook creates order
order.status_changed INFO Order status transition
order.cancelled WARNING Order cancellation
printjob.created INFO Print job created in SimplyPrint
printjob.status_changed INFO/ERROR SimplyPrint status update
shipment.created INFO Shipment record created
shipment.status_changed INFO/WARNING Sendcloud status update
shipment.tracking_updated INFO Tracking number assigned
shipment.label_generated INFO Shipping label created
shipment.cancelled WARNING Shipment cancellation
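
The default severities above can be encoded as a small lookup; a sketch (names are illustrative, and types listed with two levels, such as printjob.status_changed, default to INFO here and are escalated by the caller at emit time):

```typescript
type Severity = "INFO" | "WARNING" | "ERROR";

// Default severity per business event type (from the table above).
// Dual-severity types (printjob.status_changed, shipment.status_changed)
// default to INFO; the emitter escalates based on the actual status.
const defaultSeverity: Record<string, Severity> = {
  "order.created": "INFO",
  "order.status_changed": "INFO",
  "order.cancelled": "WARNING",
  "printjob.created": "INFO",
  "printjob.status_changed": "INFO",
  "shipment.created": "INFO",
  "shipment.status_changed": "INFO",
  "shipment.tracking_updated": "INFO",
  "shipment.label_generated": "INFO",
  "shipment.cancelled": "WARNING",
};

function severityFor(eventType: string): Severity {
  // Unknown event types fall back to INFO rather than failing the log call
  return defaultSeverity[eventType] ?? "INFO";
}
```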

Audit Events (AuditLog):

Action Success Trigger
auth.login.success true User successfully logged in
auth.login.failure false Invalid credentials
auth.logout true User logged out
permission.denied false Access denied to protected resource
user.created true New user account created
user.updated true User profile updated
password.changed true Password changed

API Endpoints

Endpoint Permission Purpose
GET /api/v1/logs logs.read View business event logs
GET /api/v1/audit-logs audit.read View security audit logs (Admin only)

UI Access

  • Activity Logs: Sidebar → Activity Logs
  • Audit Logs: Settings → Administration → Audit Logs

Consequences

Positive:

  • ✅ Clear separation of concerns (debugging vs. compliance vs. security)
  • ✅ Business events queryable by order/print job for troubleshooting
  • ✅ Audit logs provide compliance trail for security reviews
  • ✅ Structured metadata enables powerful filtering
  • ✅ Admin-only audit access protects sensitive security data

Negative:

  • ⚠️ Three separate log stores to maintain
  • ⚠️ Potential for inconsistency if logging calls are missed
  • ⚠️ Database storage for events (mitigated by pruning/archival)

References

  • apps/api/src/observability/observability.module.ts - Pino configuration
  • apps/api/src/observability/services/sentry-logger.service.ts - Sentry Logs integration
  • apps/api/src/event-log/event-log.service.ts - Business event logging
  • apps/api/src/audit/audit.service.ts - Security audit logging
  • apps/api/src/audit/audit.controller.ts - Audit logs API endpoint
  • apps/web/src/pages/admin/audit-logs/index.tsx - Audit logs UI
  • Sentry Dashboard: Explore > Logs

ADR-048: Shopify OAuth 2.0 Authentication

Attribute Value
ID ADR-048
Status Implemented
Date 2026-01-28
Context Shopify deprecated legacy custom apps for merchants (January 1, 2026). New merchant stores require OAuth-authenticated apps.

Decision

Implement Shopify OAuth 2.0 Authorization Code Grant flow for app installation and authentication, replacing the static access token approach.

Rationale

  • Production requirement: As of January 2026, Shopify merchants can only install OAuth-authenticated apps
  • Multi-shop support: OAuth enables connecting multiple shops per tenant
  • Token refresh: Offline access tokens have 90-day expiry with refresh capability
  • Security: Tokens encrypted at rest using AES-256-GCM
  • Backward compatibility: Legacy static token mode preserved for development/testing

Implementation

Database Schema:

model ShopifyShop {
  id           String   @id @default(uuid())
  tenantId     String
  shopDomain   String   // e.g., "example.myshopify.com"
  accessToken  String   // Encrypted OAuth access token
  tokenType    String   @default("offline")
  scopes       String[]
  expiresAt    DateTime?
  refreshToken String?
  installedAt  DateTime @default(now())
  uninstalledAt DateTime?
  isActive     Boolean  @default(true)

  tenant Tenant @relation(...)
  @@unique([tenantId, shopDomain])
}

OAuth Flow Endpoints:

Endpoint Purpose
GET /shopify/oauth/authorize?shop=xxx Initiate OAuth flow, redirect to Shopify consent
GET /shopify/oauth/callback Exchange authorization code for token
POST /shopify/oauth/uninstall Handle app uninstallation webhook
GET /shopify/oauth/shops List connected shops for tenant
DELETE /shopify/oauth/shops/:domain Disconnect a shop
GET /shopify/oauth/status Check OAuth/legacy configuration status

Security Measures:

  • HMAC verification on all OAuth callbacks (timing-safe comparison)
  • State parameter with cryptographic nonce (CSRF protection)
  • Token encryption at rest (AES-256-GCM with unique IV per token)
  • Automatic token refresh 24 hours before expiry
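
The timing-safe HMAC check on OAuth callbacks can be sketched with Node's crypto module (simplified; the helper name is illustrative, and real Shopify callbacks may additionally require URL decoding and array-parameter handling):

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// `query` is the parsed callback query string; `secret` is the app's
// client secret (SHOPIFY_API_SECRET).
function verifyShopifyHmac(query: Record<string, string>, secret: string): boolean {
  const { hmac, ...rest } = query;
  if (!hmac) return false;
  // Shopify signs the remaining parameters sorted by key, joined as key=value&...
  const message = Object.keys(rest)
    .sort()
    .map((k) => `${k}=${rest[k]}`)
    .join("&");
  const digest = createHmac("sha256", secret).update(message).digest("hex");
  const expected = Buffer.from(digest, "hex");
  const received = Buffer.from(hmac, "hex");
  // timingSafeEqual throws on length mismatch, so guard first
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```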

Configuration:

# OAuth mode (required for production)
SHOPIFY_API_KEY=<client-id>
SHOPIFY_API_SECRET=<client-secret>
SHOPIFY_APP_URL=https://connect-api.forma3d.be
SHOPIFY_SCOPES=read_orders,write_orders,read_products,write_products,read_fulfillments,write_fulfillments,read_inventory,read_merchant_managed_fulfillment_orders,write_merchant_managed_fulfillment_orders
SHOPIFY_TOKEN_ENCRYPTION_KEY=<64-hex-char-key>

# Legacy mode (optional, for development)
SHOPIFY_SHOP_DOMAIN=forma3d-dev.myshopify.com
SHOPIFY_ACCESS_TOKEN=shpat_xxx
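
The at-rest token encryption described under Security Measures might be sketched as follows (illustrative helper names; `key` is the 32-byte buffer decoded from the 64-hex-character SHOPIFY_TOKEN_ENCRYPTION_KEY):

```typescript
import { randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

// Hypothetical helpers illustrating AES-256-GCM with a unique IV per token.
function encryptToken(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // unique IV for every encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  // Store IV and auth tag alongside the ciphertext
  return [iv, tag, ciphertext].map((b) => b.toString("hex")).join(":");
}

function decryptToken(stored: string, key: Buffer): string {
  const [iv, tag, ciphertext] = stored.split(":").map((h) => Buffer.from(h, "hex"));
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag); // GCM authenticates as well as decrypts
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```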

API Client Modes:

The ShopifyApiClient supports both modes:

// Legacy mode (static token)
await shopifyClient.createFulfillment(orderId, data);

// OAuth mode (per-shop token)
await shopifyClient.createFulfillmentForShop(tenantId, shopDomain, orderId, data);

Consequences

Positive:

  • ✅ Production-ready for merchant app installations
  • ✅ Multi-shop support enables B2B scenarios
  • ✅ Automatic token refresh prevents authentication failures
  • ✅ Encrypted tokens protect against database leaks
  • ✅ Backward compatible - existing deployments continue working

Negative:

  • ⚠️ Additional complexity for OAuth flow
  • ⚠️ Requires additional environment variables for production
  • ⚠️ Token refresh failures need monitoring

References

  • apps/api/src/shopify/shopify-oauth.controller.ts - OAuth endpoints
  • apps/api/src/shopify/shopify-oauth.service.ts - OAuth flow logic
  • apps/api/src/shopify/shopify-token.service.ts - Token management/encryption
  • apps/api/src/shopify/shopify-shop.repository.ts - Database access
  • prisma/migrations/20260128000000_add_shopify_oauth/ - Database migration
  • Shopify OAuth Documentation

Document History

Version Date Author Changes
1.0 2026-01-10 AI Assistant Initial ADR document with 13 decisions
1.1 2026-01-10 AI Assistant Updated ADR-006 for Digital Ocean hosting, added ADR-014 for SimplyPrint
1.2 2026-01-10 AI Assistant Added ADR-015 for Aikido Security Platform
1.3 2026-01-10 AI Assistant Added ADR-016 for Sentry Observability with OpenTelemetry
1.4 2026-01-10 AI Assistant Marked ADR-016 as implemented, added implementation details
1.5 2026-01-10 AI Assistant Added ADR-017 for Docker + Traefik Deployment Strategy
1.6 2026-01-11 AI Assistant Added ADR-018 for Nx Affected Conditional Deployment Strategy
1.7 2026-01-13 AI Assistant Phase 2 updates: Updated ADR-008 with implemented events, added ADR-019 (SimplyPrint Webhook Verification), ADR-020 (Hybrid Status Monitoring)
1.8 2026-01-14 AI Assistant Phase 3 updates: Added ADR-021 (Retry Queue), ADR-022 (Event-Driven Fulfillment), ADR-023 (Email Notifications)
1.9 2026-01-14 AI Assistant Security update: Added ADR-024 (API Key Authentication for Admin Endpoints)
2.0 2026-01-14 AI Assistant Supply chain security: Added ADR-025 (Cosign Image Signing)
2.1 2026-01-14 AI Assistant Phase 4 updates: Added ADR-027 (TanStack Query), ADR-028 (Socket.IO Real-Time), ADR-029 (Dashboard Authentication)
2.2 2026-01-16 AI Assistant SBOM attestations: Added ADR-026 (CycloneDX SBOM Attestations with Syft)
2.3 2026-01-16 AI Assistant Phase 5 updates: Added ADR-030 (Sendcloud for Shipping Integration)
2.4 2026-01-16 AI Assistant Registry cleanup: Added ADR-031 (Automated Container Registry Cleanup)
2.5 2026-01-17 AI Assistant Domain boundary separation: Added ADR-032 (Domain Boundary Separation with Interface Contracts)
2.6 2026-01-17 AI Assistant Critical tech debt resolution: Added ADR-033 (Database-Backed Webhook Idempotency)
2.7 2026-01-19 AI Assistant Infrastructure hardening: Added ADR-034 (Docker Log Rotation & Resource Cleanup)
2.8 2026-01-19 AI Assistant Cross-platform strategy: Added ADR-035 (PWA replaces Tauri/Capacitor native apps)
2.9 2026-01-20 AI Assistant PWA detection: Added ADR-036 (localStorage Fallback for PWA Install Detection)
3.0 2026-01-20 AI Assistant Documentation: Added ADR-037 (Keep a Changelog for Release Documentation)
3.1 2026-01-21 AI Assistant Documentation: Added ADR-038 (Zensical for publishing project documentation from docs/)
3.2 2026-01-22 AI Assistant Resilience: Added ADR-040 (Shopify Order Backfill for Downtime Recovery)
3.3 2026-01-22 AI Assistant Resilience: Added ADR-041 (SimplyPrint Webhook Idempotency and Job Reconciliation)
3.4 2026-01-22 AI Assistant Feature: Added ADR-042 (SendCloud Webhook Integration for Shipment Status Updates)
3.5 2026-01-23 AI Assistant PWA enhancement: Added ADR-043 (PWA Version Mismatch Detection on Settings page)
3.6 2026-01-24 AI Assistant Security: Added ADR-044 (Role-Based Access Control and Tenant-Ready Architecture)
3.7 2026-01-25 AI Assistant Feature: Updated ADR-044 with User Management UI implementation details
3.8 2026-01-25 AI Assistant Documentation: Updated ADR-008 with complete event catalog (Order, PrintJob, Orchestration, SimplyPrint, Shipment, SendCloud, Fulfillment events); all ADRs have Status field indicating implementation
3.9 2026-01-24 AI Assistant Infrastructure: Added ADR-045 (pgAdmin for Staging Database Administration)
4.0 2026-01-26 AI Assistant Session management: Added ADR-046 (PostgreSQL Session Store for Persistent Authentication)
4.1 2026-01-27 AI Assistant Observability: Added ADR-047 (Three-Tier Logging Strategy with Application, Business, and Audit logs)
4.2 2026-01-28 AI Assistant Authentication: Added ADR-048 (Shopify OAuth 2.0 Authentication for production merchant stores)
4.3 2026-02-07 AI Assistant Data model: Added ADR-049 (Optional SKU with Shopify Product/Variant ID Matching Priority)

ADR-049: Optional SKU with Shopify Product/Variant ID Matching Priority

Attribute Value
ID ADR-049
Status Implemented
Date 2026-02-07
Context Shopify product variants may have null/empty SKUs. Merchants do not always configure SKUs, making SKU-only matching unreliable for linking incoming orders to product mappings.

Decision

Make the sku field optional on the ProductMapping model and implement a product/variant ID-first matching strategy with SKU as fallback.

Rationale

  • Shopify reality: SKU is optional on Shopify variants — many merchants leave it empty
  • Reliable matching: Shopify Product ID and Variant ID are always present on order line items and are immutable identifiers
  • Backward compatible: Existing mappings with SKUs continue to work via the fallback path
  • No data loss: SKU remains as a display/search field when available; PostgreSQL treats multiple NULL SKUs as distinct for the unique constraint

Matching Priority

  1. Exact match: shopifyProductId + shopifyVariantId → specific variant mapping
  2. Product-level catch-all: shopifyProductId only (variant ID is null on the mapping) → applies to all variants of that product
  3. SKU fallback: sku match → legacy path for backward compatibility
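
The priority order can be sketched as a pure function (illustrative shapes; the real logic lives in ProductMappingsService.findMappingForLineItem and runs against the database):

```typescript
// Illustrative types; the real models live in the Prisma schema.
interface ProductMapping {
  sku: string | null;
  shopifyProductId: string | null;
  shopifyVariantId: string | null;
}
interface MatchableLineItem {
  sku: string | null;
  shopifyProductId: string | null;
  shopifyVariantId: string | null;
}

function findMappingForLineItem(
  item: MatchableLineItem,
  mappings: ProductMapping[],
): ProductMapping | undefined {
  // 1. Exact match: product ID + variant ID
  const exact = mappings.find(
    (m) =>
      item.shopifyProductId != null &&
      item.shopifyVariantId != null &&
      m.shopifyProductId === item.shopifyProductId &&
      m.shopifyVariantId === item.shopifyVariantId,
  );
  if (exact) return exact;
  // 2. Product-level catch-all: mapping has no variant ID
  const catchAll = mappings.find(
    (m) =>
      item.shopifyProductId != null &&
      m.shopifyProductId === item.shopifyProductId &&
      m.shopifyVariantId == null,
  );
  if (catchAll) return catchAll;
  // 3. Legacy SKU fallback
  return mappings.find((m) => item.sku != null && m.sku === item.sku);
}
```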

Implementation

Schema Changes:

model ProductMapping {
  sku  String?  // Was: String — now nullable
  // shopifyProductId and shopifyVariantId remain as before
}

model LineItem {
  shopifyProductId  String?  // NEW — stored from webhook payload
  shopifyVariantId  String?  // NEW — stored from webhook payload
  @@index([shopifyProductId])
}

Service Changes:

  • ProductMappingsRepository.findByShopifyProduct(productId, variantId?) — new method for ID-based lookup
  • ProductMappingsService.findUnmappedLineItems() — replaces findUnmappedSkus(), accepts line item objects with IDs + SKU
  • ProductMappingsService.findMappingForLineItem() — encapsulates the matching priority
  • PrintJobsService.createPrintJobsForLineItem() — uses new matching: tries product ID first, then SKU

Frontend Changes:

  • SKU field marked as optional in the mapping form
  • Search and display handle null SKUs with a fallback display value

Consequences

  • Positive: System works for all Shopify merchants regardless of SKU configuration
  • Positive: More reliable matching — Shopify IDs are guaranteed present and immutable
  • Neutral: Existing mappings with SKUs continue working unchanged
  • Consideration: When both a variant-specific and product-level mapping exist, the variant-specific one takes priority

ADR-050: Apache ECharts for Dashboard Analytics

Attribute Value
ID ADR-050
Status Implemented
Date 2026-02-13
Context The dashboard displayed only static stat cards and lists. Operators lacked visual insight into order, print job, and shipment status distributions, revenue trends, and day-over-day comparisons.

Decision

Adopt Apache ECharts (v6) via echarts-for-react as the charting library for the dashboard analytics feature. Use on-demand imports for bundle optimization and lazy loading (React.lazy) for all chart components.

Rationale

  • Richest chart variety: Donut, bar, line, gauge — all required chart types in a single library
  • On-demand imports: Tree-shakeable core (~225 KB shared bundle) vs ~320 KB full import
  • Dark theme support: Native theme registration via echarts.registerTheme() — consistent with existing dark UI
  • TypeScript-first: Complete type definitions for all chart options and callbacks
  • Active maintenance: Apache Foundation project with large community
  • React wrapper: echarts-for-react provides declarative React component with event handling

Implementation

Frontend — Chart Components (apps/web/src/components/charts/):

  • echarts-setup.ts — On-demand ECharts core with registered forma3d dark theme
  • chart-card.tsx — Reusable wrapper with title, subtitle, loading state, and empty state
  • donut-chart.tsx — Generic donut chart with custom center labels and click-to-filter
  • bar-chart.tsx — Generic bar chart with value labels and prefix/suffix formatting
  • line-chart.tsx — Generic line chart with gradient area fill
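
The on-demand setup in echarts-setup.ts might look roughly like this (sketch; the registered components and theme values are illustrative):

```typescript
import * as echarts from "echarts/core";
import { PieChart, BarChart, LineChart, GaugeChart } from "echarts/charts";
import { GridComponent, TooltipComponent, LegendComponent } from "echarts/components";
import { CanvasRenderer } from "echarts/renderers";

// Register only the pieces the dashboard uses (keeps the shared chunk small)
echarts.use([
  PieChart, BarChart, LineChart, GaugeChart,
  GridComponent, TooltipComponent, LegendComponent,
  CanvasRenderer,
]);

// Dark theme consistent with the existing UI (values illustrative)
echarts.registerTheme("forma3d", {
  backgroundColor: "transparent",
  textStyle: { color: "#e5e7eb" },
});

export default echarts;
```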

Frontend — Analytics Components (apps/web/src/components/analytics/):

  • OrderStatusChart — Order status donut with click-to-filter navigation
  • PrintJobStatusChart — Print job status donut showing active job count
  • ShipmentStatusChart — Shipment status donut showing in-transit count
  • RevenueTrendChart — Weekly revenue bar chart
  • OrderTrendChart — 30-day order volume line chart
  • AnalyticsPeriodDropdown — Shared period selector (Today / Week / Month / All Time)

Frontend — Dashboard Integration (apps/web/src/pages/dashboard.tsx):

  • Enhanced stat cards with trend delta indicators (up/down arrows, day-over-day change)
  • Lazy-loaded chart components with React.lazy() and <Suspense> fallback
  • Shared AnalyticsPeriod state driving all three donut charts simultaneously

Backend — Analytics Module (apps/api/src/analytics/):

  • AnalyticsRepository — Prisma groupBy for status distributions, $queryRaw for daily trend aggregation
  • AnalyticsService — Business logic for percentages, success rates, comparison deltas
  • AnalyticsController — 6 REST endpoints under /api/v1/analytics/*
  • DTOs with Swagger decorators for API documentation

Shared Contracts (libs/domain-contracts/src/api/analytics.api.ts):

  • AnalyticsPeriod, OrderAnalyticsApiResponse, PrintJobAnalyticsApiResponse
  • ShipmentAnalyticsApiResponse, TrendsApiResponse, EnhancedDashboardStatsApiResponse

Database Indexes (prisma/schema.prisma):

  • Composite indexes @@index([tenantId, status, createdAt]) on Order, PrintJob, and Shipment models

Data Fetching (apps/web/src/hooks/use-analytics.ts):

  • TanStack Query hooks: useOrderAnalytics, usePrintJobAnalytics, useShipmentAnalytics
  • Trend hooks: useRevenueTrend, useOrderTrend
  • Enhanced stats: useEnhancedDashboardStats (30s refresh for KPI tiles)

Consequences

  • Positive: Operators get immediate visual insight into 3D print farm operations
  • Positive: Bundle size managed through on-demand imports and lazy loading (~225 KB shared chunk)
  • Positive: Consistent dark theme integration with existing UI
  • Positive: Click-to-filter on donut slices enables quick navigation to filtered views
  • Neutral: New dependency (echarts + echarts-for-react) — well-maintained Apache project
  • Consideration: ECharts v6 has stricter TypeScript types requiring CallbackDataParams for formatters
  • Trade-off: Raw SQL used for date truncation in trend queries (Prisma groupBy lacks DATE() support)

Alternatives Considered

Library Pros Cons
Recharts Simple React API Limited donut customization, fewer chart types
Chart.js Lightweight Weak TypeScript, less donut label flexibility
Nivo Beautiful defaults Heavier bundle, React-specific only
D3.js Maximum flexibility High complexity, no React integration out of the box
ECharts Rich charts, tree-shakeable Larger full bundle (mitigated by on-demand)

Test Coverage

  • Backend: 34 unit tests across analytics.repository.spec.ts, analytics.service.spec.ts, analytics.controller.spec.ts
  • Frontend: 14 hook tests in use-analytics.test.tsx with MSW handlers for all 6 analytics endpoints
  • Type Safety: Full TypeScript strict mode compliance (ECharts v6 types)

ADR-051: Decompose Monolithic API into Domain-Aligned Microservices

Attribute Value
ID ADR-051
Status Accepted
Date 2026-02-15
Context The monolithic apps/api had grown to 300+ files across multiple domains (orders, print jobs, shipping, fulfillment, GridFlock). Feature work required understanding the entire codebase, and deployments restarted all domains even for single-domain changes. The upcoming GridFlock pipeline added compute-intensive STL generation that could block the API request thread.

Decision

Decompose the monolithic API into five domain-aligned microservices plus an API Gateway:

Service Port Domain
API Gateway 3000 Auth, routing, WebSocket, sessions
Order Service 3001 Orders, mappings, orchestration
Print Service 3002 Print jobs, SimplyPrint integration
Shipping Service 3003 Shipments, Sendcloud integration
GridFlock Service 3004 STL generation, slicing pipeline
Slicer Container 3010 BambuStudio CLI headless slicing

The Gateway is the single entry point for all external traffic, routing to downstream services via HTTP proxy. Services communicate asynchronously via BullMQ event queues (Redis) and synchronously via internal HTTP APIs protected by an X-Internal-Key header.
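
The Gateway's prefix-to-service routing can be sketched as a simple lookup (the route prefixes and fallback below are assumptions for illustration; the real proxy configuration may differ):

```typescript
// Illustrative route prefixes mapped to the service ports from the table above.
const routes: Record<string, number> = {
  "/api/v1/orders": 3001,      // Order Service
  "/api/v1/print-jobs": 3002,  // Print Service
  "/api/v1/shipments": 3003,   // Shipping Service
  "/api/v1/gridflock": 3004,   // GridFlock Service
};

function targetPort(path: string): number {
  const prefix = Object.keys(routes).find((p) => path.startsWith(p));
  // Unmatched paths (auth, sessions, WebSocket) are handled by the Gateway itself
  return prefix ? routes[prefix] : 3000;
}
```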

Rationale

  • Independent deployability: Each service can be deployed without affecting others
  • Domain isolation: GridFlock compute-intensive work cannot block order processing
  • Horizontal scalability: Services can be scaled independently based on load
  • Team scalability: Different domains can be worked on independently
  • Fault isolation: A failure in one service does not cascade to others

Consequences

  • Positive: Independent deployment and scaling per domain
  • Positive: GridFlock STL generation isolated from order processing
  • Positive: Clear domain boundaries enforced by service boundaries
  • Positive: Each service has a smaller, focused codebase
  • Negative: Increased operational complexity (more containers to manage)
  • Negative: Network latency added for inter-service calls
  • Negative: Distributed transaction complexity
  • Trade-off: Shared database via Prisma (no per-service database yet)

Alternatives Considered

Approach Pros Cons
Keep monolith Simple operations Growing complexity, deployment coupling
Modular monolith Simpler networking Still single deployment unit
Microservices Full isolation, scalability More containers, networking complexity
Serverless functions Auto-scaling Cold starts, vendor lock-in

ADR-052: BullMQ Event Queues for Inter-Service Async Communication

Attribute Value
ID ADR-052
Status Accepted
Date 2026-02-15
Context The monolithic API used EventEmitter2 for internal events. In a microservice architecture, events need to cross process boundaries. We need reliable, at-least-once delivery with retry capability.

Decision

Use BullMQ (backed by Redis) for inter-service asynchronous event communication. Each event type gets its own dedicated BullMQ queue. The @forma3d/service-common library provides a shared BullMqEventBus abstraction.

Event types:

  • order.created, order.ready-for-fulfillment, order.cancelled
  • print-job.completed, print-job.failed, print-job.status-changed, print-job.cancelled
  • shipment.created, shipment.status-changed
  • gridflock.mapping-ready, gridflock.pipeline-failed

Configuration:

  • Concurrency: 5 workers per queue
  • Retries: 3 attempts with exponential backoff
  • Dead letter: Failed events retained (removeOnFail: 5000)
  • Completed cleanup: removeOnComplete: 1000

Rationale

  • At-least-once delivery: BullMQ guarantees delivery with retries
  • Redis already present: Required for sessions and Socket.IO adapter
  • Built-in retry: Exponential backoff without custom implementation
  • Visibility: Job status, progress, and failure tracking via Bull Board
  • NestJS integration: @nestjs/bullmq provides native module support

Consequences

  • Positive: Reliable cross-service event delivery with retries
  • Positive: Dead letter queue for debugging failed events
  • Positive: Event handlers are idempotent (check before acting)
  • Negative: Redis becomes a critical infrastructure dependency
  • Trade-off: At-least-once semantics require idempotent handlers

Alternatives Considered

Approach Pros Cons
RabbitMQ Feature-rich, routing Additional infrastructure, more complex
Kafka High throughput, replay Overkill for this scale, complex setup
BullMQ Simple, Redis-native Redis single point of failure
HTTP webhooks Simple to implement No retry guarantees, no backpressure
AWS SQS/SNS Managed, scalable Vendor lock-in, latency

ADR-053: Buffer-Based GridFlock Pipeline (No Local File Storage)

Attribute Value
ID ADR-053
Status Accepted
Date 2026-02-15
Context The GridFlock pipeline generates STL files, slices them to gcode, and uploads to SimplyPrint. In a containerized environment with potential horizontal scaling, local file storage creates state that prevents scaling and requires cleanup.

Decision

The entire GridFlock pipeline operates on in-memory buffers. No files are written to the local filesystem at any point in the pipeline:

  1. STL Generation: JSCAD generates geometry → serialized to STL binary buffer
  2. Slicing: STL buffer sent to Slicer container via HTTP → gcode buffer returned
  3. Upload: Gcode buffer uploaded directly to SimplyPrint Files API
  4. Mapping: ProductMapping created in database referencing SimplyPrint file ID

Plates in a plate set are processed sequentially to bound memory usage (one plate buffer at a time).
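
The sequential plate loop can be sketched as follows (the three stage callbacks stand in for JSCAD STL generation, the Slicer container HTTP call, and the SimplyPrint Files API upload; their signatures are illustrative):

```typescript
async function processPlateSet(
  plates: string[],
  generateStl: (plate: string) => Promise<Buffer>,
  slice: (stl: Buffer) => Promise<Buffer>,
  upload: (gcode: Buffer) => Promise<string>, // returns a SimplyPrint file ID
): Promise<string[]> {
  const fileIds: string[] = [];
  for (const plate of plates) {
    // Awaiting each stage in turn keeps at most one plate's buffers in memory
    const stl = await generateStl(plate);
    const gcode = await slice(stl);
    fileIds.push(await upload(gcode));
  }
  return fileIds;
}
```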

Rationale

  • Stateless containers: No local state means any replica can handle any request
  • Horizontal scaling: Multiple GridFlock Service instances can run concurrently
  • No cleanup needed: No temp files to garbage collect
  • SimplyPrint as storage: The only permanent storage is SimplyPrint (source of truth for gcode)

Consequences

  • Positive: Fully stateless, horizontally scalable
  • Positive: No disk I/O bottleneck
  • Positive: No file cleanup cron jobs needed
  • Negative: Memory-bound (large plate sets consume RAM)
  • Mitigation: Sequential plate processing bounds peak memory to ~50MB per plate

ADR-054: SimplyPrint API Files for Gcode Upload

Attribute Value
ID ADR-054
Status Accepted
Date 2026-02-15
Context After slicing GridFlock baseplates, the gcode must be stored and made available for printing via SimplyPrint. SimplyPrint offers a Files API for uploading files to the print farm. The SimplyPrint cloud slicer is not accessible via API.

Decision

Upload sliced gcode files to SimplyPrint via the Files API (requires Print Farm plan). Each gcode file is uploaded as a buffer with metadata, and SimplyPrint returns a file ID used for creating print jobs.

Rationale

  • Single source of truth: SimplyPrint stores all printable files
  • No local storage: Aligns with buffer-based pipeline (ADR-053)
  • Print job integration: File IDs directly used in SimplyPrint print queue
  • Existing API: Files API already used for manual file uploads

Consequences

  • Positive: No separate file storage infrastructure needed
  • Positive: Files immediately available for printing
  • Negative: Requires SimplyPrint Print Farm plan (API file access)
  • Negative: Upload latency adds to pipeline time (~2-5 seconds per file)

ADR-055: BambuStudio CLI Slicer Container

Attribute Value
ID ADR-055
Status Accepted
Date 2026-02-15
Context The GridFlock pipeline needs to slice STL files into gcode with specific printer profiles (nozzle diameter, layer height, filament type). SimplyPrint's cloud slicer is not API-accessible. We need a headless slicer that supports Bambu Lab and Prusa printer profiles.

Decision

Run BambuStudio CLI (fork of PrusaSlicer/SuperSlicer) in a dedicated Docker container as a headless slicing service. The container exposes an HTTP API that accepts STL buffers and returns gcode buffers. Printer profiles are configurable per tenant via SystemConfig.

Rationale

  • Bambu Lab support: Native profiles for X1 Carbon, P1S printers
  • Prusa support: Backward-compatible with PrusaSlicer profiles
  • Deterministic: Same input always produces same output
  • Containerized: Isolated from other services, independently scalable
  • CLI-based: No GUI dependencies, runs in headless mode

Consequences

  • Positive: Full control over slicing parameters
  • Positive: Tenant-configurable print profiles
  • Positive: No dependency on SimplyPrint cloud slicer
  • Negative: Additional container to maintain and update
  • Negative: BambuStudio updates require container rebuilds

Alternatives Considered

Approach Pros Cons
SimplyPrint slicer No additional container Not API-accessible
PrusaSlicer CLI Lighter weight No native Bambu Lab profiles
BambuStudio CLI Bambu + Prusa support Heavier container (~1.5GB)
CuraEngine Popular slicer Different profile format

ADR-056: Redis for Sessions, Event Queues, and Socket.IO Adapter

Attribute Value
ID ADR-056
Status Accepted
Date 2026-02-15
Context The microservice architecture requires shared infrastructure for sessions, inter-service events, and WebSocket broadcasting. Rather than introducing multiple infrastructure components, a single Redis instance can serve all three purposes.

Decision

Use a single Redis 7 instance for three purposes:

  1. Session Store: Gateway stores Express sessions in Redis (replaces PostgreSQL connect-pg-simple), enabling session sharing across Gateway replicas
  2. Event Bus: BullMQ queues for inter-service async events (see ADR-052)
  3. Socket.IO Adapter: Redis adapter enables WebSocket event broadcasting across multiple Gateway instances

Rationale

  • Single infrastructure: One Redis instance serves all three use cases
  • Horizontal scaling: Sessions and WebSockets work across Gateway replicas
  • Performance: In-memory data store with sub-millisecond latency
  • Proven stack: Redis + BullMQ + Socket.IO Redis adapter is a well-tested combination

Consequences

  • Positive: Enables horizontal scaling of Gateway
  • Positive: Single infrastructure dependency for multiple features
  • Positive: Fast session lookups (vs. PostgreSQL round-trip)
  • Negative: Redis becomes a critical dependency (all services depend on it)
  • Mitigation: Redis is deployed with persistence enabled and health checks
  • Trade-off: Session data is ephemeral (Redis restart clears sessions)

ADR-057: Self-Hosted Build Agent with Hybrid Pipeline Strategy

Attribute Value
ID ADR-057
Status Accepted
Date 2026-02-15
Context Pipeline build times grew significantly with 8+ microservice Docker images

Decision

Deploy a self-hosted Azure DevOps build agent on a DigitalOcean droplet (4 vCPU / 8 GB RAM) running 2 agent instances, and adopt a hybrid agent strategy:

  • MS-hosted agent handles lightweight jobs (lint, Nx builds, deployments)
  • Self-hosted agents handle Docker packaging with persistent local layer cache
  • Merged Validate & Test stage runs lint, typecheck, and unit tests in parallel across all 3 agents

Rationale

  • Docker layer caching: MS-hosted agents are ephemeral (cold cache every run). Self-hosted agents maintain a warm Docker layer cache between builds, reducing per-service build time from ~7-10 min to ~2-3 min
  • Pre-installed tools: Cosign and Syft are pre-installed on the self-hosted agent instead of being downloaded on every job (~1 min saved per service)
  • Cost-effective parallelism: 2 self-hosted agent instances at $48/month provide more parallelism than buying 1 extra MS-hosted parallel job at $40/month
  • Stage merging: Combining Validate and Test into a single stage eliminates ~5 min of sequential stage overhead by running Lint, TypeCheck, and UnitTests in parallel

Pipeline Architecture

| Agent | Stage | Jobs |
|---|---|---|
| MS-hosted | Validate & Test | Lint |
| DO Agent 1 | Validate & Test | TypeCheck |
| DO Agent 2 | Validate & Test | UnitTests |
| MS-hosted | Build & Package | DetectAffected, BuildAll |
| DO Agent 1+2 | Build & Package | All Package* Docker jobs |
| MS-hosted | Deploy, Acceptance, Production | All remaining stages |

Infrastructure

| Component | Specification |
|---|---|
| Droplet | DigitalOcean s-4vcpu-8gb ($48/month) |
| OS | Ubuntu 22.04 LTS |
| Agent Pool | DO-Build-Agents (self-hosted) |
| Agent Instances | 2 (do-build-agent-1, do-build-agent-2) |
| Setup Script | deployment/build-agent/setup-build-agent.sh |

Performance Impact

| Metric | Before (1 MS-hosted) | After (hybrid) | Improvement |
|---|---|---|---|
| Validate + Test | ~18 min (sequential) | ~8 min (parallel) | 56% faster |
| Build & Package (full) | ~75 min | ~15 min | 80% faster |
| Full pipeline (main) | ~133 min | ~63 min | 53% faster |
| Monthly cost | $0 | $48 | — |

Consequences

  • Positive: Dramatically faster builds, especially Docker packaging
  • Positive: Cost-effective compared to buying MS-hosted parallel jobs
  • Positive: Docker layer cache persists between builds
  • Negative: Self-hosted agent requires maintenance (Docker cleanup, OS updates, agent updates)
  • Mitigation: Automated weekly Docker cleanup and daily disk monitoring cron jobs
  • Negative: Single point of failure (if droplet goes down, Docker builds queue on MS-hosted)
  • Mitigation: Pipeline falls back gracefully; Package jobs wait for agent availability

ADR-058: Self-Hosted Log Infrastructure (ClickHouse + Grafana via OpenTelemetry)

| Attribute | Value |
|---|---|
| ID | ADR-058 |
| Status | Accepted |
| Date | 2026-02-21 |
| Context | Sentry Logs has limited retention (30 days), query capabilities, and cost scalability for structured business/audit logs |

Decision

Migrate structured logging (business events, audit logs, observability) from Sentry Logs to a self-hosted ClickHouse + Grafana stack, using OpenTelemetry Collector as the ingestion pipeline and Pino as the application-level logger. Sentry remains the platform for error tracking, distributed tracing, performance monitoring, and profiling.

Architecture

[Architecture diagram: application Pino loggers → OTLP → OpenTelemetry Collector → ClickHouse → Grafana dashboards, with Sentry continuing to receive errors, traces, and profiles in parallel]

Responsibility Split

| Concern | Platform | Rationale |
|---|---|---|
| Error tracking | Sentry | Best-in-class stack traces, issue grouping, alerting |
| Distributed tracing | Sentry | End-to-end traces across services with Sentry UI |
| Performance monitoring | Sentry | Request latency, database query profiling |
| Profiling | Sentry | Node.js CPU profiling in production |
| Structured logging | ClickHouse + Grafana | Unlimited retention, SQL queries, self-hosted cost control |
| Business event logs | ClickHouse + Grafana | Long-term queryable audit trail |
| Security audit logs | ClickHouse + Grafana | Compliance-grade retention |
| Log dashboards | Grafana | Custom dashboards, alerting rules |

Implementation Details

Shared Library (libs/observability):

  • otel-logger.ts — Pino logger factory with configurable log level and pino-pretty for development
  • OtelLoggerService — NestJS injectable service replacing SentryLoggerService, providing info, warn, error, debug, logEvent, and logAudit methods

Instrumentation (apps/*/src/observability/instrument.ts):

  • OpenTelemetry SDK initializes before Sentry to enable Pino-OTel bridging
  • @opentelemetry/instrumentation-pino auto-bridges Pino logs to OTLP
  • @opentelemetry/exporter-logs-otlp-grpc exports logs to OTel Collector
  • Sentry _experiments: { enableLogs: true } flag removed
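
The initialization order matters: OTel must come up before Sentry so the Pino bridge is registered first. A hypothetical sketch of instrument.ts using the packages listed above — option names vary across OTel SDK versions (older releases use the singular logRecordProcessor), and the collector endpoint is a placeholder:

```typescript
// Config-fragment sketch of apps/*/src/observability/instrument.ts.
import { NodeSDK } from '@opentelemetry/sdk-node';
import { PinoInstrumentation } from '@opentelemetry/instrumentation-pino';
import { OTLPLogExporter } from '@opentelemetry/exporter-logs-otlp-grpc';
import { BatchLogRecordProcessor } from '@opentelemetry/sdk-logs';
import * as Sentry from '@sentry/node';

// OTel SDK first, so Pino logs are auto-bridged to OTLP.
const sdk = new NodeSDK({
  serviceName: 'gateway', // illustrative
  instrumentations: [new PinoInstrumentation()],
  logRecordProcessors: [
    new BatchLogRecordProcessor(
      new OTLPLogExporter({ url: 'http://otel-collector:4317' }), // assumed endpoint
    ),
  ],
});
sdk.start();

// Sentry keeps error tracking, tracing, and profiling; logs no longer go here.
Sentry.init({ dsn: process.env.SENTRY_DSN });
```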

Infrastructure (deployment/staging/):

  • otel-collector-config.yaml — OTLP receiver → batch processor → ClickHouse exporter
  • clickhouse-config.xml — S3 backup disk with from_env credential injection
  • clickhouse-users.xml — otel user for collector writes
  • grafana/provisioning/datasources/clickhouse.yaml — ClickHouse datasource with OTel schema
  • scripts/backup-clickhouse-logs.sh — Daily backup cron job

Pipeline (azure-pipelines.yml):

  • 7 new variables: CLICKHOUSE_PASSWORD, GRAFANA_ADMIN_PASSWORD, DO_SPACES_KEY, DO_SPACES_SECRET, DO_SPACES_REGION, DO_SPACES_BUCKET, DO_SPACES_LOG_PREFIX
  • Variables flow from Azure DevOps → .env → Docker Compose → container environment → ClickHouse from_env XML

Log Retention (ClickHouse TTL)

| Tier | Retention | Data |
|---|---|---|
| Hot | 30 days | Full structured logs |
| Warm | 90 days | Aggregated summaries |
| Archive | 365 days | Daily backups to DigitalOcean Spaces |

Consequences

Positive:

  • ✅ Unlimited log retention at self-hosted cost (~$0 incremental on existing droplet)
  • ✅ Full SQL query capability via Grafana for log analysis
  • ✅ Sentry retains its strengths (error tracking, tracing, profiling) without log clutter
  • ✅ ClickHouse columnar storage compresses logs 10-20x vs PostgreSQL
  • ✅ Vendor-neutral via OpenTelemetry — can swap backends without code changes
  • ✅ Daily automated backups to DigitalOcean Spaces (S3-compatible)

Negative:

  • ⚠️ Additional infrastructure to maintain (3 new containers: OTel Collector, ClickHouse, Grafana)
  • ⚠️ ~1 GB additional RAM on staging droplet
  • ⚠️ Backup credentials require Azure DevOps variable management
  • Mitigation: All configuration is pipeline-driven; containers are stateless except ClickHouse data volume

Alternatives Considered

| Alternative | Reason for Rejection |
|---|---|
| Keep Sentry Logs | Only 30-day retention, limited querying, potential cost at scale |
| Grafana + Loki | Loki less efficient for structured logs than ClickHouse |
| ELK Stack (Elasticsearch) | Heavy resource requirements, complex to operate |
| Datadog / New Relic | Expensive SaaS, vendor lock-in |
  • ClickHouse + Grafana Logging Research
  • ADR-016: Sentry Observability with OpenTelemetry (updated — Sentry no longer handles structured logging)
  • ADR-047: Three-Tier Logging Strategy (superseded — Sentry Logs tier replaced by ClickHouse)

ADR-059: Nx Affected Resilience via Last-Successful-Deploy Tag

| Attribute | Value |
|---|---|
| ID | ADR-059 |
| Status | Accepted |
| Date | 2026-02-22 |
| Context | Nx affected with HEAD~1 base loses track of undeployed changes when a pipeline run fails partway through |

Decision

Replace the hard-coded --base=HEAD~1 in the DetectAffected job with a last-successful-deploy git tag that is only advanced after the full pipeline succeeds. A new UpdateDeployTag stage at the end of the pipeline pushes the tag forward on success.

Problem

When the pipeline runs on main, nx affected --base=HEAD~1 compares the current commit against the previous commit. If the pipeline fails (e.g., during DeployStaging or AcceptanceTest), the changes in that commit are never deployed. When the next commit arrives with a fix, HEAD~1 now points to the failed commit — the originally undeployed changes are invisible to nx affected and are permanently skipped.

main:  A ─── B (deploy fails) ─── C (fix)
              │                     │
              └── HEAD~1 base ──────┘
              changes X,Y,Z         only fix Z is detected
              never deployed         X,Y permanently lost

Solution

main:  A ─── B (deploy fails) ─── C (fix)
       │                           │
       └── last-successful-deploy  └── HEAD
       tag stays at A              affected sees A→C (includes X,Y,Z + fix)

DetectAffected job:

  1. Check if last-successful-deploy tag exists
  2. If yes, use it as --base for nx affected and git diff
  3. If no (first run), fall back to HEAD~1

UpdateDeployTag stage:

  1. Depends on all pipeline stages (Build, DeployStaging, AcceptanceTest, LoadTest, DeployProduction, SmokeTest)
  2. Runs only when Build, DeployStaging, and AcceptanceTest did not fail (disabled stages like DeployProduction/SmokeTest are allowed to be skipped)
  3. Force-pushes the last-successful-deploy tag to the current commit
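
Both sides of the tag protocol can be exercised in a throwaway git repository. A self-contained sketch (the real pipeline force-pushes the tag to `origin`; here the tag is moved locally instead):

```typescript
// Sketch of ADR-059: DetectAffected resolves the base from the
// last-successful-deploy tag, and UpdateDeployTag advances it on success.
import { execSync } from 'node:child_process';
import { mkdtempSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

const repo = mkdtempSync(join(tmpdir(), 'adr059-'));
const git = (cmd: string) =>
  execSync(`git -c user.email=ci@x -c user.name=ci ${cmd}`, { cwd: repo })
    .toString()
    .trim();

// DetectAffected: use the tag if it exists, otherwise fall back to HEAD~1
const base = () => {
  try {
    git('rev-parse -q --verify refs/tags/last-successful-deploy');
    return 'last-successful-deploy';
  } catch {
    return 'HEAD~1'; // first run: the tag does not exist yet
  }
};

git('init -q');
git('commit -q --allow-empty -m A');
console.log(`first run base: ${base()}`);       // first run base: HEAD~1

// UpdateDeployTag: advance the tag only after the full pipeline succeeded
git('tag -f last-successful-deploy');

git('commit -q --allow-empty -m B');
console.log(`next run base: ${base()}`);        // next run base: last-successful-deploy
```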

Bootstrap

No manual setup required. On the first pipeline run, the tag does not exist and affected detection falls back to HEAD~1. After the first successful run, the tag is created automatically.

Consequences

Positive:

  • No changes are ever "forgotten" — failed deployments are re-evaluated on the next run
  • Self-healing: a fix PR automatically includes all previously missed changes
  • ForceFullVersioningAndDeployment parameter remains as a manual override
  • Zero infrastructure dependencies (uses git tags, no external state store)

Negative:

  • Requires persistCredentials: true on the UpdateDeployTag checkout step for git push
  • Force-pushing tags requires appropriate repository permissions for the pipeline service account
  • After extended failures, the affected set may be large (all changes since last success)
  • Mitigation: This is intentional — better to rebuild too much than to skip changes

Alternatives Considered

| Alternative | Reason for Rejection |
|---|---|
| HEAD~1 (previous approach) | Loses undeployed changes after pipeline failures |
| External state store (S3, SSM) | Additional infrastructure dependency for a simple use case |
| Nx Cloud affected tracking | Requires Nx Cloud subscription; overkill for current scale |
| Manual ForceFullVersioningAndDeployment after failures | Error-prone; depends on a human remembering to toggle it |
  • ADR-018: Nx Affected Conditional Deployment Strategy (extended by this ADR)
  • ADR-057: Self-Hosted Build Agent with Hybrid Pipeline Strategy
  • ADR-006: Azure DevOps for CI/CD

ADR-060: Single Source of Truth for STL Preview Generation

| Attribute | Value |
|---|---|
| ID | ADR-060 |
| Status | Accepted (extended by ADR-061) |
| Date | 2026-02-27 |
| Context | STL preview generation logic lived in the NestJS service but was also needed by offline cache population scripts, creating a duplication risk |

Decision

Extract all STL preview generation logic from GridflockPreviewService into a standalone generatePreviewStl() function in @forma3d/gridflock-core. Both the NestJS service and offline scripts import the same function — a single source of truth for preview STL generation.

Problem

The GridflockPreviewService contained the full preview generation pipeline (plate set calculation, offset computation, parallel plate generation, STL combining) as private methods and hardcoded constants. To pre-populate the STL preview cache offline (eliminating cold-start latency for 16,000+ dimension combinations), this logic needed to be callable from a standalone CLI script without depending on NestJS, Prisma, Redis, or any server infrastructure.

Duplicating the generation logic in the script would create a maintenance risk — any change to STL generation would need to be applied in two places, and divergence would produce inconsistent cached files.

Solution

New module: libs/gridflock-core/src/lib/preview-generator.ts

Exports generatePreviewStl(widthMm, heightMm, options?) which orchestrates the full pipeline:

  1. Calculate plate set using PRINTER_PROFILES['bambu-a1']
  2. Compute X/Y offsets per plate
  3. Generate plates in parallel via generatePlatesParallel() (with configurable maxWorkers)
  4. Combine into a single binary STL via combineStlBuffers()

Moved from service to library:

  • computeUniformOffsets() — cumulative X-axis offset calculation
  • computeOffsetsPerColumn() — per-column Y-axis offset calculation
  • PLATE_GAP_MM = 10 — gap constant between plates in the preview
  • DEFAULT_PREVIEW_OPTIONS — intersection-puzzle connectors, magnets disabled

maxWorkers parameter: Added to generatePlatesParallel() (backward-compatible optional parameter) so the offline script can limit worker threads per combination when running multiple combinations concurrently.

Service refactored: GridflockPreviewService.generatePreview() became a thin wrapper:

  1. Normalize dimensions (larger dimension first)
  2. Attempt plate-level assembly → return if successful
  3. Fall back to generatePreviewStl(w, h, { log }) → return
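
The normalization in step 1 is what makes cache entries order-independent, so 320×450 and 450×320 resolve to the same key (see Consequences below). A minimal sketch with a hypothetical key format:

```typescript
// Illustrative cache-key normalization: larger dimension first.
// The key format "preview-{a}x{b}.stl" is an assumption, not the real scheme.
function previewCacheKey(widthMm: number, heightMm: number): string {
  const [a, b] = widthMm >= heightMm ? [widthMm, heightMm] : [heightMm, widthMm];
  return `preview-${a}x${b}.stl`;
}

console.log(previewCacheKey(320, 450)); // preview-450x320.stl
console.log(previewCacheKey(450, 320)); // preview-450x320.stl
```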

Note: ADR-061 extends this architecture with a plate-level cache that assembles previews from ~268 cached base plates + dynamically generated border geometry (~60 MB total) while supporting any input resolution. The legacy full-preview disk cache (16,471 files, ~32 GB) was removed in March 2026.

Consequences

Positive:

  • Single source of truth — changes to STL generation logic happen in one place
  • Offline scripts produce byte-for-byte identical output to the server
  • maxWorkers enables CPU-aware parallelism in the population script without oversubscription
  • Cache key normalization prevents duplicate entries (320×450 = 450×320)
  • @forma3d/gridflock-core remains NestJS-independent — usable in any Node.js context

Negative:

  • Preview generation parameters (printer profile, connector type) are hardcoded in the library rather than configurable per-tenant
  • Mitigation: When multi-tenant preview customization is needed, generatePreviewStl can accept a configuration parameter

Alternatives Considered

| Alternative | Reason for Rejection |
|---|---|
| Duplicate generation logic in the script | Maintenance risk — two codepaths that must stay in sync |
| Import NestJS service in the script | Would require NestJS DI container, Prisma, Redis — heavyweight for an offline tool |
| Use the server's REST API from the script | Network-bound, requires a running server, doesn't leverage local CPU cores |
  • ADR-053: Buffer-Based GridFlock Pipeline (No Local File Storage)
  • ADR-061: Plate-Level Preview Cache with Dynamic Border Assembly
  • STL Cache Pre-Population Research
  • scripts/populate-plate-cache.ts — plate-level cache population script

ADR-061: Plate-Level Preview Cache with Dynamic Border Assembly

| Attribute | Value |
|---|---|
| ID | ADR-061 |
| Status | Accepted |
| Date | 2026-03-01 |
| Context | Full-preview-per-dimension cache requires ~32 GB for 0.5 cm resolution (16,471 files) and ~853 GB for 1 mm resolution (406,351 files), making fine-grained drawer-fit precision impractical |

Decision

Replace the full-preview-per-dimension cache with a plate-level cache of 200 base plates (~41 MB) that are assembled on the fly with dynamically generated border geometry to produce previews for any dimension at any resolution.

Problem

The storefront configurator's 3D preview required pre-populating one STL file per dimension pair. At 0.5 cm resolution (step 5 mm), this was 16,471 files at ~32 GB — already impractical to generate (10–14 hours) and deploy. Supporting 1 mm input precision (for exact drawer fit) would require 406,351 files at ~853 GB, which was not feasible.

Analysis revealed that each preview's plate geometry is determined by only two factors: grid size (1–6 × 1–6) and connector edge pattern (4 booleans). The border around the grid cells is what creates the combinatorial explosion (147,500 unique plates with border vs only 200 without border). The border itself is a trivial rectangular solid that can be generated in microseconds.

Solution

Three-tier architecture:

  1. 200 base plates cached — each is a unique (gridSize, connectorEdges) combination with zero border and no plate number, generated via JSCAD CSG and stored as binary STL
  2. Border strips generated on the fly — simple rectangular cuboids (12 triangles, 684 bytes each) created as raw binary STL without any JSCAD dependency
  3. Assembly via combineStlBuffers() — existing buffer concatenation with vertex offset translation

New modules in @forma3d/gridflock-core:

  • preview-generator.ts additions:
  • basePlateCacheKey() — deterministic key plate-{cols}x{rows}-{NESW}.stl
  • generateBasePlateStl() — JSCAD plate with NO_BORDER and plateNumber: false
  • enumerateAllBasePlateKeys() — discovers all 200 keys via representative dimensions
  • assemblePreviewFromPlateCache() — assembly function accepting a plate-lookup callback
  • border-generator.ts — pure binary STL box generation, no JSCAD dependency
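
The "12 triangles, 684 bytes" figure follows directly from the binary STL layout: an 80-byte header, a 4-byte uint32 triangle count, then 50 bytes per triangle (normal + three vertices + attribute count). A hypothetical standalone sketch, not the actual border-generator.ts:

```typescript
// Illustrative JSCAD-free binary STL generation for an axis-aligned cuboid.
// 80-byte header + 4-byte count + 12 triangles * 50 bytes = 684 bytes.
function boxStl(w: number, d: number, h: number): Buffer {
  // 8 corner vertices of the cuboid at the origin
  const v: [number, number, number][] = [];
  for (const z of [0, h]) for (const y of [0, d]) for (const x of [0, w]) v.push([x, y, z]);
  // 12 triangles (two per face), indexing into v; winding is illustrative only
  const faces: [number, number, number][] = [
    [0, 1, 2], [1, 3, 2], [4, 6, 5], [5, 6, 7], // bottom, top
    [0, 4, 1], [1, 4, 5], [2, 3, 6], [3, 7, 6], // front, back
    [0, 2, 4], [2, 6, 4], [1, 5, 3], [3, 5, 7], // left, right
  ];
  const buf = Buffer.alloc(80 + 4 + faces.length * 50); // zero-filled header
  buf.writeUInt32LE(faces.length, 80);
  let off = 84;
  for (const [a, b, c] of faces) {
    off += 12; // leave the normal as (0,0,0); most loaders recompute it
    for (const i of [a, b, c]) {
      buf.writeFloatLE(v[i][0], off);
      buf.writeFloatLE(v[i][1], off + 4);
      buf.writeFloatLE(v[i][2], off + 8);
      off += 12;
    }
    off += 2; // attribute byte count (unused)
  }
  return buf;
}

console.log(boxStl(10, 2, 5).length); // 684
```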

New service in gridflock-service:

  • PlateCacheService — loads all 200 base plates into memory at startup (~41 MB), provides synchronous lookup by key

Preview resolution cascade:

  1. Plate-level assembly (base plates + dynamic borders) → 10–100 ms
  2. Full JSCAD generation via generatePreviewStl() → 12–30 seconds (fallback)

Dimension validation updated:

  • Shopify configurator: step="0.1" (1 mm precision), max="100" (100 cm)
  • Backend DTOs: @Max(1000) for both width and height (both preview and checkout)
  • Sub-millimeter inputs rounded down (floor) to nearest 0.1 cm

Consequences

Positive:

  • Cache reduced from ~32 GB (16,471 files) to ~60 MB (~268 files)
  • Population time reduced from 10–14 hours to 2–5 minutes
  • Supports any input resolution (1 mm, 0.5 cm, continuous) with the same cache
  • Preview assembly completes in 10–100 ms — no perceptible delay
  • Production STL generation path is completely unchanged (byte-identical output)
  • Legacy full-preview cache was removed entirely (March 2026) — no longer needed

Negative:

  • Preview STLs are not byte-identical to the legacy full previews (different border geometry, no plate numbers)
  • Mitigation: Visually equivalent in the 3D viewer; the differences (rounded vs square border corners, absent plate numbers) are invisible in the preview context. Production plates remain unchanged.

Alternatives Considered

| Alternative | Reason for Rejection |
|---|---|
| Cache individual plates with border baked in | 147,500 files at ~29 GB — still too large, does not solve the scaling problem |
| Generate full previews at 1 mm resolution | ~853 GB, ~69 hours to generate — completely impractical |
| Server-side rendering (image preview instead of STL) | Loses the interactive 3D preview that customers value; would require a rendering pipeline |
  • ADR-060: Single Source of Truth for STL Preview Generation
  • ADR-053: Buffer-Based GridFlock Pipeline (No Local File Storage)
  • Plate-Level Preview Cache Prompt — Full analysis with combinatorics
  • STL Cache Pre-Population Research
  • scripts/populate-plate-cache.ts — base plate population script
  • libs/gridflock-core/src/lib/border-generator.ts — pure binary STL border generation
  • apps/gridflock-service/src/gridflock/plate-cache.service.ts — in-memory plate cache

ADR-062: Inventory Tracking and Stock Replenishment

| Attribute | Value |
|---|---|
| ID | ADR-062 |
| Status | Accepted |
| Date | 2026-03-08 |
| Context | Forma3D.Connect operates as a pure print-to-order platform, meaning every order triggers a new print job. Popular products are reprinted constantly, causing fulfillment delays during peak demand and leaving printers idle during quiet periods. |

Decision

Introduce a hybrid fulfillment model with opt-in inventory tracking at the ProductMapping level, scheduled stock replenishment during quiet periods, and stock-aware order fulfillment that consumes available stock before creating print jobs.

Problem

Popular products (best-sellers) follow a predictable demand pattern, yet every order triggers a full print cycle (4–24 hours). During weekend order surges, backlogs build up. During weekday quiet periods, printers sit idle. There is no mechanism to pre-print popular products or track physical stock of completed units.

Solution

Three new capabilities, all placed in order-service (tightly coupled with orchestration):

1. Inventory Tracking (InventoryModule)

  • Extended ProductMapping with stock fields: currentStock, minimumStock, maximumStock, replenishmentPriority, replenishmentBatchSize
  • Stock management is opt-in: minimumStock = 0 (default) keeps the product as print-to-order; minimumStock > 0 enables stock tracking
  • One stock unit = one complete set of all AssemblyParts for a product
  • All stock mutations (production, consumption, adjustment, scrapping) create InventoryTransaction records for a full audit trail
  • currentStock can never go negative; all mutations are atomic via database transactions

2. Stock Replenishment (StockReplenishmentModule)

  • Cron scheduler evaluates stock levels every 10 minutes
  • Respects configurable allowedHours, allowedDays to run during quiet periods only
  • Skips when order print queue exceeds orderQueueThreshold (order jobs always take priority)
  • Skips when active stock jobs exceed maxConcurrentStockJobs capacity
  • For each product where currentStock < minimumStock, calculates deficit (accounting for pending StockBatches) and creates batches
  • One StockBatch = one sellable unit = PrintJob records for all AssemblyParts × quantityPerProduct
  • PrintJob records created with purpose = 'STOCK', lineItemId = null, stockBatchId set
  • Global STOCK_REPLENISHMENT_ENABLED environment variable acts as master switch
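
The deficit calculation above can be sketched as a pure function. Field names mirror the ProductMapping stock fields; the exact sizing policy (e.g. how replenishmentBatchSize and replenishmentPriority factor in) is simplified here and is an assumption:

```typescript
// Hypothetical sketch of the replenishment deficit check (ADR-062).
interface StockConfig {
  currentStock: number;
  minimumStock: number;   // 0 = pure print-to-order, replenishment skipped
  maximumStock: number;
  pendingBatches: number; // StockBatches created but not yet completed
}

// Number of new StockBatches (one sellable unit each) to create.
function batchesToCreate(c: StockConfig): number {
  if (c.minimumStock === 0) return 0; // opt-in: stock tracking disabled
  const projected = c.currentStock + c.pendingBatches;
  if (projected >= c.minimumStock) return 0;
  // refill to minimumStock, but never schedule past maximumStock
  return Math.min(c.minimumStock - projected, c.maximumStock - projected);
}

console.log(
  batchesToCreate({ currentStock: 1, minimumStock: 5, maximumStock: 10, pendingBatches: 2 }),
); // 2
```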

3. Stock-Aware Order Fulfillment (updated OrchestrationService)

  • OrchestrationService.handleOrderCreated() now calls InventoryService.tryConsumeStock() before creating print jobs
  • If stock covers the full order quantity, no print jobs are created; order completes immediately
  • Partial fulfillment supported: consume available stock, print remaining units
  • Products with minimumStock = 0 bypass stock consumption entirely (unchanged print-to-order flow)
  • GridFlock products bypass stock consumption (custom STL pipeline unchanged)
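
The partial-fulfillment split can be illustrated with a small helper (a simplification of what tryConsumeStock returns; the real method also records an InventoryTransaction atomically):

```typescript
// Illustrative stock-aware fulfillment split: consume what is available,
// print the remainder. Never lets stock go negative.
function consumeStock(currentStock: number, orderedQty: number) {
  const consumed = Math.min(currentStock, orderedQty);
  return { consumed, remainingToPrint: orderedQty - consumed };
}

console.log(consumeStock(2, 5)); // { consumed: 2, remainingToPrint: 3 }
```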

4. Schema Changes

  • PrintJob.lineItemId made nullable (stock jobs have no line item)
  • PrintJob.purpose field added (enum: ORDER | STOCK, default ORDER)
  • PrintJob.stockBatchId FK added (nullable, references StockBatch)
  • New StockBatch model (id, productMappingId, status, totalJobs, completedJobs)
  • New InventoryTransaction model (id, productMappingId, transactionType, quantity, direction, referenceType, referenceId, notes, createdBy)
  • New enums: PrintJobPurpose, StockBatchStatus, InventoryTransactionType, StockDirection, InventoryReferenceType

5. API Endpoints (proxied via gateway at /api/v1/inventory/*)

| Method | Path | Permission | Purpose |
|---|---|---|---|
| GET | /api/v1/inventory/stock | inventory.read | Stock levels for all managed products |
| PUT | /api/v1/inventory/stock/:id/config | inventory.write | Update stock configuration |
| POST | /api/v1/inventory/stock/:id/adjust | inventory.write | Manual stock adjustment with audit trail |
| POST | /api/v1/inventory/stock/:id/scrap | inventory.write | Scrap damaged stock with audit trail |
| GET | /api/v1/inventory/stock/:id/transactions | inventory.read | Transaction history (paginated) |
| GET | /api/v1/inventory/replenishment/status | inventory.read | Replenishment system status |

6. Event Flow

  • STOCK_REPLENISHMENT_SCHEDULED event published per created stock print job (BullMQ)
  • print-job.completed events with purpose === 'STOCK' route to InventoryService.handleStockJobCompleted() instead of order orchestration
  • When a StockBatch completes (all jobs done), ProductMapping.currentStock is incremented by 1 and a PRODUCED transaction is recorded

Consequences

Positive:

  • Best-seller orders can be fulfilled in minutes (from stock) instead of 4–24 hours (printing)
  • Printer utilization improves — idle periods fill with stock replenishment work
  • Weekend order surges are absorbed by pre-built stock
  • Full audit trail of every stock movement via InventoryTransaction ledger
  • Existing print-to-order flow is completely unchanged for products with minimumStock = 0
  • Existing GridFlock custom STL pipeline is unaffected

Negative:

  • PrintJob.lineItemId is now nullable, requiring safe access patterns (?. and ?? '') across order-service and print-service
  • Mitigation: All existing services updated; comprehensive unit tests added for both null and non-null cases
  • Stock jobs consume printer capacity that could serve order jobs during unexpected demand spikes
  • Mitigation: orderQueueThreshold ensures stock replenishment yields to order jobs; replenishment only runs during configurable quiet periods

Alternatives Considered

| Alternative | Reason for Rejection |
|---|---|
| Separate inventory microservice | Tight coupling with orchestration logic (stock consumption happens during order processing); would require distributed transactions or a saga pattern for atomicity |
| Track inventory at the AssemblyPart level | Overly complex; operators think in "sellable units," not individual parts. Part-level tracking would require complex partial-unit logic |
| Manual-only replenishment (no scheduling) | Defeats the purpose of utilizing idle printer capacity; operators would need to manually evaluate and trigger batches |
| Priority queue preemption for order jobs | Too complex for v1; a simple threshold check achieves the same goal with much less risk |
  • ADR-008: Event-Driven Internal Communication
  • ADR-012: Assembly Parts Model for Product Mapping
  • ADR-022: Event-Driven Fulfillment Architecture
  • ADR-051: Decompose Monolithic API into Domain-Aligned Microservices
  • ADR-052: BullMQ Event Queues for Inter-Service Async Communication
  • Stock Management Prompt — Full implementation specification
  • apps/order-service/src/inventory/ — InventoryModule implementation
  • apps/order-service/src/stock-replenishment/ — StockReplenishmentModule implementation
  • libs/domain-contracts/src/api/inventory.api.ts — API response contracts

ADR-063: ORDER-over-STOCK Print Queue Priority

| Attribute | Value |
|---|---|
| ID | ADR-063 |
| Status | Accepted |
| Date | 2026-03-09 |
| Context | With the introduction of stock replenishment (ADR-062), both ORDER-purpose and STOCK-purpose print jobs share the same SimplyPrint print queue. Without explicit ordering, STOCK jobs scheduled during quiet periods could delay incoming customer orders. |

Decision

Implement best-effort FIFO-within-priority-class queue ordering. ORDER-purpose jobs always precede STOCK-purpose jobs in the SimplyPrint queue. Within each class, jobs are processed in FIFO order.

Problem

When a customer order arrives and the SimplyPrint queue already contains STOCK replenishment jobs, the new ORDER job is appended at the end of the queue. This means the customer's order waits behind pre-printing stock jobs, causing unnecessary fulfillment delays — defeating the purpose of replenishment (which is meant to improve, not degrade, customer experience).

Solution

After each ORDER-purpose print job is added to the SimplyPrint queue, the system:

  1. Queries the current queue from SimplyPrint (GET /{id}/queue/GetItems)
  2. Queries local PrintJob records to identify which queue items are ORDER-purpose (findActiveOrderQueueItemIds)
  3. Calculates the correct position: existingOrderCount + 1 (after all existing ORDER items)
  4. Moves the new item if its current position is behind the target, using SimplyPrint's SetOrder endpoint (GET /{id}/queue/SetOrder?queue_item={id}&to={position})

STOCK-purpose jobs are never explicitly reordered — they naturally accumulate after ORDER jobs as new ORDER items are inserted before them.
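
The positioning rule can be sketched as a pure function over the queue snapshot (interface and function names are hypothetical, not the real service API):

```typescript
// Illustrative ORDER-over-STOCK positioning (ADR-063).
interface QueueItem { id: number; purpose: 'ORDER' | 'STOCK'; }

// 1-based position a newly added ORDER item should occupy:
// directly after all existing ORDER items, ahead of any STOCK items.
function targetOrderPosition(queue: QueueItem[], newItemId: number): number {
  const existingOrderCount = queue.filter(
    (q) => q.purpose === 'ORDER' && q.id !== newItemId,
  ).length;
  return existingOrderCount + 1;
}

// A new ORDER item appended behind two STOCK items should move to position 2.
const queue: QueueItem[] = [
  { id: 1, purpose: 'ORDER' },
  { id: 2, purpose: 'STOCK' },
  { id: 3, purpose: 'STOCK' },
  { id: 4, purpose: 'ORDER' }, // just appended by AddItem
];
const target = targetOrderPosition(queue, 4);
const current = queue.findIndex((q) => q.id === 4) + 1;
console.log(target, current > target ? 'move' : 'keep'); // 2 move
```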

Key design properties:

  • FIFO within each class: New ORDER jobs go after existing ORDER jobs; STOCK jobs maintain their original arrival order
  • Best-effort: If the reorder API call fails, the job remains in the queue at its default position. The operation logs a warning but does not fail the print job creation
  • Retry-aware: Retried ORDER jobs also receive priority positioning
  • No preemption: Jobs already assigned to printers or in-progress are not affected

Consequences

Positive:

  • Customer orders are always printed before stock replenishment items
  • FIFO ordering within each priority class prevents starvation and ensures fairness
  • Best-effort approach means a SimplyPrint API failure cannot block print job creation
  • No additional database schema changes required

Negative:

  • Two additional API calls per ORDER print job (getQueue + setQueueOrder) add latency
  • Mitigation: Both calls are fast (<100ms each) and only occur for ORDER jobs
  • Race condition between concurrent ORDER job insertions could result in suboptimal ordering
  • Mitigation: The overall constraint (ORDER before STOCK) is maintained; within-ORDER FIFO may have minor deviation under high concurrency, which is acceptable
  • Relies on SimplyPrint's SetOrder API being available and correctly implemented
  • Mitigation: Graceful degradation — jobs remain in queue at default position on failure

Alternatives Considered

| Alternative | Reason for Rejection |
|---|---|
| SimplyPrint priority field on AddItem | SimplyPrint's API does not expose a priority parameter for queue items |
| Insert ORDER jobs at position 1 (to=1) | Breaks FIFO among ORDER jobs — later orders would execute before earlier ones (LIFO) |
| Separate SimplyPrint queues per purpose | SimplyPrint does not support multiple queues per company; would require separate company accounts |
| Threshold-only approach (ADR-062 original) | Prevents stock jobs from being created when orders are busy, but does not help when stock jobs are already queued and a new order arrives |
  • ADR-062: Inventory Tracking and Stock Replenishment
  • ADR-052: BullMQ Event Queues for Inter-Service Async Communication
  • apps/print-service/src/print-jobs/print-jobs.service.ts — prioritizeOrderJobInQueue() implementation
  • apps/print-service/src/simplyprint/simplyprint-api.client.ts — setQueueOrder() method
  • SimplyPrint Queue API — SetOrder endpoint documentation

ADR-064: Stock Replenishment Event Subscriber for SimplyPrint Queue

| Attribute | Value |
|---|---|
| ID | ADR-064 |
| Status | Accepted |
| Date | 2026-03-09 |
| Context | ADR-062 introduced stock replenishment with the StockReplenishmentService creating StockBatch and PrintJob records and publishing STOCK_REPLENISHMENT_SCHEDULED events via BullMQ. However, no subscriber was wired up to consume these events, so stock print jobs remained in QUEUED status indefinitely and never reached the SimplyPrint print queue. |

Decision

Wire a subscriber for STOCK_REPLENISHMENT_SCHEDULED events in the order-service's EventSubscriberService that queues each stock print job to SimplyPrint via SimplyPrintApiClient.addToQueue().

Problem

After the stock replenishment scheduler creates PrintJob records with purpose = 'STOCK' and publishes STOCK_REPLENISHMENT_SCHEDULED events, nothing consumes those events. The print jobs sit in QUEUED status in the database but never reach the SimplyPrint API queue. Printers never receive stock jobs, defeating the purpose of the replenishment system.

Solution

Add a subscription for SERVICE_EVENTS.STOCK_REPLENISHMENT_SCHEDULED in EventSubscriberService.onModuleInit(). The handler (queueStockJobToSimplyPrint) mirrors the ORDER-purpose flow in PrintJobsService.createSinglePrintJob() but without order/line-item context or ORDER-over-STOCK priority reordering:

  1. Validates fileId is present (skips if null)
  2. Checks SimplyPrintApiClient.isEnabled() (skips if not configured)
  3. Looks up the PrintJob by ID (skips if not found)
  4. Checks idempotency — skips if job already has a simplyPrintJobId
  5. Calls SimplyPrintApiClient.addToQueue({ fileId, amount: 1 })
  6. Releases any stale simplyPrintJobId from old jobs (SimplyPrint may reuse queue-item IDs)
  7. Updates the PrintJob with simplyPrintJobId and simplyPrintQueueItemId
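
The guard sequence can be condensed into a dependency-free sketch (the real handler is queueStockJobToSimplyPrint in EventSubscriberService; the stale-ID release step is omitted here for brevity):

```typescript
// Illustrative idempotent STOCK-job handler (ADR-064), with the SimplyPrint
// client stubbed out as a callback.
interface PrintJob { id: string; fileId: string | null; simplyPrintJobId: string | null; }

async function queueStockJob(
  job: PrintJob | undefined,
  addToQueue: (fileId: string) => Promise<string>, // returns a queue-item id
): Promise<string> {
  if (!job) return 'skipped: job not found';
  if (!job.fileId) return 'skipped: no fileId';
  if (job.simplyPrintJobId) return 'skipped: already queued'; // idempotency guard
  job.simplyPrintJobId = await addToQueue(job.fileId);
  return `queued as ${job.simplyPrintJobId}`;
}

(async () => {
  const job: PrintJob = { id: 'pj-1', fileId: 'file-9', simplyPrintJobId: null };
  const fakeAdd = async () => 'sp-42';
  console.log(await queueStockJob(job, fakeAdd)); // queued as sp-42
  console.log(await queueStockJob(job, fakeAdd)); // skipped: already queued
})();
```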

Why the order-service subscribes to its own event (not the print-service):

  • The ORDER-purpose print job flow runs entirely within the order-service (via OrchestrationService → PrintJobsService → SimplyPrintApiClient)
  • The order-service already has SimplyPrintApiClient, PrintJobsRepository, and the idempotency/release logic
  • Keeping STOCK job queuing in the same service avoids duplicating SimplyPrint integration code across services
  • The EventSubscriberService already bridges BullMQ events to local handlers for print-job completion, shipments, and integrations

STOCK jobs are not priority-reordered. ORDER-over-STOCK priority (ADR-063) is handled by PrintJobsService.prioritizeOrderJobInQueue() on the ORDER side. STOCK jobs naturally queue behind ORDER jobs.

Consequences

Positive:

  • Stock replenishment now works end-to-end: cron → batch creation → event → SimplyPrint queue → printer
  • Idempotent: duplicate events are safely ignored (job already has SimplyPrint ID)
  • Graceful degradation: SimplyPrint API failures are logged but don't crash the event processing
  • No changes to the StockReplenishmentService itself — it continues to publish events as designed

Negative:

  • EventSubscriberService now depends on SimplyPrintApiClient (previously it only used repositories and re-emitted events)
  • Mitigation: SimplyPrintModule was already imported in EventsModule; the additional dependency is minimal

Alternatives Considered

| Alternative | Reason for Rejection |
|---|---|
| Subscribe in print-service EventSubscriberService | Print-service's ORDER flow is unused (order-service handles everything); would duplicate SimplyPrint integration code and require maintaining two parallel queue paths |
| Call SimplyPrintApiClient.addToQueue() directly in StockReplenishmentService.evaluateAndSchedule() | Mixes batch-creation concern with queue-dispatch concern; the event-driven approach allows retry/replay and keeps the replenishment service focused on scheduling logic |
| Create a dedicated StockJobDispatcherService | Over-engineering for a single subscriber; EventSubscriberService already handles similar bridge logic for print-job and shipment events |
References

  • ADR-062: Inventory Tracking and Stock Replenishment
  • ADR-063: ORDER-over-STOCK Print Queue Priority
  • ADR-052: BullMQ Event Queues for Inter-Service Async Communication
  • apps/order-service/src/events/event-subscriber.service.ts — queueStockJobToSimplyPrint() implementation
  • apps/order-service/src/stock-replenishment/stock-replenishment.service.ts — event publisher
  • docs/03-architecture/sequences/C4_Seq_11_StockReplenishment.puml — updated sequence diagram

ADR-065: SonarCloud for Continuous Code Quality Analysis

Status: Accepted
Date: 2026-03-12
Context: The codebase has ESLint for linting and Syft + Grype for container security scanning, but lacks cross-cutting code quality metrics: duplicated code detection, cognitive complexity scoring, technical debt quantification, and historical trend tracking. A dedicated static analysis platform was needed to fill this gap.

Decision

Adopt SonarCloud Team ($32/month) as the continuous code quality platform, integrated into the Azure DevOps CI/CD pipeline.

Problem

Without a cross-cutting code quality platform:

  • Duplicated code across microservices was invisible — at initial scan, 19.5% of the codebase was duplicated
  • Cognitive complexity of functions was unchecked, leading to unmaintainable business logic
  • Security hotspots (regex DoS, pseudorandom generators, publicly writable directories) were undetected
  • Technical debt had no quantification or trend tracking
  • PR reviews lacked automated quality gate enforcement

Solution

SonarCloud analyzes every push to main and every PR, providing:

  1. sonar-project.properties at the repository root — configures source directories, exclusions, coverage report paths, and rule suppressions
  2. CodeQuality job in the ValidateAndTest pipeline stage — runs after UnitTests, downloads coverage artifacts, executes SonarCloud analysis
  3. PR decoration — SonarCloud posts quality gate status and issue summaries directly on Azure DevOps pull requests
  4. Coverage integration — lcov.info reports from Vitest/Jest are uploaded to SonarCloud; sonar.coverage.exclusions aligns the denominator with the test frameworks' exclusion patterns
  5. Quality gate — blocks merges when new code introduces bugs, vulnerabilities, or excessive duplication
  6. Rule suppression — false positives and won't-fix items are suppressed via sonar.issue.ignore.multicriteria in sonar-project.properties (inline // NOSONAR comments do not work for JS/TS in SonarCloud)
  7. AI Code Assurance — SonarCloud applies a stricter quality gate to AI-generated code, requiring higher coverage and zero issues
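A condensed sketch of what sonar-project.properties might contain (items 1, 4, and 6 above). The project key, paths, and the typescript:S1234 rule key are placeholders, not the project's actual values:

```properties
# Sketch — actual keys, paths, and suppressed rules differ.
sonar.projectKey=forma3d-connect
sonar.sources=apps,libs
sonar.exclusions=**/node_modules/**,**/*.spec.ts

# Coverage: point SonarCloud at the lcov reports produced in CI
sonar.javascript.lcov.reportPaths=coverage/**/lcov.info
# Align the coverage denominator with Vitest/Jest exclusion patterns
sonar.coverage.exclusions=**/*.spec.ts,**/main.ts

# Suppressions (inline // NOSONAR does not work for JS/TS on SonarCloud)
sonar.issue.ignore.multicriteria=e1
sonar.issue.ignore.multicriteria.e1.ruleKey=typescript:S1234
sonar.issue.ignore.multicriteria.e1.resourceKey=**/legacy/**
```

Each `eN` entry pairs a rule key with a resource glob, which is what makes the suppressions auditable in one place.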

Key Configuration Decisions

| Decision | Rationale |
|---|---|
| SonarCloud (SaaS) over SonarQube (self-hosted) | Zero infrastructure overhead; staging droplet already at 96% memory usage |
| Monorepo-level scan (not per-project) | Single quality gate covers all 14 source directories; simpler than Nx-per-project scanning |
| configMode: 'file' in pipeline | All configuration centralized in sonar-project.properties, not scattered in YAML |
| CodeQuality job on MS-hosted agent | No load on the self-hosted DO build agent; SonarCloud analysis is network-bound, not CPU-bound |
| Rule suppressions in properties file | // NOSONAR does not work for TypeScript/JavaScript in SonarCloud; properties file suppressions are auditable and centralized |
| Inline comments with rule keys | Each suppressed code location has a // Sonar suppression — typescript:SXXXX: reason comment for traceability |

Results (First Week)

| Metric | Before (2026-03-12) | After (2026-03-13) | Change |
|---|---|---|---|
| Total issues | 769 | 244 | -68% |
| Bugs | 9 | 0 | -100% |
| Vulnerabilities | 12 | 0 | -100% |
| Code smells | 748 | 244 | -67% |
| Security hotspots | 6 (TO_REVIEW) | 0 | -100% |
| Duplication | 19.5% | 15.7% | -3.8pp |
| Duplicated lines | ~13,300 | 10,743 | -19% |

Consequences

Positive:

  • Every PR now has automated quality gate enforcement with inline issue annotations
  • Code duplication is visible and measurable — drove the extraction of libs/service-common (12,900 duplicated lines removed)
  • Security hotspots are reviewed and tracked
  • Technical debt is quantified with effort estimates
  • Coverage discrepancies between Azure DevOps and SonarCloud are resolved by aligning sonar.coverage.exclusions with Vitest/Jest collectCoverageFrom patterns

Negative:

  • Monthly cost of $32 for SonarCloud Team plan
  • Mitigation: Cost is trivial compared to the engineering time saved on code review and technical debt discovery
  • sonar.issue.ignore.multicriteria in properties file must be maintained as rules are suppressed
  • Mitigation: Each suppression has a documented rationale and inline code comments

Alternatives Considered

| Alternative | Reason for Rejection |
|---|---|
| SonarQube self-hosted | Staging droplet already at 96% memory; would need separate infrastructure (~$20/month for a 2 GB droplet + maintenance overhead) |
| ESLint-only (no SonarCloud) | ESLint cannot detect cross-file duplication, cognitive complexity trends, or security hotspots; no PR decoration or historical dashboards |
| CodeClimate | Less mature TypeScript support; no native Azure DevOps integration; higher cost at scale |
| Codacy | Similar capabilities but SonarCloud has stronger NestJS/React ecosystem support and the team already evaluated it in the research phase |

ADR-066: CodeCharta City Visualization for Codebase Insight

Status: Accepted
Date: 2026-03-14
Context: SonarCloud provides numeric code quality metrics (complexity, duplication, code smells, coverage), but these numbers lack spatial context. Developers cannot easily identify hotspots — large, complex, frequently-changed files — or knowledge silos (single-author modules). A visual representation was needed to make these metrics actionable for sprint planning, retrospectives, and onboarding.

Decision

Integrate CodeCharta into the CI/CD pipeline (Option C from the research document) to generate a 3D city map from SonarCloud metrics + git history, served from the existing docs container with shareable URLs.

Problem

  • SonarCloud metrics are presented as flat lists and numeric summaries — they do not convey spatial relationships between files
  • Identifying complexity hotspots, change frequency patterns, and knowledge silos requires manually correlating multiple SonarCloud views
  • New team members have no visual onboarding aid to understand codebase structure
  • No way to share preconfigured metric views with the team via bookmarkable URLs

Solution

  1. GenerateCodeCharta pipeline job in the Build stage — runs on MS-hosted ubuntu-latest, uses the codecharta/codecharta-analysis Docker image (~1.2 GB, CI-only), imports SonarCloud metrics via ccsh sonarimport and parses git history via ccsh gitlogparser, then merges both into forma3d.cc.json
  2. Artifact handoff — the .cc.json is published as a pipeline artifact and downloaded by the PackageDocs job
  3. Dockerfile integration — COPY codecharta/forma3d.cc.jso[n] in the docs Dockerfile uses a glob pattern to gracefully handle the file's absence on PR branches
  4. Nginx CORS — a /codecharta/ location block serves the file with Access-Control-Allow-Origin: https://codecharta.com so the hosted Web Studio can fetch it via XHR
  5. Shareable URLs — bookmarkable links to codecharta.com/visualization/app/index.html?file=... with preconfigured metric mappings (area=ncloc, height=cognitive_complexity, color=code_smells)
  6. Settings page link — a "Codebase City Map" link in the Help & Support section, visible only to admin users of the default tenant
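The CORS block (step 4) might look like the following sketch — the root path and exact directive set in the real nginx config may differ:

```nginx
# Sketch: serve the CodeCharta data file to the hosted Web Studio only.
location /codecharta/ {
    # Not a wildcard — only codecharta.com may fetch the file via XHR
    add_header Access-Control-Allow-Origin "https://codecharta.com" always;
    # Users always see the latest map after a pipeline run
    add_header Cache-Control "no-cache" always;
    try_files $uri =404;
}
```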

Key Configuration Decisions

| Decision | Rationale |
|---|---|
| Option C (hosted Web Studio + docs-served data) over Options A/B/D | No new container, no new DNS record, no self-hosted visualization; reuses existing docs infrastructure |
| Separate read-only SonarCloud token (SONARCLOUD_CODECHARTA_TOKEN) | Security isolation from the service connection used for analysis; revocable without affecting CI |
| fetchDepth: 0 in GenerateCodeCharta job only | Full git history needed for gitlogparser; other jobs keep shallow clones for speed |
| CORS restricted to https://codecharta.com | Not a wildcard *; only the CodeCharta Web Studio can fetch the file |
| Cache-Control: no-cache on /codecharta/ | Users always see the latest map after a pipeline run |
| Glob trick forma3d.cc.jso[n] in Dockerfile | Docker COPY fails on missing source files; the glob pattern makes it optional without conditional logic |
| PackageDocs condition updated with or() | Docs rebuild on main when CodeCharta succeeds, even if docs content hasn't changed — ensures fresh maps |
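The glob trick boils down to a single Dockerfile line (the destination path here is illustrative):

```dockerfile
# COPY fails outright on a missing source file, but a glob that matches
# zero or one file succeeds either way — the [n] character class makes
# the artifact optional without any conditional logic.
COPY codecharta/forma3d.cc.jso[n] /usr/share/nginx/html/codecharta/
```

On PR branches, where the GenerateCodeCharta job doesn't run, the glob matches nothing and the build proceeds; on main it copies the freshly generated file.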

Consequences

Positive:

  • The team can visualize the codebase as a 3D city — buildings represent files, dimensions encode metrics (lines of code, cognitive complexity, code smells, coverage, change frequency)
  • Hotspots, knowledge silos, and temporal coupling are immediately visible
  • Shareable URLs enable preconfigured views for sprint planning and retrospectives
  • Zero infrastructure cost — reuses existing docs container and publicly hosted CodeCharta Web Studio
  • The .cc.json contains only file paths and numeric metrics — no source code is exposed

Negative:

  • The GenerateCodeCharta job adds ~2–3 minutes to the pipeline on main branch builds
  • Mitigation: Runs on MS-hosted agents, in parallel with other packaging jobs
  • Dependency on the publicly hosted CodeCharta Web Studio at codecharta.com
  • Known limitation: CodeCharta's CSP (default-src 'self') blocks XHR to external origins, so the shareable ?file= URL approach does not work. Users must download the .cc.json file and drag-and-drop it into the Web Studio manually.
  • Mitigation: Option D (self-hosted visualization) can be adopted later to restore shareable URLs with a relaxed CSP
  • The codecharta/codecharta-analysis Docker image is ~1.2 GB, pulled on every main build
  • Mitigation: Docker layer caching on MS-hosted agents; image is not deployed to staging

Alternatives Considered

| Alternative | Reason for Rejection |
|---|---|
| Option A: Local developer workstation only | Not shareable; requires local tooling setup; maps not versioned |
| Option B: CI-generated artifacts without serving | Team cannot visualize without downloading and opening manually |
| Option D: Self-hosted visualization container | Adds a new container to staging; increases memory pressure and maintenance overhead |
| Custom visualization dashboard | Significant development effort; CodeCharta Web Studio is mature and feature-rich |
References

  • CodeCharta City Visualization Research
  • ADR-065: SonarCloud for Continuous Code Quality Analysis
  • ADR-006: Azure DevOps for CI/CD with Digital Ocean Hosting
  • ADR-038: Zensical for Publishing Project Documentation
  • ADR-057: Self-Hosted Build Agent with Hybrid Pipeline Strategy

ADR-067: Grype CVE Scanning with EPSS-Informed Risk Acceptance

Status: Accepted
Date: 2026-03-17
Context: The pipeline generates CycloneDX SBOMs for every container image (ADR-026), but SBOMs alone are passive inventories — they list components without evaluating them for known vulnerabilities. A quality gate was needed to prevent deploying images with exploitable CVEs.

Decision

Integrate Grype (by Anchore) into the CI/CD pipeline to scan every SBOM for known CVEs, configured to fail on High severity vulnerabilities that have available fixes (--fail-on high --only-fixed). Use a .grype.yaml exclusion file for vulnerabilities that cannot be patched at the project level. Exclude the Slicer container from scanning entirely due to its unpatchable base image.

Key Concepts

CVE (Common Vulnerabilities and Exposures): A standardized identifier for a publicly known security vulnerability. Each CVE has a severity rating (Critical, High, Medium, Low) based on the CVSS scoring system.

EPSS (Exploit Prediction Scoring System): A data-driven model maintained by FIRST.org that estimates the probability a CVE will be exploited in the wild within 30 days, expressed as a percentage (0–100%) and a percentile rank. Unlike CVSS severity (which measures potential impact), EPSS measures likelihood of actual exploitation. For example:

  • CVE-2024-9680 (Firefox): EPSS 30.8% (96th percentile) — actively exploited, high urgency
  • GHSA-p436-gjf2-799p (docker/cli): EPSS < 0.1% (1st percentile) — theoretically vulnerable but extremely unlikely to be exploited

EPSS is used in this project to inform risk acceptance decisions: Go module CVEs with near-zero EPSS scores from Alpine's docker-cli package are excluded from the scan rather than blocking deployments.
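The combined gate can be summarized as a small predicate. This is an illustrative sketch, not part of Grype: Grype implements only the `--fail-on high --only-fixed` part, while the EPSS step is the manual review that decides whether a finding becomes a documented `.grype.yaml` exclusion. Field names and the percentile threshold are assumptions:

```typescript
// Illustrative model of one scan finding; epssPercentile comes from the
// FIRST EPSS feed (0-100), not from Grype's own output.
interface Finding {
  id: string;
  severity: 'Critical' | 'High' | 'Medium' | 'Low';
  fixAvailable: boolean;
  epssPercentile: number;
}

function gateDecision(
  f: Finding,
  epssFloor = 1, // percentile at or below which exploitation is treated as negligible
): 'block' | 'accept-with-rationale' | 'informational' {
  const severe = f.severity === 'Critical' || f.severity === 'High';
  // --fail-on high --only-fixed: only fixable High/Critical findings can fail the gate
  if (!severe || !f.fixAvailable) return 'informational';
  // EPSS review: near-zero exploitation probability -> documented exclusion candidate
  return f.epssPercentile <= epssFloor ? 'accept-with-rationale' : 'block';
}
```

Under this model, CVE-2024-9680 (96th percentile) blocks, while GHSA-p436-gjf2-799p (1st percentile) is accepted with a documented rationale.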

SBOM (Software Bill of Materials): A complete inventory of components in a container image, generated by Syft in CycloneDX format (see ADR-026). Grype scans the SBOM rather than the image directly, which is faster and produces deterministic results.

Problem

  • Container images contained transitive npm dependencies with High severity CVEs (cross-spawn, minimatch, tar, glob, serialize-javascript)
  • The node:20-alpine base image bundles npm at runtime, which includes its own vulnerable dependencies (tar@6.2.1, glob@10.4.2) even though npm is not needed for running the Node.js application
  • Alpine system packages (docker-cli, zlib) contained Go module and C library CVEs
  • The Slicer container (BambuStudio v1 base image) had 800+ CVEs from its Debian 12 desktop environment
  • Without automated scanning, these vulnerabilities would accumulate silently

Solution

Pipeline integration:

Each container packaging job includes a Grype scan step after SBOM generation:

- script: |
    grype sbom:<service>-sbom.cdx.json --output table --fail-on high --only-fixed
  displayName: 'Scan SBOM for CVEs (<Service>)'
  condition: eq('${{ parameters.enableSigning }}', 'true')

The --only-fixed flag is critical: it only reports CVEs that have a fix available, preventing false failures from vulnerabilities that no one can remediate yet.

Remediation strategy (three layers):

| Layer | Source | Fix |
|---|---|---|
| npm transitive dependencies | cross-spawn, minimatch, file-type, lodash, ajv, bn.js, serialize-javascript, qs | pnpm overrides in package.json forcing patched versions |
| Docker base image bundled npm | tar@6.2.1, glob@10.4.2, cross-spawn@7.0.3 from node:20-alpine's npm | Strip npm from production images (rm -rf /usr/local/lib/node_modules/npm) |
| Alpine system packages | zlib, docker-cli Go binaries | apk upgrade --no-cache in Dockerfile production stage |
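The pnpm-overrides layer lives in the root package.json. A sketch — the version ranges below are placeholders, not the project's actual pins; the real values are whatever patched versions Grype reports as fixes:

```json
{
  "pnpm": {
    "overrides": {
      "cross-spawn": ">=7.0.6",
      "minimatch": ">=9.0.0",
      "serialize-javascript": ">=6.0.2"
    }
  }
}
```

An override applies to every occurrence of the package in the dependency tree, which is what makes it effective against transitive dependencies that cannot be updated directly.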

Risk acceptance (.grype.yaml):

Go module CVEs compiled into Alpine's docker-cli and containerd packages cannot be patched without Alpine shipping updated packages. These are excluded from the scan with documented rationale:

ignore:
  - vulnerability: GHSA-p436-gjf2-799p  # docker/cli v28→29 (High, EPSS <0.1%)
    package:
      type: go-module

All excluded CVEs have EPSS scores at the 0th–1st percentile (near-zero exploitation probability).

Slicer exclusion:

The Slicer container uses linuxserver/bambustudio:01.08.03 (Debian 12) with 38,731 SBOM components and 800+ CVEs from system packages (Firefox ESR, glibc, ffmpeg, GStreamer, Qt5), Go binaries (buildkit, runc, containerd), and Python packages (cryptography). These cannot be fixed without an upstream base image update. The grype scan is commented out with rationale, and a TODO.md entry tracks the BambuStudio v2 upgrade.

Key Design Decisions

| Decision | Rationale |
|---|---|
| --fail-on high (not critical) | Critical-only would miss many actionable High CVEs; High threshold catches the most important vulnerabilities while keeping Medium/Low informational |
| --only-fixed | Prevents pipeline failures from CVEs with no available fix — avoids blocking deployments on problems no one can solve |
| Strip npm from production images | The runtime only needs node, not npm/npx; removing npm eliminates an entire class of bundled dependency CVEs |
| pnpm overrides (not dependency updates) | Transitive dependencies can't be updated directly; overrides force specific versions across the entire dependency tree |
| .grype.yaml exclusions scoped to type: go-module | Exclusions are narrow — they only apply to Go binaries, not npm or Alpine packages |
| Slicer excluded entirely (not just Go modules) | The base image has CVEs across all layers (deb, Go, Python, npm); partial exclusions would still fail the scan |
| EPSS for risk acceptance | CVSS severity alone doesn't indicate exploitation likelihood; EPSS provides data-driven prioritization |

Consequences

Positive:

  • Every container image is scanned for CVEs before deployment — vulnerabilities cannot reach staging silently
  • The --only-fixed flag eliminates false positives from unfixable CVEs
  • Risk acceptance is documented and auditable (.grype.yaml with comments)
  • EPSS-informed decisions prevent security theater (blocking on theoretical vulnerabilities with zero exploitation probability)
  • npm stripping reduces production image attack surface beyond just CVE remediation

Negative:

  • Go module CVEs in Alpine packages require manual exclusion maintenance
  • Mitigation: .grype.yaml includes review notes; exclusions should be removed when Alpine ships updates
  • The Slicer has no CVE scanning coverage
  • Mitigation: TODO.md tracks BambuStudio v2 upgrade to reinstate scanning
  • Grype must be installed on each pipeline run (~3 seconds)
  • Mitigation: Installed to $HOME/.local/bin which persists across steps within a job
References

  • ADR-025: Cosign Image Signing for Supply Chain Security
  • ADR-026: CycloneDX SBOM Attestations
  • ADR-055: BambuStudio CLI Slicer Container
  • ADR-057: Self-Hosted Build Agent with Hybrid Pipeline Strategy
  • FIRST EPSS Model
  • Grype Documentation

ADR-068: Dependency License Compliance Check

Status: Accepted
Date: 2026-03-19
Context: The project uses ~200 npm dependencies (direct + transitive). Without automated checking, a non-permissive license (GPL, AGPL, SSPL, Commons Clause) could enter the dependency tree unnoticed through a transitive update, creating legal risk for a proprietary/commercial product. The pipeline already has CVE scanning (Grype, ADR-067) and code quality gates (SonarCloud, ADR-065), but no license compliance gate.

Decision

Add a lightweight dependency license check to the CI pipeline that fails the build if any package in the dependency tree has a non-permissive license. Use license-checker-rseidelsohn — an actively maintained fork of the original license-checker — with a small custom script (scripts/check-licenses.js).

Problem

  • Transitive dependency updates (via pnpm update or lockfile refresh) can silently introduce packages with strong copyleft licenses (GPL-2.0, GPL-3.0, AGPL-3.0) or restrictive terms (SSPL, Commons Clause)
  • The project already encountered a licensing concern with Gridfinity GRIPS (non-permissive), leading to the creation of GridFlock under MIT — demonstrating that license awareness is an active concern
  • Manual auditing of license changes in pnpm-lock.yaml is impractical at scale

Solution

Script (scripts/check-licenses.js):

A ~50-line Node.js script that:

  1. Uses license-checker-rseidelsohn to scan the full dependency tree
  2. Matches each package's license string against a disallowed pattern: GPL, AGPL, SSPL, Commons Clause (case-insensitive)
  3. Excludes private packages (the project's own UNLICENSED root)
  4. Exits with code 1 and lists offending packages if any match
  5. Exits with code 0 on success
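The matching logic (step 2) can be sketched as a pure function — shown in TypeScript for illustration; the actual check-licenses.js is plain Node.js and obtains its package-to-license map from license-checker-rseidelsohn's scan. The map shape assumed here ("name@version" → SPDX expression) is illustrative:

```typescript
// Deny-list pattern from the ADR: GPL, AGPL, SSPL, Commons Clause,
// matched case-insensitively against each package's license string.
const DISALLOWED = /GPL|AGPL|SSPL|Commons Clause/i;

// Returns human-readable "name@version: license" entries for every match;
// an empty result means the check passes (exit 0).
function findViolations(packages: Record<string, string>): string[] {
  return Object.entries(packages)
    .filter(([, license]) => DISALLOWED.test(license))
    .map(([name, license]) => `${name}: ${license}`);
}
```

Because the pattern is deliberately broad, compound expressions such as `(MIT OR GPL-2.0)` are flagged too — matching the dual-license behavior described under Consequences below.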

Pipeline integration:

The check runs as the first step in the Lint job (both azure-pipelines.yml and ci.yml), after dependency installation but before linting:

- script: pnpm run license-check
  displayName: 'Check dependency licenses (fail on non-permissive)'

Why license-checker-rseidelsohn:

| Option | Status | Rationale |
|---|---|---|
| license-checker (davglass) | Abandoned (last release Jan 2019, 75 open issues) | Not suitable for a maintained project |
| license-checker-rseidelsohn | Actively maintained fork (~200k weekly downloads) | Compatible API, receives updates and bugfixes |
| Grant (Anchore) | Active | Heavier; designed for SBOM/container scanning rather than npm dependency trees |
| pnpm licenses list | Built-in | No built-in fail-on-disallowed; requires more scripting to parse output |

Key Design Decisions

| Decision | Rationale |
|---|---|
| Deny-list (not allow-list) | New permissive licenses (e.g. BlueOak-1.0.0) shouldn't require allowlist updates; only known problematic licenses are blocked |
| Case-insensitive regex | License strings in package.json vary in casing (GPL-3.0, gpl-3.0-only, etc.) |
| Run in Lint job | Lint is the fastest-feedback job and already runs on every push; license violations are caught before tests or builds run |
| excludePrivatePackages: true | The project root has "license": "UNLICENSED" which is valid for a private project but would false-positive against a strict allow-list |
| Custom script (not CLI flags) | --failOn only matches exact license names; a regex handles compound expressions like (GPL-2.0 OR MIT) correctly |

Consequences

Positive:

  • Non-permissive licenses cannot enter the dependency tree without failing the pipeline
  • Developers get fast feedback (license check runs in ~1 second)
  • No external service dependency — runs offline against node_modules
  • Complements Grype CVE scanning: vulnerabilities are caught by Grype, license violations by license-checker

Negative:

  • Dual-licensed packages where one option is permissive (e.g. MIT OR GPL-3.0) will be flagged
  • Mitigation: The deny-list regex is intentionally broad; if a valid dual-licensed package is flagged, add a documented exception to the script
  • Does not cover non-npm dependencies (e.g. Docker base image licenses, system packages)
  • Mitigation: Container-level license scanning could be added via Grant if needed in the future
References

  • ADR-067: Grype CVE Scanning with EPSS-Informed Risk Acceptance
  • ADR-026: CycloneDX SBOM Attestations
  • ADR-065: SonarCloud for Continuous Code Quality Analysis
  • license-checker-rseidelsohn

ADR-069: Agent CLAUDE.md Governance — Repo as Source of Truth

Status: Accepted
Date: 2026-03-22
Context: The Nanoclaw agentic team (Ryan, Sam, Cody) each have a CLAUDE.md file that defines their identity, responsibilities, protocols, and behavioral rules. These files are mounted read-write into agent containers, meaning agents can technically modify their own instructions. During initial deployment, agents occasionally self-modified their CLAUDE.md or had their files overwritten during sync operations, leading to drift between what the repo contained and what was running on the droplet.

Decision

Adopt a strict governance model for agent CLAUDE.md files:

  1. The repo (agentic-team/agents/) is the single source of truth. All canonical versions of agent CLAUDE.md files live here.
  2. Individual agents must not self-modify. Each agent's CLAUDE.md contains a governance rule: "You MUST NOT modify your own CLAUDE.md or any other agent's CLAUDE.md."
  3. The Team agent (main channel) is the only agent authorized to edit CLAUDE.md files. It has write access to all group folders via additionalMounts. When Jan requests a behavioral change in the main chat, Team makes the edit.
  4. Jan and the AI assistant (Cursor) are reviewers. Changes made by Team on the droplet should be periodically synced back to the repo. Changes made in the repo should be pushed to the droplet via scp or the deploy script.

Flow

Jan (WhatsApp main chat or Cursor) → Team agent edits CLAUDE.md on droplet
                                    ↓
                          Periodic sync: droplet → repo (manual)
                                    ↓
                          Repo is the canonical record

Consequences

Positive:

  • No silent behavioral drift — agents cannot quietly rewrite their own rules
  • All changes are auditable through the repo's git history
  • Team agent provides a conversational interface for behavioral changes without needing SSH or Cursor
  • Clear chain of authority: Jan → Team → individual agents

Negative:

  • Requires discipline to sync droplet changes back to the repo — if forgotten, the repo becomes stale
  • Mitigation: Before pushing repo files to the droplet, always compare checksums first (md5sum on droplet vs md5 locally) to avoid overwriting agent-side changes
  • Agents cannot adapt their own instructions based on learned patterns — all adaptations require human approval
  • Mitigation: Agents can suggest changes by asking Jan in their group chat; Jan routes through Team

ADR-070: Per-Agent Claude Model Selection

Status Accepted
Date 2026-03-22
Context The Nanoclaw agentic team has three agents with different cognitive demands. Ryan (DevOps) and Sam (Infra) primarily run SSH health checks, query APIs, and route information — tasks that don't require deep code reasoning. Cody (Dev) diagnoses code failures, writes fixes, and opens PRs — tasks that benefit from the strongest available model. All agents were initially running on Claude Sonnet 4.6 (the default). The Anthropic API pricing difference is significant: Sonnet costs \(3/\)15 per MTok (input/output) while Opus costs \(15/\)75 — a 5x multiplier.

Decision

Configure per-agent model selection via Nanoclaw's containerConfig.model field in the registered_groups database:

  • Ryan (DevOps): Claude Sonnet 4.6 (default) — SSH checks, API queries, routing
  • Sam (Infra): Claude Sonnet 4.6 (default) — health monitoring, diagnostics
  • Cody (Dev): Claude Opus 4.6 — code reasoning, fix generation, PR creation
  • Team (main): Claude Sonnet 4.6 (default) — admin tasks, group management

The model is passed as a CLAUDE_MODEL environment variable to the agent container, which the agent-runner forwards to the Claude Agent SDK's query() call. Agents without a model override use the SDK's default (currently Sonnet).

Implementation

  1. Added model?: string to Nanoclaw's ContainerConfig type
  2. Container-runner reads group.containerConfig.model and injects -e CLAUDE_MODEL=... into Docker args
  3. Agent-runner passes process.env.CLAUDE_MODEL to the SDK's query({ options: { model } })
  4. Cody's database entry: container_config = '{"model":"claude-opus-4-6"}'
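The docker-arg injection (step 2) reduces to a small helper. This is a sketch, not Nanoclaw's actual code — only the `containerConfig.model` field name and the `CLAUDE_MODEL` variable come from the ADR; the function and type names are illustrative:

```typescript
// Illustrative model of the per-group container config (ADR-070).
interface ContainerConfig {
  model?: string; // e.g. 'claude-opus-4-6'; absent -> SDK default (currently Sonnet)
}

// Build the extra `docker run` env args for a group's agent container.
// Groups without an override get no env var, so the agent-runner falls
// back to the Claude Agent SDK's default model.
function buildEnvArgs(config: ContainerConfig): string[] {
  return config.model ? ['-e', `CLAUDE_MODEL=${config.model}`] : [];
}
```

On the container side, the agent-runner then forwards process.env.CLAUDE_MODEL into the SDK's query({ options: { model } }) call, as described in step 3.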

Limitation

Agents cannot reliably self-report which model they are using when asked conversationally. The Claude Agent SDK does not expose the active model name to the agent's own context. Verification must be done through the Anthropic console usage logs or by inspecting the container's environment variables (docker inspect).

Consequences

Positive:

  • Cody produces higher-quality fixes with fewer retry loops, potentially offsetting the higher per-token cost
  • Ryan and Sam stay cost-efficient on Sonnet for tasks that don't need deep reasoning
  • Model selection is per-agent, not global — can be tuned independently
  • Easy to change: single database update + restart, no code changes

Negative:

  • Opus invocations are 5x more expensive — a typical Cody fix cycle costs $2.50-10.00 vs $0.50-2.00 on Sonnet
  • Mitigation: Prepaid Anthropic credit with no auto-reload acts as a hard spending cap; CONTAINER_TIMEOUT limits per-invocation token burn
  • Modifying Nanoclaw's upstream source (container-runner, agent-runner, types) means patches must be re-applied after upgrades
  • Mitigation: Document the patches in agentic-team/README.md troubleshooting section

References