RAVENCS
Financial Technology / Point of Sale

3 Java Microservices Migrated to AWS ECS with Automated CI/CD

A fintech company building a cloud-based POS system needed its monolithic deployment process replaced with containerized microservices on ECS. The work included solving ENI limits, port conflicts, and a production database crisis.

Services

Containerization & ECS Migration, DevOps Pipeline, Managed Cloud Operations

Key Result

3 services in production on ECS with sub-10-minute deployments

// challenge

The Challenge

A fintech company was building a cloud-based point-of-sale system that integrates with a major ERP platform. The application consisted of 3 Java microservices: an API Gateway (Spring Cloud Gateway), an Authentication Service (JWT-based), and a Transaction Processing Service. All three were built on Java 17 and Spring Boot 3.2 in a Maven multi-module project. They needed these services containerized and running on AWS with a reliable CI/CD pipeline.

The migration was not straightforward. The development team worked on Apple Silicon (M-series) machines, so every Docker build had to explicitly target linux/amd64 for AWS compatibility. The initial infrastructure plan used awsvpc networking on a t3.medium instance, but we hit ENI limits almost immediately: t3.medium supports only 3 elastic network interfaces and every awsvpc task needs its own, so the cluster could run at most 3 tasks. Rolling deployments were impossible because there was no room to launch a new task before draining the old one.
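The arithmetic behind the ENI ceiling can be sketched as follows. This is a minimal illustration, assuming ENI trunking is not in use. As an assumption beyond the text, it also shows the stricter count in which the instance's own primary interface occupies one of the 3 ENIs. The function names are hypothetical.

```python
def max_awsvpc_tasks(eni_limit: int, enis_used_by_host: int = 1) -> int:
    """Upper bound on awsvpc tasks one instance can host without ENI trunking.

    Each awsvpc task needs its own ENI. enis_used_by_host defaults to 1 on the
    assumption that the instance's primary interface takes one slot.
    """
    return max(eni_limit - enis_used_by_host, 0)


def can_roll(max_tasks: int, running_tasks: int) -> bool:
    """A rolling deployment needs at least one free slot for the replacement task."""
    return running_tasks < max_tasks


T3_MEDIUM_ENI_LIMIT = 3  # figure quoted in the text above

for host_enis in (0, 1):
    capacity = max_awsvpc_tasks(T3_MEDIUM_ENI_LIMIT, enis_used_by_host=host_enis)
    print(f"host ENIs={host_enis} capacity={capacity} "
          f"can roll with 3 services: {can_roll(capacity, running_tasks=3)}")
```

Under either count, three running services leave no free slot for a replacement task, which matches the stalled rolling deployments described above.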

After the migration was stable, a production database incident tested the entire stack. A critical table had grown to 677,000 rows, and a missing composite index pushed the RDS instance to 100% CPU utilization. Separately, an external API integration began bombarding the Authentication Service with expired tokens at high volume, exhausting the connection pool and causing database deadlocks. Both incidents required root cause analysis and zero-downtime remediation under pressure.
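An expired-token flood like this one is load that can, in principle, be rejected before any database work happens. Below is a minimal sketch of such a pre-check, assuming standard three-part JWTs with a numeric `exp` claim. The function names are hypothetical, and the choice to read `exp` without verifying the signature is an assumption for illustration, not the Authentication Service's actual code.

```python
import base64
import json
import time
from typing import Optional


def is_expired_unverified(token: str, now: Optional[float] = None) -> bool:
    """Return True only when the token is well-formed and its exp claim has passed.

    The signature is NOT verified here: this is a cheap reject filter only.
    Any token that is not clearly expired must still go through full validation.
    """
    try:
        payload_b64 = token.split(".")[1]
        padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped padding
        claims = json.loads(base64.urlsafe_b64decode(padded))
        exp = claims["exp"]
    except (IndexError, ValueError, KeyError, TypeError):
        return False  # malformed: leave the decision to the full validator
    if not isinstance(exp, (int, float)):
        return False
    current = time.time() if now is None else now
    return current >= exp


def make_demo_token(claims: dict) -> str:
    """Build an unsigned, JWT-shaped string for the demo (hypothetical helper)."""
    def enc(obj: dict) -> str:
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{enc({'alg': 'HS256'})}.{enc(claims)}.signature"


print(is_expired_unverified(make_demo_token({"exp": 1000}), now=2000))  # True
print(is_expired_unverified(make_demo_token({"exp": 3000}), now=2000))  # False
```

Because the filter can only reject, a forged `exp` in the future gains nothing: that token still has to pass full signature validation.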

// approach

Our Approach

  • Containerization: Dockerized all 3 services with multi-stage builds. Configured explicit --platform linux/amd64 targeting in both the Dockerfiles and the CI pipeline to solve the ARM64-to-AMD64 build mismatch from M-series development machines.
  • ECS architecture: Deployed on ECS EC2 (t3.2xlarge) with bridge mode networking instead of awsvpc. Bridge mode eliminated the ENI limit bottleneck entirely, allowing all services to share the host network interface. An ALB terminates HTTPS and routes traffic to the Spring Cloud Gateway on port 8085, which forwards to Authentication (8083) and Transaction Processing (8080). The backend is RDS MySQL, scaled from db.t3.xlarge to db.t3.2xlarge during the incidents.
  • Bridge mode deployment strategy: Bridge mode introduces port conflicts during deployments, because two copies of a service cannot bind the same host port on one instance. We implemented a per-service deployment strategy that scales down to zero and then scales back up, with ALB health checks gating traffic until the new task is healthy.
  • CI/CD pipeline: Bitbucket Pipelines triggers on merge to main. The pipeline builds the Docker image, pushes it to ECR, registers a new ECS task definition revision, and updates the service. A code push reaches production in under 10 minutes.
  • Dedicated worker node: Spun up a separate c6i.2xlarge (compute-optimized) instance running an independent ECS service for background job processing (cron jobs and batch operations), isolated from the request-serving cluster to prevent resource contention.
  • Database incident resolution: Diagnosed the 677K-row table causing 100% CPU via slow query analysis. Added a composite index with zero downtime. For the connection pool exhaustion, traced the root cause to an external API flooding the service with expired auth tokens, then implemented request throttling, connection pool tuning, and token validation short-circuiting that rejects those tokens before the database round-trip.
  • Observability: CloudWatch alerting configured on CPU utilization, memory utilization, and ECS task count. Established a weekly operational review cadence covering deployment metrics, error rates, and infrastructure health.
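The scale-down, then scale-up sequence from the bridge mode deployment bullet can be sketched as below. This is an illustration under assumptions, not the pipeline's actual script. It expects an object shaped like boto3's ECS client (`update_service` and the `services_stable` waiter), and the cluster, service, and task definition names are hypothetical. A recording stand-in replaces the real client so the sketch runs without AWS access.

```python
def redeploy_bridge_service(ecs, cluster: str, service: str,
                            task_definition: str, desired_count: int = 1) -> None:
    """Stop the old task first so its host port is free, then start the new revision."""
    waiter = ecs.get_waiter("services_stable")

    # 1. Scale to zero: releases the host port held by the old task.
    ecs.update_service(cluster=cluster, service=service, desiredCount=0)
    waiter.wait(cluster=cluster, services=[service])

    # 2. Scale back up on the new task definition revision.
    ecs.update_service(cluster=cluster, service=service,
                       taskDefinition=task_definition, desiredCount=desired_count)
    waiter.wait(cluster=cluster, services=[service])


class RecordingEcs:
    """Stand-in for boto3.client("ecs") that only records calls (demo only)."""

    def __init__(self):
        self.calls = []

    def update_service(self, **kwargs):
        self.calls.append(("update_service", kwargs))

    def get_waiter(self, name):
        parent = self

        class _Waiter:
            def wait(self, **kwargs):
                parent.calls.append(("wait:" + name, kwargs))

        return _Waiter()


fake = RecordingEcs()
redeploy_bridge_service(fake, "pos-cluster", "auth-service", "auth-service:42")
for name, kwargs in fake.calls:
    print(name, kwargs)
```

With a real client, the same function would be called with `boto3.client("ecs")`. The service is unavailable between the two steps, so the ALB health check is what keeps traffic away until the new task reports healthy.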

// results

Results

3: Microservices containerized and running on ECS
<10 min: Time from code push to production deployment
0: Downtime during database incident remediation
677K rows: Table optimized with composite index
100% → <15%: RDS CPU after index optimization
1: Dedicated compute-optimized worker node
Weekly: Operational review cadence established

"Raven CS took our services from manual deployments to a fully automated pipeline in weeks. When we hit a database crisis in production, they diagnosed and fixed it without any downtime." — CTO, Fintech Company

// next step

Have a similar challenge?

We work with companies across Latin America and the US. Tell us what you’re dealing with: no sales deck, no commitment.
