Dockerizing a Monorepo
Topic: Technology
I was doing a technical test for a company a few days ago. It was a simple CRUD application with a React frontend and a Go backend. I really didn't want to put in any more effort than necessary: just put both of them up inside an EC2 instance along with the database and call it a day. It turns out I didn't know enough about Docker for it to be as easy as I thought.
The Structure
The project is a monorepo with a client and server folder, and also a docker-compose.yml file at the root.
∟ client
∟ server
∟ docker-compose.yml
The docker-compose file's job is to build both the client and the server, and also to run a MySQL image along with a migration tool (more on that in the migration section).
The Client
Dockerizing the client is a breeze, as I was using Vite and there is plenty of documentation on how to deploy it. The problems come during the build process. Here is the Dockerfile I used.
FROM node:18-alpine AS base
# Install dependencies only when needed
FROM base AS deps
RUN apk add --no-cache libc6-compat
WORKDIR /app
# Install dependencies based on the preferred package manager
COPY package.json yarn.lock* package-lock.json* pnpm-lock.yaml* ./
ARG VITE_BASE_URL
ENV VITE_BASE_URL=${VITE_BASE_URL}
RUN if [ -f yarn.lock ]; then yarn --frozen-lockfile; \
elif [ -f package-lock.json ]; then npm ci; \
elif [ -f pnpm-lock.yaml ]; then corepack enable pnpm && pnpm i --frozen-lockfile; \
else echo "Lockfile not found." && exit 1; \
fi
# Rebuild the source code only when needed
FROM base AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN if [ -f yarn.lock ]; then yarn run build; \
elif [ -f package-lock.json ]; then npm run build; \
elif [ -f pnpm-lock.yaml ]; then corepack enable pnpm && pnpm run build; \
else echo "Lockfile not found." && exit 1; \
fi
FROM nginx:1.21.0-alpine as production
ENV NODE_ENV production
# Copy built assets from builder
COPY --from=builder /app/dist /usr/share/nginx/html
# Add your nginx.conf
COPY nginx.conf /etc/nginx/conf.d/default.conf
# Expose port
EXPOSE 80
# Start nginx
CMD ["nginx", "-g", "daemon off;"]
That's pretty long, so let's split it up and walk through it part by part. As you may have guessed, I used a multi-stage build to minimize the final image size. Let's look at the first stage.
Dependencies stage
FROM node:18-alpine AS base
# Install dependencies only when needed
FROM base AS deps
RUN apk add --no-cache libc6-compat
WORKDIR /app
# Install dependencies based on the preferred package manager
COPY package.json yarn.lock* package-lock.json* pnpm-lock.yaml* ./
ARG VITE_BASE_URL
ENV VITE_BASE_URL=${VITE_BASE_URL}
RUN \
if [ -f yarn.lock ]; then yarn --frozen-lockfile; \
elif [ -f package-lock.json ]; then npm ci; \
elif [ -f pnpm-lock.yaml ]; then corepack enable pnpm && pnpm i --frozen-lockfile; \
else echo "Lockfile not found." && exit 1; \
fi
I used node:18-alpine as the base image. Why version 18? I'm honestly not sure. I did pick the alpine variant for the smaller image size, though.
Using that base, I created another stage called deps, where I used apk, alpine's package manager, to install something called libc6-compat. This is because alpine ships musl libc instead of the usual glibc, which may cause compatibility issues with some packages. I figured the risk wasn't worth saving a few megabytes of image size, so I just threw it in there.
Next is the fun part. Environment variables in Vite are bundled in at build time, which means they have to be supplied during the build stage, not the run stage. I first accepted VITE_BASE_URL as a build argument, and then passed it into an environment variable.
You may have noticed that this is actually the dependency stage, not the build stage. The reason I shoved the env variable in here is simply because I am stupid.
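Worth noting: an ARG only gets a value if something actually passes one in at build time; a runtime environment: entry in docker-compose.yml never reaches it. A minimal sketch of how the build argument could be supplied from the compose file (the URL here is just a placeholder):
client:
  build:
    context: ./client
    dockerfile: Dockerfile
    args:
      - VITE_BASE_URL=http://localhost:3000
The same thing can be done by hand with docker build --build-arg VITE_BASE_URL=... ./client.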
Build stage
# Rebuild the source code only when needed
FROM base AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN if [ -f yarn.lock ]; then yarn run build; \
elif [ -f package-lock.json ]; then npm run build; \
elif [ -f pnpm-lock.yaml ]; then corepack enable pnpm && pnpm run build; \
else echo "Lockfile not found." && exit 1; \
fi
This is the actual build stage. I copied node_modules from the deps stage, and then copied the rest of the React app from the host into the image.
You may have already noticed from the previous stage that I use an if-else chain to detect whether the package manager is yarn, npm, or pnpm. Honestly, it's overkill; I just use npm anyway.
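If, like me, you only ever use npm, the whole if-else dance can be dropped. A trimmed-down sketch of the deps and builder stages (same idea, npm only, with the build arg moved to the stage that actually uses it):
FROM node:18-alpine AS base

# Install dependencies only when needed
FROM base AS deps
RUN apk add --no-cache libc6-compat
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci

# Rebuild the source code only when needed
FROM base AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
# The build-time variable belongs next to the step that consumes it
ARG VITE_BASE_URL
ENV VITE_BASE_URL=${VITE_BASE_URL}
RUN npm run build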
Run stage
FROM nginx:1.21.0-alpine as production
ENV NODE_ENV production
# Copy built assets from builder
COPY --from=builder /app/dist /usr/share/nginx/html
# Add your nginx.conf
COPY nginx.conf /etc/nginx/conf.d/default.conf
# Expose port
EXPOSE 80
# Start nginx
CMD ["nginx", "-g", "daemon off;"]
Finally, I used an nginx:1.21.0-alpine image as the server. Why did I use this outdated version when the latest version of the image is 1.26? Again, I'm not sure myself.
I copied the built assets from the builder stage into nginx's html folder, and I also copied an nginx.conf file from the host as the config file. Here is what it looks like.
server {
    listen 80;

    location / {
        root /usr/share/nginx/html/;
        include /etc/nginx/mime.types;
        try_files $uri $uri/ /index.html;
    }
}
Standard stuff. It serves the static files on port 80, and the try_files directive falls back to index.html for unknown paths, so client-side routing survives a refresh. Port 80 is why I exposed it in the Dockerfile, and I later mapped it to the host's port 80 in docker-compose.yml too.
The Server
The Dockerfile for the Go server is much simpler. Who'd've thunk something that compiles to a single binary would be easier to build than wrangling an index.js file out of a React app bundled with Vite? This is one of the reasons I chose Go for the backend in the first place. It's just less headache.
Anyway, here's the Dockerfile.
# Builder
FROM golang:1.22.2-alpine as builder
RUN apk update && apk upgrade && \
apk --update add git make bash build-base
WORKDIR /app
COPY . .
RUN make build
# Distribution
FROM alpine:latest
RUN apk update && apk upgrade && \
apk --update --no-cache add tzdata && \
mkdir /app
WORKDIR /app
EXPOSE 3000
COPY --from=builder /app/engine /app/
CMD /app/engine
Like the client, I had a builder and a runner stage. The builder stage actually utilized a Makefile for some scripting. Here is what it looks like.
build: ## Builds binary
	@ printf "Building application... "
	@ go build \
		-trimpath \
		-o engine \
		./app/
	@ echo "done"
Dead simple. It just builds the Go application and spits out an engine binary into the working directory (which is /app inside the builder image). That binary is then copied into the runner stage and run there, with port 3000 exposed.
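If you want to sanity-check the server image on its own before wiring it into compose, something like this works. Every value below is a placeholder, and DATABASE_HOST has to point at wherever a MySQL instance is actually reachable from inside the container:
# Build the server image and run it standalone (all values are placeholders)
docker build -t crud-server ./server
docker run --rm -p 3000:3000 \
  -e RUN_MODE=production \
  -e DATABASE_HOST=some-reachable-mysql-host \
  -e DATABASE_PORT=3306 \
  -e DATABASE_NAME=app \
  -e DATABASE_USER=app \
  -e DATABASE_PASS=secret \
  crud-server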
docker-compose.yml
Now let's tie all of this together, plus a MySQL database and a database migration step, in a single docker-compose.yml.
services:
  mysql:
    image: mysql:8.3
    command: mysqld --user=root
    environment:
      - MYSQL_DATABASE=${DATABASE_NAME}
      - MYSQL_USER=${DATABASE_USER}
      - MYSQL_PASSWORD=${DATABASE_PASS}
      - MYSQL_ROOT_PASSWORD=${DATABASE_PASS}
    ports:
      - '3306'
    healthcheck:
      test: ['CMD', 'mysqladmin', 'ping', '-h', 'localhost']
      timeout: 5s
      retries: 10
  migrate:
    image: migrate/migrate
    volumes:
      - ./server/database/migrations:/migrations
    links:
      - mysql
    depends_on:
      mysql:
        condition: service_healthy
    command:
      [
        '-path',
        '/migrations',
        '-database',
        'mysql://${DATABASE_USER}:${DATABASE_PASS}@tcp(mysql:3306)/${DATABASE_NAME}?multiStatements=true',
        'up',
        '3',
      ]
  server:
    build:
      context: ./server
      dockerfile: Dockerfile
    environment:
      - RUN_MODE=production
      - DATABASE_HOST=mysql
      - DATABASE_PORT=3306
      - DATABASE_NAME=${DATABASE_NAME}
      - DATABASE_USER=${DATABASE_USER}
      - DATABASE_PASS=${DATABASE_PASS}
    ports:
      - '3000:3000'
    depends_on:
      mysql:
        condition: service_healthy
  client:
    build:
      context: ./client
      dockerfile: Dockerfile
    environment:
      - VITE_BASE_URL=http://${HOST_IP}:3000
    depends_on:
      - server
    ports:
      - '80:80'
What a doozy. Let's take it apart service by service. Starting with the heart of it all, the mysql service.
MySQL
mysql:
  image: mysql:8.3
  environment:
    - MYSQL_DATABASE=${DATABASE_NAME}
    - MYSQL_USER=${DATABASE_USER}
    - MYSQL_PASSWORD=${DATABASE_PASS}
    - MYSQL_ROOT_PASSWORD=${DATABASE_PASS}
  ports:
    - '3306'
  healthcheck:
    test: ['CMD', 'mysqladmin', 'ping', '-h', 'localhost']
    timeout: 5s
    retries: 10
This one is quite simple. I'm using the mysql:8.3 image (it's about 600 MB), declaring some environment variables, telling it which port to use, and then adding a healthcheck. The healthcheck is critical. Don't skip it: the migrate and server services rely on it to know when the database is actually ready to accept connections.
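The ${DATABASE_NAME}-style values are not defined anywhere in the compose file itself. Compose substitutes them from the shell environment or from a .env file sitting next to docker-compose.yml, something like this (all values here are made up):
# .env at the repo root, next to docker-compose.yml
DATABASE_NAME=app
DATABASE_USER=app
DATABASE_PASS=supersecret
HOST_IP=203.0.113.10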
Migration
migrate:
  image: migrate/migrate
  volumes:
    - ./server/database/migrations:/migrations
  links:
    - mysql
  depends_on:
    mysql:
      condition: service_healthy
  command:
    [
      '-path',
      '/migrations',
      '-database',
      'mysql://${DATABASE_USER}:${DATABASE_PASS}@tcp(mysql:3306)/${DATABASE_NAME}?multiStatements=true',
      'up',
      '3',
    ]
This one is not as simple. Remember the migration tool I mentioned before? This is it. The MySQL instance spawned by the mysql service is empty; we need to fill it with our tables, and maybe some seed data. In a development environment, a tool like this is very convenient, as you can sync your database with the SQL files inside your migrations folder like so.
migrate -source file://path/to/migrations -database postgres://localhost:5432/database up 2
However, it can get a little tricky with a dockerized database. As you can see, I explicitly linked the migrate service to the mysql service. That is entirely unnecessary. What is necessary, however, is the depends_on key: it waits for the mysql service to report healthy before running.
I also set up a volume that maps the host path where the migration files live during development to the migrations folder inside the container.
./server/database/migrations:/migrations
The hostname in the command is also the mysql service name instead of localhost, since services on the same Docker network can, and should, reach each other by name.
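For reference, migrate expects the files in that folder to follow its {version}_{title}.up.sql / .down.sql naming convention, so the mounted directory looks something like this (the file names are made up):
∟ server/database/migrations
  ∟ 000001_create_users_table.up.sql
  ∟ 000001_create_users_table.down.sql
  ∟ 000002_create_items_table.up.sql
  ∟ 000002_create_items_table.down.sql
  ∟ 000003_seed_items.up.sql
  ∟ 000003_seed_items.down.sql
The up 3 at the end of the command simply applies the next three up migrations in order.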
Server and Client
I'm gonna just do both of them in one section, because at this point, I think you already get the idea.
server:
  build:
    context: ./server
    dockerfile: Dockerfile
  environment:
    - RUN_MODE=production
    - DATABASE_HOST=mysql
    - DATABASE_PORT=3306
    - DATABASE_NAME=${DATABASE_NAME}
    - DATABASE_USER=${DATABASE_USER}
    - DATABASE_PASS=${DATABASE_PASS}
  ports:
    - '3000:3000'
  depends_on:
    mysql:
      condition: service_healthy
client:
  build:
    context: ./client
    dockerfile: Dockerfile
  environment:
    - VITE_BASE_URL=http://${HOST_IP}:3000
  depends_on:
    - server
  ports:
    - '80:80'
Both of them have their build context pointed at their respective directory, with the Dockerfile name explicitly spelled out. The server depends on the mysql service, and the client depends on the server. You may have noticed something amiss here: the server does not have a healthcheck, which means the client never knows whether the server is actually healthy. It just assumes it is. Don't do this in production. Always healthcheck.
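A sketch of what that could look like, assuming the API exposes some cheap health endpoint. The /health path here is hypothetical, and busybox wget is already available in the alpine runtime image, so no extra packages are needed:
server:
  # ...same build, environment, and ports as above...
  healthcheck:
    # /health is a hypothetical endpoint; point this at whatever cheap GET your API has
    test: ['CMD', 'wget', '--spider', '-q', 'http://localhost:3000/health']
    interval: 5s
    timeout: 5s
    retries: 10
With that in place, the client's depends_on could be switched to the condition: service_healthy form, just like the server does with mysql.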
You may also notice that the client's environment uses your public IPv4 address and not the server service name. This loops back to what I said about Vite bundling environment variables at build time, which happens way before any of these services are up. Furthermore, if I kept it as localhost, then when someone halfway across the world accesses the client, their browser would try to hit their own localhost. Ideally, you'd point this variable at a proper DNS name behind HTTPS instead of rawdogging an insecure IPv4 address over plain HTTP. Alas, I didn't have enough time before the technical test deadline.
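Another way out, which I didn't ship, is to let nginx proxy API calls to the server service over the compose network, so the client only ever calls a relative path like /api and nothing host-specific gets baked into the bundle. A sketch, assuming the backend is fine being served under an /api prefix:
# Hypothetical addition inside the existing server block in nginx.conf:
# forward API traffic to the Go server over the compose network
location /api/ {
    proxy_pass http://server:3000/;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
}
With that, VITE_BASE_URL could just be /api.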
Deployment
I just used a free-tier EC2 instance, with the intention of killing it once the reviewer was done with the technical test. I didn't even point a custom domain at it; I just sent their HR a suspicious-looking IPv4 HTTP link. It seems to have worked, because they asked for an interview the very next day. C'est la vie.