AWS CodePipeline for React App

January 21, 2021

Description:

Creating a CI/CD Pipeline for a React application using AWS CodePipeline

Intro

🛑 The method described in this article is no longer my recommended method. I now strongly recommend that you deploy a React application using static files in an S3 bucket as described here:

📘 ncoughlin: AWS CodePipeline for React App to S3 Bucket

If you are dead set on deploying a React app on an EC2 instance this article may be helpful, otherwise please see above.

This article covers the process of creating a CI/CD pipeline for a React application using AWS CodePipeline. It covers the complete process for a Node/React application, including the creation of the appspec.yml and buildspec.yml files, as well as the shell scripts that the appspec.yml file requires.

A continuous integration and deployment (CI/CD) pipeline is a system that we set up so that when we push commits to a repo, the code gets built and deployed to a server automatically.

Let us imagine what the process would be to update the code on an AWS EC2 instance without a system like this in place. First we would commit our code changes to our repo (wherever that is), then we would manually log in to our EC2 server using our SSH credentials and using the CLI we would stop the server, pull our changes from our repo manually and then run the build command again. And this is just the simplest scenario where we are not dealing with test environments, multiple developers, or horizontally scaled instances being managed by a load balancer.

Besides removing a lot of the manual labour from the process, the real benefit of getting this system set up on AWS is that once our application has scaled up a bit horizontally and we have multiple EC2 instances running, with traffic being directed to them by a load balancer, an AWS CodePipeline is designed to work with the Load Balancer to automatically redirect traffic to other instances while one of them is being updated. In this way your servers get updates with your new code and your users never experience any down time.

There are some other services that abstract a lot of the complexity of this process away. For example, Netlify (where this website is hosted) is set up so that every time I push a commit, it detects that change in the repo, pulls it, builds it and deploys it. I don't have to deal with any of that manually. Netlify also makes that process very simple, which is one of the reasons why it is so popular for static websites. However, while Netlify vastly simplifies this process, it does so by hiding a lot of the control that is required for commercial applications, and would therefore not be appropriate for a true web application. Let us briefly list some of the items that we are in complete control of by manually setting up a CodePipeline:

The exact repository and branch that should listen for code changes
The exact kind of changes that branch receives
The operating system of our servers
What version of node we are using
Compatible with Docker images (optional)
Compatible with Jenkins (optional)
Can use custom non-publicly available packages during the build phase
Integrates with CloudWatch
Integrates with AWS Load Balancers
Ability to chain custom shell scripts at multiple phases of deployment (required)
Ability to deploy only specific files from latest build into specific locations on servers (optional)
The ability to deploy to groups of horizontally scaled EC2 instances

So we can see that there are some advantages to setting all this up manually. As a note, AWS also offers a few services to simplify this process, such as Amplify and CodeStar. However, they accomplish this the same way that Netlify does: by hiding the details that give us control. Therefore we are going to focus here on the full process using CodePipeline.

CodePipeline

Important Notes

A brief note on naming conventions. As you are going through and creating projects and pipelines, I recommend a naming convention that takes into account your project as well as your deployment environment. For example, myproject-DEV, myproject-TEST and myproject-LIVE, as opposed to just myproject.
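As a small illustration of that convention, here is a sketch of a shell helper that builds a name from a project and an environment. The function name pipeline_name is hypothetical and is not part of any AWS tool. It only shows one way to keep names consistent if you script your setup.

```shell
# Hypothetical helper (not an AWS command): build a resource name
# from a project name and an environment name.
# Usage: pipeline_name <project> <environment>
pipeline_name() {
  local project="$1"
  local env="$2"
  # upper-case the environment so names read myproject-DEV, myproject-LIVE, ...
  printf '%s-%s\n' "$project" "$(printf '%s' "$env" | tr '[:lower:]' '[:upper:]')"
}

pipeline_name myproject dev    # prints myproject-DEV
pipeline_name myproject live   # prints myproject-LIVE
```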

CodePipeline is the glue that ties the following four developer tools together. In order to create a complete pipeline you will need to go through the following steps:

Source - CodeCommit

Artifacts - CodeArtifact

Build - CodeBuild

Deploy - CodeDeploy

You will have settings that you need to configure in each of these stages, and then in the pipeline service you simply select the application that you saved in each of these previous steps. Let's take a quick walk through of what we need in each of these sections.

Source - CodeCommit

This one is pretty simple. In my case the source is a GitHub repository, so this step is simply linking to your GitHub account and selecting a repository inside of that account, as well as a branch on that repository. (CodeCommit is AWS's own Git hosting service, and a CodeCommit repository can be selected as the source in the same way.) We can set it up so that the pipeline is triggered when a change is pushed to a particular branch. For example, I have set up my most recent application to trigger the pipeline when I push my code to the branch titled "live".

The source stage is the part of the pipeline that listens for changes to the repo and then initiates the other steps in the pipeline.
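To make the branch trigger concrete, here is a minimal sketch of the decision the source stage makes for you. The function should_trigger is hypothetical and only illustrates the logic. CodePipeline handles this internally once you pick a branch, so you never write this yourself.

```shell
# Hypothetical illustration of the trigger rule: succeed only when
# the pushed ref is the branch the pipeline is watching.
should_trigger() {
  local pushed_ref="$1"      # e.g. refs/heads/live
  local watched_branch="$2"  # e.g. live
  [ "$pushed_ref" = "refs/heads/$watched_branch" ]
}

for ref in refs/heads/live refs/heads/main refs/tags/live; do
  if should_trigger "$ref" live; then
    echo "$ref -> start pipeline"
  else
    echo "$ref -> ignore"
  fi
done
```

Only the push to the "live" branch starts the pipeline here. Pushes to other branches, or a tag that happens to share the name, are ignored.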

Artifacts - CodeArtifact

CodeArtifact is the worst named service, probably of all time. It is not to be confused with AWS Artifact, which manages compliance reports. AWS CodeArtifact is a software package repository. It exists so that companies have a place to store any custom or proprietary software packages that they use, in addition to the publicly available ones (npm packages). So basically it combines all of your custom software packages and your public ones into one repository that your application then pulls from during the build stage.

Basically it just creates an S3 bucket for you and you give it a name. Next.

Build - CodeBuild

CodeBuild is the service that fetches your repository, gathers your artifacts from CodeArtifact and then builds a container to build the new version of your application.

CodeBuild is one of the more confusing parts of this pipeline process, and I was totally stonewalled for a bit until I wrapped my head around what is happening here. Going into this whole process I assumed that the build of the application would happen on the actual EC2 instance where the application is hosted, just like I build the application on my local machine. However, that is not the case. CodeBuild takes your repository files and the artifacts (packages), and then temporarily provisions a container for the sole purpose of building the application files. Then those built files are deployed to the EC2 instance that is hosting the application using CodeDeploy.

Understanding this is key to understanding this whole process.

CodeBuild requires your project to be built in a container. A container is a virtualized operating system that is running the environment you need, like Node running on Ubuntu. You are given two choices here. You can either use Docker to specify everything about your container and supply a Docker image to the CodeBuild project, or you can use AWS's own managed container images, where you specify the settings in a special file called buildspec.yml. CodeBuild will find that file and set up the build based on the settings you have put into it.

And if we think about it, this makes sense. The build process takes time. One of the goals of this system is for the EC2 servers running the actual application to be down for as little time as possible. If we complete the build somewhere else and then just quickly swap in the built files, our application will be offline for the smallest amount of time possible. And if we are building the application somewhere else, we know that building the application requires an environment running Node.

This is the part that gets a bit confusing. We need to give CodeBuild some very specific instructions, such as what version of Node our application requires, and what our build commands are. CodeBuild does not assume any of these things. The way that we give CodeBuild these instructions is using a buildspec.yml file. You can read more about those files here.

📘 AWS Docs: Buildspec File Reference.

Here is a sample of my current buildspec file.

buildspec.yml
version: 0.2

phases:
  install:
    runtime-versions:
      nodejs: 12
    commands:
      # install dependencies listed in package.json
      - npm install

  build:
    commands:
      # run build script
      - npm run-script build

artifacts:
  # include all files required to run application
  # notably excluded is node_modules, as this will cause overwrite error on deploy
  files:
    - public/**/*
    - src/**/*
    - package.json
    - appspec.yml
    - scripts/**/*

This article is not a replacement for the actual documentation linked above, so I'm not going to delve too deeply into specifics on this. But in a nutshell, we can see that we have specified our version of Node, we have used npm to install all of our dependencies, and then we have triggered the build process.

Most of the items in this file should be obvious. However, I have found that it is critical to exclude the node_modules folder in the artifacts > files section. To do that, we list every file and folder we need except node_modules. If you include the node_modules folder you will get an overwrite error in the deploy stage that is very tricky to solve.

This will not cause any problems with the application, because the node_modules folder does not need to travel with the artifact. During the build phase Webpack grabs everything it needs from it, tree shakes out anything it doesn't and bundles the result. In addition, the after_install.sh script shown later in this article runs npm install on the instance, so any dependencies the running application needs are restored there.
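To sanity check which paths the artifacts > files list would pick up, here is a rough sketch that mirrors the five patterns from the buildspec above. The function is_artifact is hypothetical, and shell case patterns only approximate how CodeBuild matches globs, so treat it as an illustration and not as CodeBuild's actual matching logic.

```shell
# Hypothetical helper mirroring the artifacts > files list:
# public/**/*, src/**/*, scripts/**/*, package.json, appspec.yml
is_artifact() {
  case "$1" in
    public/*|src/*|scripts/*|package.json|appspec.yml) return 0 ;;
    *) return 1 ;;
  esac
}

for path in src/App.js public/index.html package.json node_modules/react/index.js; do
  if is_artifact "$path"; then
    echo "include  $path"
  else
    echo "exclude  $path"
  fi
done
```

Anything under node_modules falls through to the default branch and is excluded, which is the behavior we want for the deploy stage.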

Deploy - CodeDeploy

The next step is to decide where our built application is going to be deployed. We can pick between an EC2 instance and an on-premises server. One extra step here is that if we are not using an Amazon EC2 instance, we need to first go into the server and install the AWS Systems Manager Agent. If you are using an Amazon EC2 instance this will be pre-installed. What will not be pre-installed is the CodeDeploy Agent. If we don't get that package installed on the instance, our pipeline will hit a metaphorical brick wall when we try to deploy our application. I did not have much success using the techniques described in the AWS documentation to get this package installed, so I installed it manually by going into the console of that instance and using the following commands.

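# update installed packages and install the agent's prerequisites
sudo yum -y update
sudo yum -y install ruby
sudo yum -y install wget
# download the installer (this bucket is for us-east-1; use your instance's region)
wget https://aws-codedeploy-us-east-1.s3.amazonaws.com/latest/install
chmod +x ./install
sudo ./install auto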
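The download URL in the commands above points at the us-east-1 bucket. Here is a sketch of a helper that builds the URL for another region. The function name codedeploy_install_url is hypothetical, and the URL pattern (bucket aws-codedeploy-REGION on the regional S3 endpoint) is my understanding of the format in the AWS documentation, so please confirm it against the CodeDeploy agent install guide for your region.

```shell
# Hypothetical helper: build the CodeDeploy agent installer URL for a region.
# Assumes the documented pattern: bucket "aws-codedeploy-<region>" on the
# regional S3 endpoint "s3.<region>.amazonaws.com".
codedeploy_install_url() {
  local region="$1"
  printf 'https://aws-codedeploy-%s.s3.%s.amazonaws.com/latest/install\n' "$region" "$region"
}

codedeploy_install_url us-east-1

# After installing, the agent status can be checked on the instance with:
#   sudo service codedeploy-agent status
```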

So what is the process of transferring the build onto my actual EC2 server then? This part of the process is handled by CodeDeploy. CodeDeploy requires that you create another .yml file called appspec.yml, which is where you designate what files to copy over from your new build, where to place them, and a set of commands (hooks) to run during the different deploy stages. This aspect is very similar to the buildspec.yml file, where you also have to specify commands.

📘 AWS Docs: appspec file reference

The deploy stages are required because the application running on the EC2 server must be stopped, the new files installed, and then the application started again. Here is a sample appspec file:

appspec.yml
# This is an appspec.yml template file for use with an EC2/On-Premises deployment in CodeDeploy.
# https://docs.aws.amazon.com/codedeploy/latest/userguide/app-spec-ref.html
version: 0.0

os: linux 

files:
  - source: /
    destination: /app
    overwrite: true

permissions:
  - object: /
    pattern: "**"
    owner: ec2-user
    group: ec2-user

hooks:

  BeforeInstall:
    - location: scripts/before_install.sh
      timeout: 300
      runas: root

  AfterInstall:
    - location: scripts/after_install.sh
      timeout: 1000
      runas: root

  ApplicationStart:
    - location: scripts/start_server.sh
      timeout: 300
      runas: root

Note that for the hooks section, we cannot simply inline the shell commands like we did in the buildspec file. We must create a scripts folder and then put our scripts into files in that location. Here are my final script files that I included in my scripts folder.

before_install.sh
#!/bin/bash

# create the app folder if it does not exist yet, then navigate to it
mkdir -p /app
cd /app

# install node and npm
curl -sL https://rpm.nodesource.com/setup_14.x | sudo -E bash -
yum -y install nodejs npm

One of the critical steps is that we cannot install the application directly into the root of the EC2 instance. It just doesn't work, for reasons not worth getting into. You must create a folder in the EC2 instance root, which in this example is called app, and navigate into that folder at the start of every shell script before you execute your commands, because your location in the directory tree resets between steps. That is the reason this command is repeated. This also makes sense when you understand that each EC2 instance can actually run multiple Node processes (applications) at the same time, available on different ports. So separating them into different directories in the root is good practice anyway.
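The reason the directory resets is that each hook script runs as its own process. Here is a small sketch that imitates this with two separate bash invocations standing in for two hook scripts. It assumes nothing about CodeDeploy itself and only shows the general shell behavior.

```shell
# A cd inside one script does not carry over to the next one,
# because each script runs in its own process.
start_dir="$(pwd)"

# stand-in for a first hook script: changes directory, then exits
bash -c 'cd /tmp && echo "first script is in:  $(pwd)"'

# stand-in for a second hook script: starts from the original directory again
second_dir="$(bash -c 'pwd')"
echo "second script is in: $second_dir"

[ "$second_dir" = "$start_dir" ] && echo "working directory was not carried over"
```

This is why every script in the scripts folder begins by navigating into the app folder.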

after_install.sh
#!/bin/bash

# navigate to app folder
cd /app

# install dependencies
npm install

# install create-react-app and react-scripts
# without react-scripts application cannot be started
npm install --save create-react-app react-scripts

# install pm2 process manager
npm install pm2 -g

Notably in this script we must install react-scripts or else our npm start or yarn start script will not work. And we are also going to install PM2 here, which is a process manager which will allow us to make sure that the application stays up and running. Technically not required, but practically we must have SOME process manager making sure that the application gets started again if it stops for some reason.

start_server.sh
#!/bin/bash

# navigate to app folder (absolute path, same as the other scripts)
cd /app

# initial startup by running react-script "start", name process "marketing"
# --watch watches and restarts if files change
pm2 start ./node_modules/react-scripts/scripts/start.js --name "marketing" --watch

# auto restart server if shut down
pm2 startup

# freeze process list for automatic respawn
pm2 save

# restart all processes - necessary to do this again?
pm2 restart all

This is where we start the application. If you weren't using PM2 this would be a simple npm start script here, but because we are using PM2 it looks like this.

Notably absent from this is a stop_server.sh script file. I have found that you don't actually need one. CodeDeploy is perfectly capable of stopping the node server without that script.
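One failure that is easy to avoid is an appspec hook pointing at a script that is not in the repository. Here is a rough sketch of a check you could run before pushing. The function missing_hook_scripts is hypothetical, and it reads the location: lines with a simple sed pattern instead of a real YAML parser, so it assumes the appspec is laid out like the sample above.

```shell
# Hypothetical pre-push check: print every hook script named in an appspec
# file that does not exist under the given repository root.
# Usage: missing_hook_scripts <appspec file> <repo root>
missing_hook_scripts() {
  local appspec="$1"
  local repo_root="$2"
  # pull the value of every "location:" key, e.g. scripts/before_install.sh
  sed -n 's/^[[:space:]]*-\{0,1\}[[:space:]]*location:[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$appspec" |
  while read -r script; do
    if [ ! -f "$repo_root/$script" ]; then
      echo "$script"
    fi
  done
}

# Small demo in a temporary folder: one script present, one missing.
demo="$(mktemp -d)"
mkdir -p "$demo/scripts"
cat > "$demo/appspec.yml" <<'EOF'
hooks:
  BeforeInstall:
    - location: scripts/before_install.sh
  ApplicationStart:
    - location: scripts/start_server.sh
EOF
touch "$demo/scripts/before_install.sh"
missing_hook_scripts "$demo/appspec.yml" "$demo"   # prints scripts/start_server.sh
rm -rf "$demo"
```

In a real project you would call it from the repository root as missing_hook_scripts appspec.yml . and expect no output.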

Honestly this whole deploy section is actually pretty tricky and it takes a lot of trial and error to get the deploy to work correctly. You will spend a LOT of time tinkering with your shell scripts for the different stages. There is another really good tutorial on this whole process here:

Deploy a ReactJS Application to AWS EC2 Instance using AWS CodePipeline

Then you need to modify your security group for your instance to make the application available, and then the load balancer to manage mapping the URL to correct ports. I cover that whole process in another article here.

📘 Ncoughlin: AWS Load Balancers and Security Groups.
