From 002387bc650e6c722c3998c5e4c5d83d25930650 Mon Sep 17 00:00:00 2001
From: Matt Wang <matt@matthewwang.me>
Date: Wed, 21 Dec 2022 13:16:27 -0800
Subject: [PATCH] ci: remove `jekyll/jekyll` container (#1072)

tl;dr:

- replaces `jekyll/jekyll` with actions-native container
- changes Jekyll version targets to be explicit: `[3.9, 4.3]`
- fixes CI bug from today

I've **temporarily changed the branch protection rules** so we can merge this PR in.

## Motivations

A handful of motivations for me doing this:

- immediately: in writing #1071, I noticed that our CI is broken; this is due to an issue with the `jekyll/jekyll` container (see below)
- generally: I would like to make our CI as close to what our users would be using; I think the vast majority of our users use pages or actions, rather than the container itself
- nit: building `3.8.5` is *really slow* (takes ~ 3 minutes), and now each matrix strategy takes ~ 30 seconds with no cache!

More on each of those points now!

### Immediate Problem: `sass-embedded`

Starting today, CI fails on `main`. #1071 has one example ([CI log](https://github.com/just-the-docs/just-the-docs/actions/runs/3752287338/jobs/6374267086)), and I was able to repro this on a whitespace change with the `README`.

Several notes:

- building locally, with pages, or with Netlify does not fail; for example, the deploy preview for #1071 still works
- the error has to do with native extensions (of which `sass-embedded` is one of them) not being built properly for musl c linux; changing the pinned version of `sass-embedded` to a previous version does *not* resolve this problem
- it's not transparent what has changed: the `jekyll/jekyll` image itself has not been updated in over a month, but, something in the container (which was not pinned) must have changed and forced an error with compilation

Given that *our users* won't encounter this problem, this reads easily as a CI problem to me - we should resolve our pipeline (and not change user code/artifacts).

### Bigger Picture: `jekyll/jekyll`

Disclaimer: I have never really liked that we use `jekyll/jekyll` as part of our CI pipeline. Let me outline why and why I think this change is a good idea.

Broadly, our CI should closely match how our users use just the docs. After fielding support requests, looking at our big players, etc., I think the most common deployment methods are:

1. using native GitHub Pages. Recently, [GitHub has moved this to use GitHub Actions and their containers](https://github.blog/2022-08-10-github-pages-now-uses-actions-by-default)
2. using GitHub Actions explicitly, like what's provided in our template
3. "out-of-the-box" CD from providers like Netlify, Vercel, or Fly

In contrast, I have met few users using the `jekyll/jekyll` or `jekyll/builder` containers; of course, this is still anecdotal.

Thus, I think our CI should match the common denominator in the vast majority of our user base, which is *not* using the community container, but instead a standard linux container + bolting on Ruby tooling. Given that GitHub Pages is likely our biggest userbase, I think we should match it - which means, using Actions-specific containers (what's in this PR).

Furthermore, I think it's more likely that a user who wants a container is more knowledgeable about deployment and can resolve problems not caught by CI by themselves and/or submit an issue to GitHub, so switching off of this is fine; now, our CI will better match users who are less familiar with CD (and are likely to just use Pages out of the box).

I also will point out that `jekyll/jekyll` is **not a first-party container**, even though the namespace makes it seem like it is; it's maintained by [envygeeks](https://github.com/envygeeks/jekyll-docker). While this is very kind of them, it adds an external dependency that I would prefer to avoid.

### Build Time and `3.8.5`

This is short, but `3.8.5` isn't the latest `3` release; it's not even the latest minor patch. Since it's a "stale" container, pulling it takes a really long time (upwards of 3 minutes).

Bumping to the latest minor shouldn't affect downstream users, and has faster CI for us - which means quicker dev turnaround :)
---
 .github/workflows/ci.yml | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index ed0bb9a9..f1f92d34 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -15,14 +15,22 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        jekyll-version: [3.8.5, latest]
+        jekyll-version: [3.9, 4.3]
     steps:
     - uses: actions/checkout@v3
-    - name: Build the site in the jekyll/jekyll container
-      run: |
-        docker run --rm \
-        --volume="${{ github.workspace }}:/srv/jekyll" \
-        jekyll/jekyll:${{ matrix.jekyll-version }} /bin/bash -c "gem install bundler && chmod -R 777 /srv/jekyll && bundle install && bundle exec jekyll build && bundle exec rake search:init"
+    - name: Setup Ruby
+      uses: ruby/setup-ruby@v1
+      with:
+        ruby-version: '3.1' # Not needed with a .ruby-version file
+        bundler-cache: false
+    - name: Bundle Install
+      run: bundle install
+    - name: Install Jekyll ${{ matrix.jekyll-version }}
+      run: gem install jekyll -v ${{ matrix.jekyll-version }}
+    - name: Init Search
+      run: bundle exec rake search:init
+    - name: Build Site
+      run: bundle exec jekyll build
 
   assets:
     name: Test CSS and JS
-- 
GitLab