Skip to content
Snippets Groups Projects
  • Matt Wang's avatar
    002387bc
    ci: remove `jekyll/jekyll` container (#1072) · 002387bc
    Matt Wang authored
    tl;dr: 
    
    - replaces `jekyll/jekyll` with actions-native container
    - changes Jekyll version targets to be explicit: `[3.9, 4.3]`
    - fixes CI bug from today
    
    I've **temporarily changed the branch protection rules** so we can merge this PR in.
    
    ## Motivations
    
    A handful of motivations for me doing this:
    
    - immediately: in writing #1071, I noticed that our CI is broken; this is due to an issue with the `jekyll/jekyll` container (see below)
    - generally: I would like to make our CI as close to what our users would be using; I think the vast majority of our users use pages or actions, rather than the container itself
    - nit: building `3.8.5` is *really slow* (takes ~ 3 minutes), and now each matrix strategy takes ~ 30 seconds with no cache!
    
    More on each of those points now!
    
    ### Immediate Problem: `sass-embedded`
    
    Starting today, CI fails on `main`. #1071 has one example ([CI log](https://github.com/just-the-docs/just-the-docs/actions/runs/3752287338/jobs/6374267086)), and I was able to repro this on a whitespace change with the `README`.
    
    Several notes:
    
    - building locally, with pages, or with Netlify does not fail; for example, the deploy preview for #1071 still works
    - the error has to do with native extensions (of which `sass-embedded` is one of them) not being built properly for musl c linux; changing the pinned version of `sass-embedded` to a previous version does *not* resolve this problem
    - it's not transparent what has changed: the `jekyll/jekyll` image itself has not been updated in over a month, but, something in the container (which was not pinned) must have changed and forced an error with compilation
    
    Given that *our users* won't encounter this problem, this reads easily as a CI problem to me - we should resolve our pipeline (and not change user code/artifacts).
    
    ### Bigger Picture: `jekyll/jekyll`
    
    Disclaimer: I have never really liked that we use `jekyll/jekyll` as part of our CI pipeline. Let me outline why and why I think this change is a good idea.
    
    Broadly, our CI should closely match how our users use just the docs. After fielding support requests, looking at our big players, etc., I think the most common deployment methods are:
    
    1. using native GitHub Pages. Recently, [GitHub has moved this to use GitHub Actions and their containers](https://github.blog/2022-08-10-github-pages-now-uses-actions-by-default)
    2. using GitHub Actions explicitly, like what's provided in our template
    3. "out-of-the-box" CD from providers like Netlify, Vercel, or Fly
    
    In contrast, I have met few users using the `jekyll/jekyll` or `jekyll/builder` containers; of course, this is still anecdotal.
    
    Thus, I think our CI should match the common denominator in the vast majority of our user base, which is *not* using the community container, but instead a standard linux container + bolting on Ruby tooling. Given that GitHub Pages is likely our biggest userbase, I think we should match it - which means, using Actions-specific containers (what's in this PR).
    
    Furthermore, I think it's more likely that a user who wants a container is more knowledgeable about deployment and can resolve problems not caught by CI by themselves and/or submit an issue to GitHub, so switching off of this is fine; now, our CI will better match users who are less familiar with CD (and are likely to just use Pages out of the box).
    
    I also will point out that `jekyll/jekyll` is **not a first-party container**, even though the namespace makes it seem like it is; it's maintained by [envygeeks](https://github.com/envygeeks/jekyll-docker). While this is very kind of them, it adds an external dependency that I would prefer to avoid.
    
    ### Build Time and `3.8.5`
    
    This is short, but `3.8.5` isn't the latest `3` release; it's not even the latest minor patch. Since it's a "stale" container, pulling it takes a really long time (upwards of 3 minutes).
    
    Bumping to the latest minor shouldn't affect downstream users, and has faster CI for us - which means quicker dev turnaround :) 
    ci: remove `jekyll/jekyll` container (#1072)
    Matt Wang authored
    tl;dr: 
    
    - replaces `jekyll/jekyll` with actions-native container
    - changes Jekyll version targets to be explicit: `[3.9, 4.3]`
    - fixes CI bug from today
    
    I've **temporarily changed the branch protection rules** so we can merge this PR in.
    
    ## Motivations
    
    A handful of motivations for me doing this:
    
    - immediately: in writing #1071, I noticed that our CI is broken; this is due to an issue with the `jekyll/jekyll` container (see below)
    - generally: I would like to make our CI as close to what our users would be using; I think the vast majority of our users use pages or actions, rather than the container itself
    - nit: building `3.8.5` is *really slow* (takes ~ 3 minutes), and now each matrix strategy takes ~ 30 seconds with no cache!
    
    More on each of those points now!
    
    ### Immediate Problem: `sass-embedded`
    
    Starting today, CI fails on `main`. #1071 has one example ([CI log](https://github.com/just-the-docs/just-the-docs/actions/runs/3752287338/jobs/6374267086)), and I was able to repro this on a whitespace change with the `README`.
    
    Several notes:
    
    - building locally, with pages, or with Netlify does not fail; for example, the deploy preview for #1071 still works
    - the error has to do with native extensions (of which `sass-embedded` is one of them) not being built properly for musl c linux; changing the pinned version of `sass-embedded` to a previous version does *not* resolve this problem
    - it's not transparent what has changed: the `jekyll/jekyll` image itself has not been updated in over a month, but, something in the container (which was not pinned) must have changed and forced an error with compilation
    
    Given that *our users* won't encounter this problem, this reads easily as a CI problem to me - we should resolve our pipeline (and not change user code/artifacts).
    
    ### Bigger Picture: `jekyll/jekyll`
    
    Disclaimer: I have never really liked that we use `jekyll/jekyll` as part of our CI pipeline. Let me outline why and why I think this change is a good idea.
    
    Broadly, our CI should closely match how our users use just the docs. After fielding support requests, looking at our big players, etc., I think the most common deployment methods are:
    
    1. using native GitHub Pages. Recently, [GitHub has moved this to use GitHub Actions and their containers](https://github.blog/2022-08-10-github-pages-now-uses-actions-by-default)
    2. using GitHub Actions explicitly, like what's provided in our template
    3. "out-of-the-box" CD from providers like Netlify, Vercel, or Fly
    
    In contrast, I have met few users using the `jekyll/jekyll` or `jekyll/builder` containers; of course, this is still anecdotal.
    
    Thus, I think our CI should match the common denominator in the vast majority of our user base, which is *not* using the community container, but instead a standard linux container + bolting on Ruby tooling. Given that GitHub Pages is likely our biggest userbase, I think we should match it - which means, using Actions-specific containers (what's in this PR).
    
    Furthermore, I think it's more likely that a user who wants a container is more knowledgeable about deployment and can resolve problems not caught by CI by themselves and/or submit an issue to GitHub, so switching off of this is fine; now, our CI will better match users who are less familiar with CD (and are likely to just use Pages out of the box).
    
    I also will point out that `jekyll/jekyll` is **not a first-party container**, even though the namespace makes it seem like it is; it's maintained by [envygeeks](https://github.com/envygeeks/jekyll-docker). While this is very kind of them, it adds an external dependency that I would prefer to avoid.
    
    ### Build Time and `3.8.5`
    
    This is short, but `3.8.5` isn't the latest `3` release; it's not even the latest minor patch. Since it's a "stale" container, pulling it takes a really long time (upwards of 3 minutes).
    
    Bumping to the latest minor shouldn't affect downstream users, and has faster CI for us - which means quicker dev turnaround :)