Code splitting for highlight.js in markdown-it

January 4, 2024

Introduction

I recently started building my blog site, letmutex.com. If you search for ‘letmutex’ on Google, you will see a lot of Rust pages. That is where the name comes from: it starts with Rust’s let keyword and ends with a variable named mutex.

If I get lucky, you might also see this site on the first page of Google’s results one day.

Because of the name ‘letmutex’, I intend to use the site for sharing programming posts.

There will be lots of code on the blog pages, and rich text is needed too, so Markdown is my best choice for storing content: it’s simple and has great library and editor support. I can write my posts in Notion or VS Code, upload them to my database, and then just npm install a Markdown library to render them on my website.

Markdown it

After picking the right frontend framework, which is Vue.js this time, I wanted a nice and simple markdown rendering library.

After some searching, I decided to give markdown-it a try. Just as its introduction says, it’s dead simple to use with Vue:

typescript

import markdownit from 'markdown-it'
// A Markdown theme CSS is needed, since markdown-it is just a parsing library.
// I use the good old GitHub theme: https://github.com/sindresorhus/github-markdown-css
import "github-markdown-css/github-markdown-light.css"

const md = markdownit()
const html = md.render(`
# Hi, these are not my words
> Stay Hungry, Stay Foolish.
`);

html

<template>
  <div v-html="html" class="markdown-body"></div>
</template>

It will render this:

Rendered

Seems good. Another thing I need is syntax highlighting for code blocks. By default, markdown-it renders code as plain text. As its README mentions, we can use highlight.js for that.

typescript

import markdownit from 'markdown-it';
import "github-markdown-css/github-markdown-light.css";
import hljs from 'highlight.js';
import 'highlight.js/styles/github.css';

const md = markdownit({
  highlight: function (str: string, lang: string) {
    if (lang && hljs.getLanguage(lang)) {
      try {
        return hljs.highlight(str, { language: lang }).value;
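      } catch (err) {
        // Highlighting failed; fall through to the default rendering below.
      }
    }
    // An empty string tells markdown-it to escape the code itself.
    return '';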
  }
});

const html = md.render(`
\`\`\`typescript
function hi() {
  return "Hi"
}
console.log(\`\${hi()}, Jack.\`)
\`\`\`
`);

After this tweak, we get 99.78% GitHub-like syntax highlighting!

Rendered code block

The problem

Our markdown app is powerful enough to render a GitHub readme in just 50 lines. Nothing can stop us from building our initial app. After running the build command, the first thing we notice is a file size: 1,054.33 kB

vite v5.0.10 building for production...
✓ 285 modules transformed.
dist/index.html                     0.43 kB │ gzip:   0.29 kB
dist/assets/index-AKS7n1Tl.css     19.35 kB │ gzip:   4.37 kB
dist/assets/index-mMqPtPCj.js   1,054.33 kB │ gzip: 364.47 kB

Wait, what? A 1MB single JavaScript file for our hello-world markdown rendering app? That's terrible! Visitors might close our page before that JavaScript file finishes loading.

Vite also complains about it:

(!) Some chunks are larger than 500 kB after minification. 

The good part is that Vite also provides us with solutions:

Consider:
- Using dynamic import() to code-split the application
- Use build.rollupOptions.output.manualChunks to improve chunking: https://rollupjs.org/configuration-options/#output-manualchunks
- Adjust chunk size limit for this warning via build.chunkSizeWarningLimit.

These messages appear to be very helpful. Let's try all of them!

Solutions

Dynamic import

With a dynamic import, the target module is not bundled into the current file. Instead, the bundler emits it as a separate chunk that is only fetched and evaluated when the import() call actually runs. So maybe we can dynamically import hljs inside markdown-it’s highlight function and get something like this:

typescript

const md = markdownit({
  // Too bad, this function won't be accepted by markdown-it!
  highlight: async function (str: string, lang: string) {
    const hljs = await import('highlight.js');
    // Then render our code
  }
});

It looks good, but unfortunately, markdown-it’s highlight option does not accept an async function for now:

Type '(str: string, lang: string) => Promise<any>' is not assignable 
to type '(str: string, lang: string, attrs: string) => string'.

So sad. We have no choice but to give up on dynamic import for now.
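
To see why a synchronous caller cannot simply be handed an async function, here is a minimal sketch. Note that `renderFence` is a hypothetical stand-in for a synchronous renderer, not markdown-it's actual code. It only illustrates that the caller uses the return value immediately, so a Promise ends up where a string was expected.

```typescript
// Hypothetical stand-in for a synchronous renderer: it calls the callback
// and uses whatever comes back right away, without waiting.
type Highlighter = (code: string, lang: string) => unknown;

function renderFence(code: string, lang: string, highlight: Highlighter): string {
  const highlighted = highlight(code, lang);
  // There is no way to wait here, so a Promise is stringified as-is.
  return `<pre><code>${String(highlighted)}</code></pre>`;
}

// A synchronous callback returns a string, which is embedded as expected.
const syncResult = renderFence('let a = 1', 'ts', (code) => code.toUpperCase());

// An async callback returns a Promise, which is not the highlighted code.
const asyncResult = renderFence('let a = 1', 'ts', async (code) => code.toUpperCase());

console.log(syncResult);  // <pre><code>LET A = 1</code></pre>
console.log(asyncResult); // <pre><code>[object Promise]</code></pre>
```

TypeScript catches this mismatch at compile time, which is exactly the error shown above.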

Rollup manualChunks

I don’t know much about Rollup and Vite. I just opened the link in Vite’s warning message, skimmed the Rollup docs, and put these options in my vite.config.ts:

typescript

export default defineConfig({
  ...,
  build: {
    rollupOptions: {
      output: {
        manualChunks: {
          hljs: ['highlight.js']
        }
      }
    }
  }
})

Let’s build our App again and see what we got:

vite v5.0.10 building for production...
✓ 285 modules transformed.
dist/index.html                   0.51 kB │ gzip:   0.31 kB
dist/assets/index-AKS7n1Tl.css   19.35 kB │ gzip:   4.37 kB
dist/assets/index-o5VoHmsJ.js   141.17 kB │ gzip:  63.25 kB
dist/assets/hljs-esXiWaOo.js    912.84 kB │ gzip: 301.56 kB

index.js is much smaller now, but this won’t help the initial page load: index.js still statically imports the hljs chunk, so both files (about 1,054 kB in total, same as before) must be loaded up front, now in separate requests. If anything, this may hurt our loading time.

Disable warnings

This option is really neat: it silences the warning message at build time, even if our index.js is 10 MB.

typescript

export default defineConfig({
  ...,
  build: {
    chunkSizeWarningLimit: 20 * 1024, // kB
  }
})

No more warning messages when building:

vite v5.0.10 building for production...
✓ 285 modules transformed.
dist/index.html                     0.43 kB │ gzip:   0.29 kB
dist/assets/index-AKS7n1Tl.css     19.35 kB │ gzip:   4.37 kB
dist/assets/index-mMqPtPCj.js   1,054.33 kB │ gzip: 364.47 kB
✓ built in 5.65s

The only disadvantage of this approach is that it does not solve the problem.

Source of the problem

The source of the problem is highlight.js: its default entry point bundles support for every language. As a result, the minified hljs chunk is 912.84 kB (301.56 kB gzipped), as we saw in the manualChunks section.

The question is: does it provide a way to import only the languages we need?

Yes, it does:

typescript

import hljs from 'highlight.js/lib/core';
import typescript from 'highlight.js/lib/languages/typescript';
hljs.registerLanguage('typescript', typescript);

As a type guy, I only need TypeScript highlighting in my blog. Let’s update our code and build:

dist/assets/index-HntAe_9o.js   169.75 kB │ gzip: 73.86 kB

Wow, our index.js is only about 170 kB now, and the TypeScript syntax highlighting just works.

Our blog App is tiny and good again…

But wait, I lied. Just because I don’t write JavaScript in this post doesn’t mean I won’t need it later. I still need syntax highlighting for JavaScript, HTML, CSS, and even Java, Kotlin, Rust, Diff, Bash, and C.

Let’s add all these languages and build again:

dist/assets/index--WJ0Y9de.js   201.18 kB │ gzip: 81.78 kB

Not bad. After adding these languages, our index.js only grew by about 30 kB, which is acceptable.

But a handful of extra languages is not enough for real programmers. We still need more language highlighting, even if we may never use it.

And then, the REAL question comes in:

I need them but I don’t need to import all of them when they are not used!!!

That’s why we still need dynamic import, but markdown-it doesn’t seem to give us a chance to use it.

The Hacky dynamic import

We all know that we don’t always need to await a Promise. We only need await when we want the result in place; otherwise, then chains on Promises work just as well.

Can we dynamically import our language support from hljs and highlight our code in the then callback?

markdown-it won’t be happy if we don’t do things synchronously. So we can let markdown-it finish on its own, import the language support from hljs in the background, then highlight the code and update the DOM manually with the highlighted HTML.

  1. We start with a long function that does the dynamic import and the highlighting.

    typescript

    import hljs from 'highlight.js/lib/core'
    
    export async function asyncHighlight(code: string, lang: string): Promise<string | null> {
      let promise: Promise<any | null> | null = null
      // We cannot pass lang straight into the dynamic import statement: the
      // bundler cannot resolve a variable package path at build time. Each
      // import must be spelled out so the bundler generates the language chunks.
      switch (lang) {
        case 'typescript': {
          promise = import('highlight.js/lib/languages/typescript')
          break
        }
        case 'javascript': {
          promise = import('highlight.js/lib/languages/javascript')
          break
        }
        // Other languages
      }
      if (promise == null) {
        // We don't know or don't need this language
        return null
      }
      return await promise
        .then((language) => {
          if (language != null) {
            // The language module is imported
            hljs.registerLanguage(lang, language.default)
            return true
          } else {
            return false
          }
        })
        .then((shouldHighlight) => {
          if (shouldHighlight) {
            // Highlighting is available now
            return hljs.highlight(code, { language: lang }).value
          } else {
            return null
          }
        })
        .catch((e) => {
          // Failed to import or highlight
          return null
        })
    }
    
  2. Then call this function from markdown-it's highlight option.

    typescript

    const md = markdownit({
      html: true,
      linkify: true,
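      highlight: function (str, lang) {
        // markdown-it passes an empty string when the fence has no language
        if (!lang) {
          return ''
        }
        // Fire and forget: the DOM is patched once the chunk has loaded
        asyncHighlight(str, lang)
          .then((rendered) => {
            if (rendered == null || onHighlighted == null) return
            onHighlighted(str, lang, rendered)
          });
        // Return '' so markdown-it renders the escaped code for now
        return ''
      }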
    });
    
  3. Finally, update our code elements.

    typescript

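    function onHighlighted(code: string, lang: string, highlighted: string) {
      // Find the target <code></code> element and replace its innerHTML with
      // the highlighted HTML.
      const trimmedCode = code.trim();
      const elements = document.querySelectorAll<HTMLElement>(`code.language-${lang}`);
      for (const element of elements) {
        // Skip blocks that were already replaced, so duplicates each get a turn
        if (element.dataset.highlighted === 'true') continue;
        // Compare textContent, not innerHTML: innerHTML is HTML-escaped
        // (e.g. "<" becomes "&lt;") and would not match such raw source code.
        if (element.textContent?.trim() === trimmedCode) {
          // Found it!
          element.innerHTML = highlighted;
          element.dataset.highlighted = 'true';
          break;
        }
      }
    }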
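
As the list of languages grows, the switch statement in step 1 gets long. One possible variation, sketched below, is to keep the loaders in a table and remember which languages are already being loaded. The names `createLazyHighlighter`, `HighlightCore`, and `LanguageLoader` are made up for this illustration, and the interface only describes the small subset of the highlight.js core API that the steps above already use. Treat it as a sketch under those assumptions, not as the code from the repo.

```typescript
// Assumed subset of the highlight.js core API (hypothetical interface name).
interface HighlightCore {
  getLanguage(name: string): unknown;
  registerLanguage(name: string, definition: unknown): void;
  highlight(code: string, options: { language: string }): { value: string };
}

// Each loader resolves to a module whose default export is the language definition.
type LanguageLoader = () => Promise<{ default: unknown }>;

function createLazyHighlighter(
  core: HighlightCore,
  loaders: Record<string, LanguageLoader>,
) {
  // One in-flight (or finished) registration per language.
  const pending = new Map<string, Promise<void>>();

  function ensureLanguage(lang: string): Promise<void> | null {
    // Unknown language: there is nothing to load.
    if (!Object.prototype.hasOwnProperty.call(loaders, lang)) return null;
    let task = pending.get(lang);
    if (task === undefined) {
      task = loaders[lang]().then((mod) => {
        if (!core.getLanguage(lang)) core.registerLanguage(lang, mod.default);
      });
      // Forget failed loads so that a later call can retry.
      task.catch(() => pending.delete(lang));
      pending.set(lang, task);
    }
    return task;
  }

  return async function highlightLazily(code: string, lang: string): Promise<string | null> {
    const task = ensureLanguage(lang);
    if (task === null) return null; // Let the caller keep the plain rendering.
    try {
      await task;
      return core.highlight(code, { language: lang }).value;
    } catch {
      return null; // Import or highlighting failed.
    }
  };
}

// Intended wiring (not executed here, since it needs the real packages):
//   import hljs from 'highlight.js/lib/core'
//   const asyncHighlight = createLazyHighlighter(hljs, {
//     typescript: () => import('highlight.js/lib/languages/typescript'),
//     javascript: () => import('highlight.js/lib/languages/javascript'),
//   })
```

Every import() is still written out with a literal path, so the bundler can still generate one chunk per language. The table only replaces the switch and makes sure each loader runs once, even when several code blocks in the same language are rendered at the same time.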
    

As expected, we got these small language chunks after building:

vite v5.0.10 building for production...
✓ 102 modules transformed.
dist/index.html                       0.43 kB │ gzip:  0.29 kB
dist/assets/index-AKS7n1Tl.css       19.35 kB │ gzip:  4.37 kB
dist/assets/diff-aRxUzNe7.js          0.51 kB │ gzip:  0.30 kB
dist/assets/xml-kUQH45Sg.js           1.90 kB │ gzip:  0.77 kB
dist/assets/java-nUofamjH.js          2.58 kB │ gzip:  1.19 kB
dist/assets/rust-zXoPfVCz.js          2.83 kB │ gzip:  1.34 kB
dist/assets/kotlin-OzTi70Qj.js        3.32 kB │ gzip:  1.41 kB
dist/assets/c-SOWtrULx.js             3.77 kB │ gzip:  1.75 kB
dist/assets/javascript-pdMn0Y43.js    6.44 kB │ gzip:  2.57 kB
dist/assets/typescript-9CvkCt7Z.js    7.53 kB │ gzip:  2.98 kB
dist/assets/css-hEDjnBe-.js           9.85 kB │ gzip:  3.35 kB
dist/assets/index-OlbCXtkk.js       164.16 kB │ gzip: 72.06 kB
✓ built in 2.23s

Only the TypeScript chunk was loaded on our blog page! We did it!

Network inspector

What about doing it in React?

No need to change anything, there is no Vue-specific code in these functions.

Sum up

We used dynamic imports to load highlight.js language chunks from markdown-it's highlight() function, but without awaiting the import() inside it, because markdown-it requires a synchronous result. Instead, we used a then() chain to update the DOM after loading the language chunk and highlighting the code.

All source code of this approach is available on GitHub: https://github.com/letmutex/hljs-mdit-code-splitting

There may be better approaches that I don't know of. Please don't hesitate to let me know by opening an issue in the GitHub repo.

That’s it! A long post for a tiny thing. I hope this helps when you combine dynamic import, highlight.js, and markdown-it.

©2024 letmutex.com