Creating excerpts in Astro

Chen Hui Jing - Mar 14 - Dev Community

This blog is running on Hugo. It had previously been running on Jekyll. Both these SSGs ship with the ability to create excerpts from your markdown content in 1 line or thereabouts.

/* Hugo */
{{ .Summary | truncate 130 }}
/* Jekyll */
{{ post.content | markdownify | strip_html | truncatewords: 20 }}

I was mildly surprised Astro does not have a corresponding way to do it. To be fair, there is an integration for it: Post Excerpt component for 🚀 Astro. And I’m all for keeping the core product streamlined.

But I’m also THAT annoying developer that likes to keep the dependency count as low as I can. Which means I’m constantly playing this balance game in my head of building the feature myself versus installing it.

I usually give myself an hour or so, and if I feel it’s going to take me more than an hour, I’ll fold.

In this particular case, I’m like, I just need to access the post content, right? That has to exist already, right?

We-ll, kind of?

Someone else already did it!

Cool kids use ChatGPT for all the things now. But I’m not cool. And not young. So I still Google my shit. Which brought me to Paul Scanlon’s post How to Create Excerpts With Astro.

Paul’s website is pretty. Go visit Paul’s website.

The gist of the self-rolled solution is:

  1. Grab the post.body, which is in Markdown
  2. Parse it into HTML using markdown-it
  3. Extract usable text content from the HTML
  4. Cut off the text to whatever length you’d like
  5. Use excerpt and profit

I tried Paul’s instructions exactly, but it didn’t quite work out for my particular use-case. Because I had articles where the <figure> and <img> tags show up very early. And that somehow got parsed into the excerpt.

My solution deviates at step 3. Because I’m not a Computer Science major. I am not well-versed in the art of regex and parsing. Therefore, I cede the responsibility to a professional: html-to-text. Who am I to doubt more than 1 million downloads a week?

Same but different

If it isn’t broke, don’t fix it. So I used a similar implementation strategy as Paul. The source of the script goes into a utils folder and I import it into the layout file that needs it.

The script itself isn’t rocket science. No, I did not use TypeScript for this. Don’t @ me.

import MarkdownIt from "markdown-it";
import { convert } from "html-to-text";

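const parser = new MarkdownIt();

export const createExcerpt = (body) => {
  // Render the Markdown body to HTML
  const html = parser.render(body);
  const options = {
    wordwrap: null, // no hard line wrapping in the text output
    selectors: [
      { selector: "a", options: { ignoreHref: true } }, // keep link text, drop the URL
      { selector: "img", format: "skip" }, // leave images out
      { selector: "figure", format: "skip" }, // leave figures out
    ],
  };
  // First pass: HTML to plain text
  const text = convert(html, options);
  // Second pass: clean out tags that survived the first pass as text
  const distilled = convert(text, options);
  return distilled;
};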

The fun part is distilled. Why on earth would I run convert() twice? That took me the better part of the hour to figure out.

At first I thought I wasn’t configuring the options correctly, but after reading the documentation and this issue, I realised it was more likely a source issue.

After a couple of rounds of console.log, I realised that the first parse from Markdown to HTML escaped the <figure> and <img> tags to &lt;figure&gt; and &lt;img&gt; and wrapped them in a <p> tag. That is how markdown-it treats raw HTML by default, since its html option is off unless you enable it.

So the first convert() returned all the text content plus these tags. That’s why a second round is needed to clean out these caused-by-sanitisation tags.

Naming things is hard. I just called it distilled. Because you distil booze multiple times.
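
To see why one pass is not enough, here is a toy model of the round trip. The escapeHtml, decodeEntities, stripTags and toText helpers are made up for illustration. They are crude stand-ins that only mimic the escaping and unescaping behaviour described above, not the real markdown-it or html-to-text APIs, and the sample body is invented.

```javascript
// Toy stand-ins, NOT the real markdown-it / html-to-text implementations.
// Mimics a Markdown renderer that escapes raw HTML instead of passing it through.
const escapeHtml = (s) =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

// Mimics an HTML-to-text step: drop real tags, then decode entities.
const stripTags = (s) => s.replace(/<[^>]*>/g, "");
const decodeEntities = (s) =>
  s.replace(/&lt;/g, "<").replace(/&gt;/g, ">").replace(/&amp;/g, "&");
const toText = (html) => decodeEntities(stripTags(html)).trim();

// Invented sample: a post body that opens with raw HTML.
const markdownBody = '<figure><img src="cat.png"></figure>\nHello world';

// "Rendering" escapes the raw HTML and wraps it in a paragraph.
const html = `<p>${escapeHtml(markdownBody)}</p>`;

// Pass 1 removes the real <p> tags, but decoding the entities
// brings the figure markup back as literal text.
const firstPass = toText(html);

// Pass 2 now sees that text as markup and removes it.
const secondPass = toText(firstPass);

console.log(firstPass.includes("<figure>")); // true
console.log(secondPass); // Hello world
```

The real libraries do far more than this, but the shape of the problem is the same: whatever was escaped on the way in gets unescaped on the way out, so it takes a second pass to treat it as markup.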

Actual usage on the [...page].astro file looks something like this:

---
import { createExcerpt } from '../../utils/create-excerpt';
---
<ol class="postlist">
  {((page as any).data || []).map((blogPostEntry: any) => {
    const excerpt = `${createExcerpt(blogPostEntry.body).substring(0, 300)}...`;
    return (
      <li class="postlist-item">
        <a href={`${blogPostEntry.slug}`} class="postlist-link heading--6">{blogPostEntry.data.title}</a>
        <p>{blogPostEntry.data.description ? blogPostEntry.data.description : excerpt}</p>
      </li>
    );
  })}
</ol>
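
One optional refinement: substring(0, 300) can cut a word in half, and the trailing ... gets added even when the post is shorter than 300 characters. Below is a minimal sketch of a word-aware cut. truncateAtWord is a hypothetical helper name, not something from Astro or from either parser, and the defaults are only a guess at sensible values.

```javascript
// Hypothetical helper (not part of the original script): shorten text to at
// most maxLength characters without cutting a word in half.
const truncateAtWord = (text, maxLength = 300, ellipsis = "...") => {
  // Collapse runs of whitespace (including newlines) into single spaces.
  const clean = text.replace(/\s+/g, " ").trim();

  // Short enough already: return it untouched, with no ellipsis.
  if (clean.length <= maxLength) return clean;

  const slice = clean.slice(0, maxLength);
  const lastSpace = slice.lastIndexOf(" ");

  // Keep the hard cut if it already lands on a word boundary or if there is
  // no space to back up to; otherwise back up to the last complete word.
  const endsOnBoundary = clean[maxLength] === " ";
  const cut = endsOnBoundary || lastSpace <= 0 ? slice : slice.slice(0, lastSpace);

  return `${cut.trimEnd()}${ellipsis}`;
};

console.log(truncateAtWord("The quick brown fox jumps over the lazy dog", 18));
// prints "The quick brown..."
```

In the template, this would take the place of the substring call, along the lines of truncateAtWord(createExcerpt(blogPostEntry.body)).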

Wrapping up

Was it worth the effort to roll this feature myself? I do think so. The code wasn’t complicated. And yes, I succumbed to installing 2 parsers. What can I say, I’m not a rational human being. 乁( •_• )ㄏ
