Let's make a DEV.to CLI... together

JavaScript Joel - Oct 15 '18

For hacktoberfest I'm gonna make a CLI for DEV.to... Let's make it together!

devto CLI demo

This is meant to be a follow along type tutorial... so follow along. But if you think you are too good to learn something cool, you can just skip to the end.

If I skip over something too quickly and you want more explanation, ask me in the comments!

Setup

Since I'm the one doing the driving, I get to pick the language. I'll be using MojiScript (of course).


git clone https://github.com/joelnet/mojiscript-starter-app.git devto-cli
cd devto-cli
npm ci

There isn't an API for DEV.to. And what happens to all sites that don't have an API? They get scraped!

# install axios
npm install --save-prod axios

Add the axios dependency to index.mjs

import log from 'mojiscript/console/log'
import run from 'mojiscript/core/run'
import axios from 'mojiscript/net/axios'
import main from './main'

const dependencies = {
  axios,
  log
}

run ({ dependencies, main })

Create src/api.mjs

Create a new file src/api.mjs to contain our scraping API. We are using mojiscript/net/axios, which is a curried version of axios.

import pipe from 'mojiscript/core/pipe'

const getData = response => response.data

export const getUrl = axios => pipe ([
  url => axios.get (url) ({}),
  getData
])

export const getDevToHtml = axios => pipe ([
  () => getUrl (axios) ('https://dev.to')
])
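If pipe is new to you, here's a rough plain-JavaScript sketch of the idea: each function receives the previous function's result, and any promise is resolved before the next step runs. pipeSketch and getShoutedData are made-up names for illustration. This is an approximation, not MojiScript's actual implementation.

```javascript
// Hypothetical stand-in for pipe: run the functions in order,
// waiting for any promise before moving on to the next step.
const pipeSketch = fns => input =>
  fns.reduce ((acc, fn) => acc.then (fn), Promise.resolve (input))

const getData = response => response.data

// Step 1 returns a promise (like an HTTP call would); the rest are plain functions.
const getShoutedData = pipeSketch ([
  text => Promise.resolve ({ data: text }),
  getData,
  data => data.toUpperCase ()
])

getShoutedData ('hello').then (result => console.log (result)) // HELLO
```

The point is that the steps don't care whether the one before them was synchronous or not.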

Import getDevToHtml into main.mjs

import pipe from 'mojiscript/core/pipe'
import { getDevToHtml } from './api'

const main = ({ axios, log }) => pipe ([
  getDevToHtml (axios),
  log
])

export default main

Now run the code:

npm start

If everything is successful, you should see a bunch of HTML flood the console.

JavaScript interop

Now I don't want to slam DEV.to with HTTP calls every time I debug my code, so let's cache that output to a file.

# this will get you the same version in this tutorial
curl -Lo devto.html https://raw.githubusercontent.com/joelnet/devto-cli/master/devto.html

Next I'm gonna create a file interop/fs.mjs, which is where fs.readFile will be. I place this in an interop folder because this is where MojiScript requires JavaScript interop files to be placed. JavaScript is written differently than MojiScript and is sometimes incompatible (unless inside the interop directory).

To make fs.readFile compatible with MojiScript, I need to first promisify it.

promisify (fs.readFile)

Now that it's promisified, I also need to curry it.

export const readFile = curry (2) (promisify (fs.readFile))

I'm also dealing with UTF8, so let's add a helper to make life easier.

export const readUtf8File = file => readFile (file) ('utf8')

And the full interop/fs.mjs:

import fs from 'fs'
import curry from 'mojiscript/function/curry'
import { promisify } from 'util'

export const readFile = curry (2) (promisify (fs.readFile))

export const readUtf8File = file => readFile (file) ('utf8')
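To see what those two wrappers are doing, here's a small sketch with hand-rolled helpers. promisifySketch, curry2, and fakeReadFile are hypothetical names I made up for this example. They only imitate the shape of util.promisify, MojiScript's curry, and fs.readFile.

```javascript
// Turn a Node-style callback function into one that returns a promise.
const promisifySketch = fn => (...args) =>
  new Promise ((resolve, reject) =>
    fn (...args, (err, value) => (err ? reject (err) : resolve (value))))

// Turn a two-argument function into one that takes its arguments one at a time.
const curry2 = fn => a => b => fn (a, b)

// A fake callback-style reader standing in for fs.readFile.
const fakeReadFile = (file, encoding, callback) =>
  callback (null, `${file} read as ${encoding}`)

const readFileSketch = curry2 (promisifySketch (fakeReadFile))
const readUtf8FileSketch = file => readFileSketch (file) ('utf8')

readUtf8FileSketch ('devto.html').then (text => console.log (text))
// devto.html read as utf8
```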

Read the cache

Inside of src/mocks/axios.mock.mjs, I'm going to create mockAxios, which returns the contents of our file when get is called.

import pipe from 'mojiscript/core/pipe'
import { readUtf8File } from '../interop/fs'

const mockAxios = {
  get: () => pipe ([
    () => readUtf8File ('devto.html'),
    data => ({ data })
  ])
}

export default mockAxios

Using the mock is easy. All I have to do is change the dependencies. Nothing in main.mjs needs to change!

// don't forget to add the import!
import mockAxios from './mocks/axios.mock'

const dependencies = {
  axios: mockAxios,
  log
}

Now when we run npm start no HTTP requests are being made. This is good because I am probably gonna run npm start a whole bunch before I complete this thing!
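The swap works because the calling code only depends on the shape of the client, not on which client it was handed. Here's a minimal sketch of that idea. The client names and the canned responses are invented for illustration.

```javascript
// Both clients expose the same curried shape: get (url) (options) -> Promise<{ data }>.
const liveLikeClient = {
  get: url => options => Promise.resolve ({ data: `live:${url}` })
}

const mockClientSketch = {
  get: url => options => Promise.resolve ({ data: '<html>cached feed</html>' })
}

// The code under test only knows about the shape, not which client it got.
const fetchHtml = client => url =>
  client.get (url) ({}).then (response => response.data)

fetchHtml (mockClientSketch) ('https://dev.to').then (html => console.log (html))
// <html>cached feed</html>
```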

Parsing the HTML

I like cheerio for parsing. I'm pretty sure this is what the cool kids are using.

npm install --save-prod cheerio

Create another interop file, interop/cheerio.mjs.

import cheerio from 'cheerio';
import pipe from 'mojiscript/core/pipe';
import map from 'mojiscript/list/map';

export const getElements = selector => pipe ([
  cheerio.load,
  $ => $ (selector),
  $articles => $articles.toArray (),
  map (cheerio)
])

note: When cheerio's toArray is called, the elements lose all those nice cheerio methods. So we have to map cheerio back onto all the elements.

Next add getElements to main.

import { getElements } from './interop/cheerio'

const main = ({ axios, log }) => pipe ([
  getDevToHtml (axios),
  getElements ('.single-article:not(.feed-cta)'),
  log
])

Run npm start again to see the Array of elements.

npm install --save-prod reselect nothis

Create interop/parser.mjs. I'm gonna use reselect to select the attributes I need from the HTML. I'm not really gonna go into detail about this. It's basically just doing a whole bunch of gets from an element. The code is easy to read, but you can also skip it. It's not important.

import reselect from 'reselect'
import nothis from 'nothis'

const { createSelector } = reselect
const isTextNode = nothis(({ nodeType }) => nodeType === 3)

const parseUrl = element => `http://dev.to${element.find('a.index-article-link').attr('href')}`
const parseTitle = element => element.find('h3').contents().filter(isTextNode).text().trim()
const parseUserName = element => element.find('.featured-user-name,h4').text().trim().split('・')[0]
const parseTags = element => element.find('.featured-tags a,.tags a').text().substr(1).split('#')
const parseComments = element => element.find('.comments-count .engagement-count-number').text().trim() || '0'
const parseReactions = element => element.find('.reactions-count .engagement-count-number').text().trim() || '0'

export const parseElement = createSelector(
  parseUrl,
  parseTitle,
  parseUserName,
  parseTags,
  parseComments,
  parseReactions,
  (url, title, username, tags, comments, reactions) => ({
    url,
    title,
    username,
    tags,
    comments,
    reactions
  })
)
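If you haven't used createSelector this way before, the pattern is: run every parser against the same element, then pass all of the results to the last function. Here's a rough sketch of that pattern without reselect (and without its memoization). combineParsers and fakeElement are hypothetical, and the two parsers read plain properties instead of calling cheerio.

```javascript
// Hypothetical helper: call every parser with the same element,
// then hand all the results to the final combiner function.
const combineParsers = (...fns) => {
  const combiner = fns[fns.length - 1]
  const parsers = fns.slice (0, -1)
  return element => combiner (...parsers.map (parse => parse (element)))
}

// A plain object standing in for a cheerio element.
const fakeElement = { href: '/someone/some-post-1234', heading: '  Some Post  ' }

const parseUrl = element => `http://dev.to${element.href}`
const parseTitle = element => element.heading.trim ()

const parseElementSketch = combineParsers (
  parseUrl,
  parseTitle,
  (url, title) => ({ url, title })
)

console.log (parseElementSketch (fakeElement))
// { url: 'http://dev.to/someone/some-post-1234', title: 'Some Post' }
```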

Add parseElement to main.

import map from 'mojiscript/list/map'
import { parseElement } from './interop/parser'

const main = ({ axios, log }) => pipe ([
  getDevToHtml (axios),
  getElements ('.single-article:not(.feed-cta)'),
  map (parseElement),
  log,
])

Now when you run npm start you should see something like this:

[
  { url:
     'http://dev.to/ccleary00/how-to-find-the-best-open-source-nodejs-projects-to-study-for-leveling-up-your-skills-1c28',
    title:
     'How to find the best open source Node.js projects to study for leveling up your skills',
    username: 'Corey Cleary',
    tags: [ 'node', 'javascript', 'hacktoberfest' ],
    comments: '0',
    reactions: '33' } ]

Format the data

Add the import and formatPost, then add formatPost to main and change log to map (log).

import $ from 'mojiscript/string/template'

const formatPost = $`${'title'}
${'url'}\n#${'tags'}
${'username'} ・ 💖  ${'comments'} 💬  ${'reactions'}
`

const main = ({ axios, log }) => pipe ([
  getDevToHtml (axios),
  getElements ('.single-article:not(.feed-cta)'),
  map (parseElement),
  map (formatPost),
  map (log)
])

Run npm start again and you should see a handful of records that look like this:

The Introvert's Guide to Professional Development
http://dev.to/geekgalgroks/the-introverts-guide-to-professional-development-3408
#introvert,tips,development,professional
Jenn ・ 💖  1 💬  50
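Notice the tags line has a single # followed by commas. That's what you get when an Array is dropped into a string. Here's a sketch of how a tagged template like this could work in plain JavaScript, plus one optional way to give every tag its own #. templateSketch and withHashTags are hypothetical names, and this is only an approximation of mojiscript/string/template.

```javascript
// Hypothetical tagged template: each ${'key'} becomes a lookup on the data object.
const templateSketch = (strings, ...keys) => data =>
  strings.reduce (
    (out, str, i) => out + str + (i < keys.length ? String (data[keys[i]]) : ''),
    ''
  )

const post = { title: 'Some Post', tags: [ 'node', 'javascript' ] }

// An Array is converted with its default toString, which joins with commas.
const formatPlain = templateSketch`${'title'}\n#${'tags'}`
console.log (formatPlain (post))
// Some Post
// #node,javascript

// Optional tweak: pre-join the tags so each one gets its own '#'.
const withHashTags = data => ({ ...data, tags: data.tags.join (' #') })
console.log (formatPlain (withHashTags (post)))
// Some Post
// #node #javascript
```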

Finally, this is starting to look like something!

I am also going to add a conditional in index.mjs (where the dependencies are defined) to use the real axios only when NODE_ENV is set to production.

import ifElse from 'mojiscript/logic/ifElse'

const isProd = env => env === 'production'
const getAxios = () => axios
const getMockAxios = () => mockAxios

const dependencies = {
  axios: ifElse (isProd) (getAxios) (getMockAxios) (process.env.NODE_ENV),
  log
}

Run it with and without production to make sure both are working.

# dev mode
npm start

# production mode
NODE_ENV=production npm start
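In case the triple call on ifElse looks odd, here's a plain-JavaScript sketch of a curried ifElse. ifElseSketch and pickClient are made-up names, and strings stand in for the real and mock clients.

```javascript
// Hypothetical curried ifElse: pick a branch based on the predicate's answer.
const ifElseSketch = predicate => onTrue => onFalse => value =>
  predicate (value) ? onTrue (value) : onFalse (value)

const isProd = env => env === 'production'

// Strings stand in for the real and mock clients.
const pickClient = ifElseSketch (isProd) (() => 'real axios') (() => 'mock axios')

console.log (pickClient ('production')) // real axios
console.log (pickClient (undefined))    // mock axios
```

When NODE_ENV is not set, process.env.NODE_ENV is undefined, so the mock branch is the default.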

Viewing the Article

The list is nice and I was planning on stopping the walk through here, but it would be super cool if I could also read the article.

I would like to be able to type something like:

devto read 3408

I notice the URLs have an ID on the end that I can use: http://dev.to/geekgalgroks/the-introverts-guide-to-professional-development-3408 <-- right there.

So I'll modify parser.mjs to include a new parser to get that id.

const parseId = createSelector(
  parseUrl,
  url => url.match(/-(\w+)$/)[1]
)

Then just follow the pattern and add parseId to parseElement.
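Here's the regular expression on its own so you can poke at it. extractId is a hypothetical helper, and returning null when nothing matches is an addition for this sketch (the parser above would throw on a URL without a trailing id).

```javascript
// Capture the run of word characters after the final '-' at the end of the URL.
const extractId = url => {
  const match = url.match (/-(\w+)$/)
  return match ? match[1] : null
}

console.log (extractId ('http://dev.to/geekgalgroks/the-introverts-guide-to-professional-development-3408'))
// 3408
console.log (extractId ('http://dev.to/no_dashes_here'))
// null
```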

Now the CLI is going to have two branches, one that will display the feed, the other that will show the article. So let's break out our feed logic from main.mjs and into src/showFeed.mjs.

import pipe from 'mojiscript/core/pipe'
import map from 'mojiscript/list/map'
import $ from 'mojiscript/string/template'
import { getDevToHtml } from './api'
import { getElements } from './interop/cheerio'
import { parseElement } from './interop/parser'

const formatPost = $`${'title'}
${'url'}\n#${'tags'}
${'username'} ・ 💖  ${'comments'} 💬  ${'reactions'}
`

export const shouldShowFeed = args => args.length < 1

export const showFeed = ({ axios, log }) => pipe ([
  getDevToHtml (axios),
  getElements ('.single-article:not(.feed-cta)'),
  map (parseElement),
  map (formatPost),
  map (log)
])

Next, I'm gonna wrap cond around showFeed. It's possible we will have many more branches (maybe help?) in the CLI, but for right now we just have the 1 path.

This is what main.mjs should look like now.

import pipe from 'mojiscript/core/pipe'
import cond from 'mojiscript/logic/cond'
import { showFeed } from './showFeed'

const main = dependencies => pipe ([
  cond ([
    [ () => true, showFeed (dependencies) ]
  ])
])

export default main

We will need access to node's args, so make these changes to index.mjs. I am doing a slice on them because the first 2 args are junk args and I don't need them.

// add this line
const state = process.argv.slice (2)

// add state to run
run ({ dependencies, state, main })
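To make the "junk args" concrete, here's a sketch of what the argument list might look like. The two paths are invented for illustration, and noArgs is a hypothetical helper.

```javascript
// What process.argv might look like for `devto read 3408`.
const exampleArgv = [ '/usr/local/bin/node', '/home/me/devto-cli/src/index.mjs', 'read', '3408' ]

// Drop the node binary and the script path; keep what the user typed.
const args = exampleArgv.slice (2)
console.log (args) // [ 'read', '3408' ]

// With no extra arguments the slice is empty, which is the "show feed" case.
const noArgs = list => list.length < 1
console.log (noArgs ([ 'node', 'index.mjs' ].slice (2))) // true
```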

Okay we have a lot of work to do before we can actually view the article. So let's add the help. That's something easy.

View the Help

Create src/showHelp.mjs.

import pipe from 'mojiscript/core/pipe'

const helpText = `usage: devto [<command>] [<args>]

  <default>    Show article feed
  read <id>    Read an article
`

export const showHelp = ({ log }) => pipe ([
  () => log (helpText)
])

Now we can simplify main.mjs and add the new case to cond.

import pipe from 'mojiscript/core/pipe'
import cond from 'mojiscript/logic/cond'
import { shouldShowFeed, showFeed } from './showFeed'
import { showHelp } from './showHelp'

const main = dependencies => pipe ([
  cond ([
    [ shouldShowFeed, showFeed (dependencies) ],
    [ () => true, showHelp (dependencies) ]
  ])
])

export default main

Now if we run npm start -- help, we should see our help:

usage: devto [<command>] [<args>]

  <default>    Show article feed
  read <id>    Read an article

And if we run npm start we should still see our feed!
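The order of the pairs in cond matters: the first test that passes wins, and () => true at the bottom acts as the catch-all. Here's a plain-JavaScript sketch of that behavior. condSketch and route are hypothetical names, and strings stand in for the real actions.

```javascript
// Hypothetical plain-JavaScript cond: the first pair whose test passes wins.
const condSketch = pairs => value => {
  for (const [ test, action ] of pairs) {
    if (test (value)) return action (value)
  }
  return undefined
}

const shouldShowFeed = args => args.length < 1

const route = condSketch ([
  [ shouldShowFeed, () => 'feed' ],
  [ () => true, () => 'help' ]
])

console.log (route ([]))         // feed
console.log (route ([ 'help' ])) // help
```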

Article from Cache

Just as I read the main feed from cache, I also want to read the article from cache.

curl -Lo article.html https://raw.githubusercontent.com/joelnet/devto-cli/master/article.html

Modify axios.mock.mjs to read the article too.

import pipe from 'mojiscript/core/pipe'
import ifElse from 'mojiscript/logic/ifElse'
import { readUtf8File } from '../interop/fs'

const feedOrArticle = ifElse (url => url === 'https://dev.to') (() => 'devto.html') (() => 'article.html')

const mockAxios = {
  get: url => pipe ([
    () => feedOrArticle (url),
    readUtf8File,
    data => ({ data })
  ])
}

export default mockAxios

Parsing the Article

Parsing the article HTML is much easier because I'm planning on just formatting the whole article-body block as text. So I just need the title and body.

Create interop/articleParser.mjs.

import reselect from 'reselect'

const { createSelector } = reselect

const parseTitle = $ => $('h1').first().text().trim()
const parseBody = $ => $('#article-body').html()

export const parseArticle = createSelector(
  parseTitle,
  parseBody,
  (title, body) => ({
    title,
    body
  })
)

Read the Article

Because there is no state, the CLI will not know what URL to pull when I issue the read command. Because I am lazy, I'll just query the feed again and pull the URL from the feed.

So I'm gonna hop back into showFeed.mjs and expose that functionality.

I'm just extracting the functions from showFeed and putting them into getArticles. I haven't added any new code here.

export const getArticles = axios => pipe ([
  getDevToHtml (axios),
  getElements ('.single-article:not(.feed-cta)'),
  map (parseElement)
])

export const showFeed = ({ axios, log }) => pipe ([
  getArticles (axios),
  map (formatPost),
  map (log)
])

Show the Article

Now I want to write a function like the one below, but we'll get an error: id is not defined. The id is the argument to the pipe, but it's not accessible here. The input to filter is the Array of articles, not the id.

const getArticle = ({ axios }) => pipe ([
  getArticles (axios),
  filter (article => article.id === id), // 'id' is not defined
  articles => articles[0]
])

But there's a trick. Using the W Combinator I can create a closure, so that id is exposed.

const getArticle = ({ axios }) => W (id => pipe ([
  getArticles (axios),
  filter (article => article.id === id),
  articles => articles[0]
]))

Compare that block with the one above it. Not much is different: just add W (id => at the start and a closing ) at the end. The W Combinator is an awesome tool. More on Function Combinators in a future article :) For now, let's move on.
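For the curious, here's a minimal sketch of what W does: it calls a function with a value, then calls the result with that same value again. This is a plain-JavaScript sketch based on the usual definition of the W Combinator, not MojiScript's own source. The articles list and findArticle are made up for illustration.

```javascript
// W calls f with x, then calls the result with x again: W (f) (x) = f (x) (x).
const W = f => x => f (x) (x)

const articles = [
  { id: '3408', title: 'First' },
  { id: '1i0a', title: 'Second' }
]

// The outer function captures id. The inner function also receives id as its
// input, but ignores it and searches the fixed list instead.
const findArticle = W (id => () => articles.filter (article => article.id === id)[0])

console.log (findArticle ('1i0a')) // { id: '1i0a', title: 'Second' }
console.log (findArticle ('nope')) // undefined
```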

All together src/showArticle.mjs should look like this:

import W from 'mojiscript/combinators/W'
import pipe from 'mojiscript/core/pipe'
import filter from 'mojiscript/list/filter'
import { getArticles } from './showFeed'

export const shouldShowArticle = args => args.length === 2 && args[0] === 'read'

const getArticle = ({ axios }) => W (id => pipe ([
  getArticles (axios),
  filter (article => article.id === id),
  articles => articles[0]
]))

export const showArticle = ({ axios, log }) => pipe ([
  getArticle ({ axios }),
  log
])

Modify main.mjs's cond to include the new functions:

import { shouldShowArticle, showArticle } from './showArticle'

const main = dependencies => pipe ([
  cond ([
    [ shouldShowArticle, args => showArticle (dependencies) (args[1]) ],
    [ shouldShowFeed, showFeed (dependencies) ],
    [ () => true, showHelp (dependencies) ]
  ])
])

Run npm start -- read 1i0a (replace the id with one from your feed) and you should see something like this:

{ id: '1i0a',
  url:
   'http://dev.to/ppshobi/-email-sending-in-django-2-part--1--1i0a',
  title: 'Email Sending in Django 2, Part -1',
  username: 'Shobi',
  tags: [ 'django', 'emails', 'consoleemailbackend' ],
  comments: '0',
  reactions: '13' }

HTML to Text

I found a great npm package that looks like it'll handle this for me.

npm install --save-prod html-to-text

We have already laid out most of our foundation, so making an HTTP request, parsing the HTML, and formatting it into text is as simple as this. Open up showArticle.mjs.

const getArticleTextFromUrl = axios => pipe ([
  ({ url }) => getUrl (axios) (url),
  cheerio.load,
  parseArticle,
  article => `${article.title}\n\n${htmlToText.fromString (article.body)}`
])

I also want to create a view for when the id is not found.

const showArticleNotFound = $`Article ${0} not found.\n`

I'll also create an isArticleFound condition to make the code more readable.

const isArticleFound = article => article != null

I'll modify showArticle, using the same W Combinator technique to create a closure and expose id.

export const showArticle = ({ axios, log }) => W (id => pipe ([
  getArticle ({ axios }),
  ifElse (isArticleFound) (getArticleTextFromUrl (axios)) (() => showArticleNotFound (id)),
  log
]))

All together showArticle.mjs looks like this:

import cheerio from 'cheerio'
import htmlToText from 'html-to-text'
import W from 'mojiscript/combinators/W'
import pipe from 'mojiscript/core/pipe'
import filter from 'mojiscript/list/filter'
import ifElse from 'mojiscript/logic/ifElse'
import $ from 'mojiscript/string/template'
import { getUrl } from './api'
import { parseArticle } from './interop/articleParser'
import { getArticles } from './showFeed'

const isArticleFound = article => article != null
const showArticleNotFound = $`Article ${0} not found.\n`
const getArticleTextFromUrl = axios => pipe ([
  ({ url }) => getUrl (axios) (url),
  cheerio.load,
  parseArticle,
  article => `${article.title}\n\n${htmlToText.fromString (article.body)}`
])

export const shouldShowArticle = args => args.length === 2 && args[0] === 'read'

const getArticle = ({ axios }) => W (id => pipe ([
  getArticles (axios),
  filter (article => article.id === id),
  articles => articles[0]
]))

export const showArticle = ({ axios, log }) => W (id => pipe ([
  getArticle ({ axios }),
  ifElse (isArticleFound) (getArticleTextFromUrl (axios)) (() => showArticleNotFound (id)),
  log
]))

Run npm start -- read 1i0a again and you should see the article!

Minions are cheering

Finishing Touches

I'd like to make the id more clear in the feed.

const formatPost = $`${'id'}・${'title'}
${'url'}\n#${'tags'}
${'username'} ・ 💖  ${'comments'} 💬  ${'reactions'}
`

Add this to package.json. I'm gonna name the command devto.

  "bin": {
    "devto": "./src/index.mjs"
  }

In src/index.mjs, add this mystical sorcery at the top:

#!/bin/sh 
':' //# comment; exec /usr/bin/env NODE_ENV=production node --experimental-modules --no-warnings "$0" "$@"

Run this command to create a global link to that command.

npm link

If everything went well, you should now be able to run the following commands:

# get the feed
devto

# read the article
devto read <id>

So you decided to skip to the end?

You can lead the horse to water... or something.

To catch up with the rest of us follow these steps:

# clone the repo
git clone https://github.com/joelnet/devto-cli
cd devto-cli

# install
npm ci
npm run build
npm link

# run
devto

Warnings about the CLI

Scraping websites is a bad idea. When the website changes, which is guaranteed to happen, your code breaks.

This is meant to just be a fun demo for #hacktoberfest and not a maintainable project. If you find a bug, please submit a pull request to fix it along with the bug report. I'm not maintaining this project.

If this was a real project, some things that would be cool:

  • login, so you can read your feed.
  • more interactions, comments, likes, tags. Maybe post an article?

Happy Hacktoberfest!

For those of you that read through the whole thing, thank you for your time. I know this was long. I hope that it was interesting, I hope you learned something and above all, I hope you had fun.

For those of you that actually followed along step by step and created the CLI yourself: You complete me 💖.

Please tell me in the comments or on Twitter what you learned, what you found interesting, or any other comments or criticisms you may have.

My articles are very Functional JavaScript heavy. If you need more, follow me here or on Twitter @joelnet!

More articles

Ask me dumb questions about functional programming
Let's talk about auto-generated documentation tools for JavaScript

Cheers!
