Unfurl

A metadata scraper with support for oEmbed, Twitter Cards and Open Graph Protocol for Node.js (>=v8.0.0).

Note: Will not work in the Browser

The what

Unfurl (spread out from a furled state) will take a url and some options, fetch the url, extract the metadata we care about and format the result in a sane way. It supports all major metadata providers and expanding it to work for any others should be trivial.

The why

So you know when you link to something on Slack, or Facebook, or Twitter - they typically show a preview of the link. To do so they have crawled the linked website for metadata and enriched the link by providing more context about it. Which usually entails grabbing its title, description and image/player embed.

The how

npm install unfurl.js

`unfurl(url [, opts])`

url - `string`

opts - `object` of:

oembed?: boolean - support retrieving oembed metadata
timeout? number - req/res timeout in ms, it resets on redirect. 0 to disable (OS limit applies)
follow?: number - maximum redirect count. 0 to not follow redirect
compress?: boolean - support gzip/deflate content encoding
size?: number - maximum response body size in bytes. 0 to disable
headers?: Headers | Record<string, string> | Iterable<readonly [string, string]> | Iterable<Iterable<string>> - map of request headers, overrides the defaults

Default headers:

{
  'Accept': 'text/html, application/xhtml+xml',
  'User-Agent': 'facebookexternalhit'
}

import { unfurl } from 'unfurl.js'
const result = unfurl('https://github.com/trending')

result is `<Promise<Metadata>>`

type Metadata = {
  title?: string
  description?: string
  keywords?: string[]
  favicon?: string
  author?: string
  theme_color?: string
  canonical_url?: string
  oEmbed?: OEmbedPhoto | OEmbedVideo | OEmbedLink | OEmbedRich
  twitter_card: {
    card: string
    site?: string
    creator?: string
    creator_id?: string
    title?: string
    description?: string
    players?: {
      url: string
      stream?: string
      height?: number
      width?: number
    }[]
    apps: {
      iphone: {
        id: string
        name: string
        url: string
      }
      ipad: {
        id: string
        name: string
        url: string
      }
      googleplay: {
        id: string
        name: string
        url: string
      }
    }
    images: {
      url: string
      alt: string
    }[]
  }
  open_graph: {
    title: string
    type: string
    images?: {
      url: string
      secure_url?: string
      type: string
      width: number
      height: number
      alt?: string
    }[]
    url?: string
    audio?: {
      url: string
      secure_url?: string
      type: string
    }[]
    description?: string
    determiner?: string
    site_name?: string
    locale: string
    locale_alt: string
    videos: {
      url: string
      stream?: string
      height?: number
      width?: number
      tags?: string[]
    }[]
    article: {
      published_time?: string
      modified_time?: string
      expiration_time?: string
      author?: string
      section?: string
      tags?: string[]
    }
  }
}

type OEmbedBase = {
  type: "photo" | "video" | "link" | "rich"
  version: string
  title?: string
  author_name?: string
  author_url?: string
  provider_name?: string
  provider_url?: string
  cache_age?: number
  thumbnails?: [
    {
      url?: string
      width?: number
      height?: number
    }
  ]
}

type OEmbedPhoto = OEmbedBase & {
  type: "photo"
  url: string
  width: number
  height: number
}

type OEmbedVideo = OEmbedBase & {
  type: "video"
  html: string
  width: number
  height: number
}

type OEmbedLink = OEmbedBase & {
  type: "link"
}

type OEmbedRich = OEmbedBase & {
  type: "rich"
  html: string
  width: number
  height: number
}

The who 💖

(If you use unfurl.js too feel free to add your project)

vapid/vapid - A template-driven content management system
beeman/micro-unfurl - small microservice that unfurls a URL and returns the OpenGraph meta data.
probot/unfurl - a GitHub App built with probot that unfurls links on Issues and Pull Request discussions

Name		Name	Last commit message	Last commit date
Latest commit History 289 Commits
.github		.github
src		src
test		test
.eslintignore		.eslintignore
.eslintrc		.eslintrc
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
Procfile		Procfile
README.md		README.md
example.js		example.js
jest.config.js		jest.config.js
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Unfurl

The what

The why

The how

`unfurl(url [, opts])`

url - `string`

opts - `object` of:

result is `<Promise<Metadata>>`

The who 💖

About

Releases 23

Sponsor this project

Packages

Contributors 22

Languages

License

jacktuck/unfurl

Folders and files

Latest commit

History

Repository files navigation

Unfurl

The what

The why

The how

unfurl(url [, opts])

url - string

opts - object of:

result is <Promise<Metadata>>

The who 💖

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 23

Sponsor this project

Packages 0

Contributors 22

Languages

`unfurl(url [, opts])`

url - `string`

opts - `object` of:

result is `<Promise<Metadata>>`

Packages