{"body":{"version":"https://jsonfeed.org/version/1.1","title":"Daniël Illouz","description":"Writing about things I learned and find interesting","home_page_url":"https://www.danillouz.dev","feed_url":"https://www.danillouz.dev/posts.json","favicon":"https://www.danillouz.dev/favicon-32.png","authors":[{"name":"Daniël Illouz"}],"language":"en-US","items":[{"id":"https://www.danillouz.dev/posts/sqlite-cli/","url":"https://www.danillouz.dev/posts/sqlite-cli/","title":"SQLite CLI","summary":"Learning about the SQLite Command Line Interface.","content_html":"
SQLite provides a Command Line Interface (CLI) program named sqlite3, and it's already installed on most operating systems.
The CLI can be run with or without command line options (flags).
\nWhen a flag is provided, it must be prefixed with - or --. For example, -version and --version do the same thing:
sqlite3 -version\n
\nWhen sqlite3 is run without flags, it will connect to a temporary in-memory database (which will be deleted on exit) in interactive mode:
sqlite3\n\nSQLite version 3.37.0 2021-12-09 01:34:53\nEnter \".help\" for usage hints.\nConnected to a transient in-memory database.\nUse \".open FILENAME\" to reopen on a persistent database.\n\nsqlite>\n
\nWhen in interactive mode, the prompt is sqlite> and it reads text input from the keyboard: SQL statements, and dot commands like .open (where some dot commands also accept flags). But it's also possible to redirect sqlite3 I/O (input/output) to files or other commands.
To see how to use the CLI (and print all available CLI flags):
\nsqlite3 -help\n
\nTo print all available dot commands (in interactive mode):
\nsqlite> .help\n
\nTo see how to use a dot command (in interactive mode), and print available dot command flags, run .help DOT_COMMAND. For example:
sqlite> .help .import\n
\nWhen a filename is provided to the sqlite3 command, it will either create a new database or open an existing database in interactive mode:
sqlite3 mydb\n
\nIn interactive mode, a connection to a new or existing database can always be created via the .open dot command. And to connect to a temporary in-memory database, use :memory: as the database file name.
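\nFor example, to switch the current interactive session to a fresh in-memory database:
\nsqlite> .open :memory:\n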
To destroy any data in an existing database run .open -new FILENAME. For example:
sqlite> .open -new existingdb\n
\nTo open a database in read-only mode use the -readonly flag:
sqlite3 -readonly mydb\n
\nThis also works in interactive mode:
\nsqlite> .open -readonly myotherdb\n
\nTo see all databases in interactive mode:
\nsqlite> .databases\n
\nTo see all tables (including attached databases) in interactive mode:
\nsqlite> .tables\n
\nTo see all indexes in interactive mode:
\nsqlite> .indexes\nsqlite> .indexes tablename\n
\nTo see the complete schema of the database (including attached databases) in interactive mode:
\nsqlite> .schema\nsqlite> .schema tablename\n
\nIn interactive mode the .read dot command can be used to read SQL statements (and dot commands) from a file:
sqlite> .read script.sql\n
\nIf the argument to .read begins with the pipe symbol (|), then instead of opening the argument as a file, it runs the argument as a command, and uses the output of that command as its input. This can be useful to run scripts that generate SQL.
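\nFor example, a quick sketch that executes the SQL printed by a (hypothetical) generate-inserts.sh script:
\nsqlite> .read |./generate-inserts.sh\n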
By default sqlite3 sends all output to \"standard output\", but this can be changed via the .output and .once dot commands in interactive mode.
To output all query results to a file:
\nsqlite> .mode list\nsqlite> .separator ,\nsqlite> .output books_and_authors.txt\nsqlite>\nsqlite> SELECT * FROM books;\nsqlite> SELECT * FROM authors;\nsqlite>\nsqlite> .exit\n
\nTo do the above just once, use the .once dot command instead.
If the argument to .output or .once begins with the pipe symbol (|), then it runs the argument as a command, and the output is sent to that command.
For example:
\nsqlite> .once | open -f\nsqlite> SELECT * FROM books;\n
\nThe readfile() function loads file content as a BLOB in interactive mode. For example:
sqlite> CREATE TABLE images(\nsqlite> name TEXT,\nsqlite> type TEXT,\nsqlite> img BLOB\nsqlite> );\nsqlite>\nsqlite> INSERT INTO images(name,type,img)\nsqlite> VALUES('icon','png',readfile('icon.png'));\n
\nThe writefile() function writes a column value to a file in interactive mode. For example:
sqlite> SELECT writefile('icon.png',img) FROM images WHERE name='icon';\n
\nTo import a CSV file into a table in interactive mode:
\nsqlite> .import -csv file.csv tablename\n
\nAnd to import into a table that's not part of the \"main\" database, the -schema flag can be used. This specifies that the table is part of another \"schema\" (useful for attached databases or to import into a temporary table).
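\nFor example, a sketch that imports a CSV file into a temporary table (file.csv and temptable are placeholder names; temp is the schema name SQLite uses for temporary tables):
\nsqlite> .import -csv -schema temp file.csv temptable\n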
To export results to a CSV file in interactive mode:
\nsqlite> .headers on\nsqlite> .mode csv\nsqlite> .once ~/data.csv\nsqlite>\nsqlite> SELECT * FROM tablename;\nsqlite>\nsqlite> .exit\n
\nDump (converts entire database content into a single UTF-8 text file):
\nsqlite3 mydb .dump | gzip -c > mydb.dump.gz\n
\nRestore:
\nzcat mydb.dump.gz | sqlite3 mydb\n
\nAn .sqliterc resource file can be created in the \"home directory\" to configure dot command settings. For example, to change the output format for all queries:
.mode box\n
\nAfter creating the .sqliterc file, it will be loaded on startup:
sqlite3 mydb\n\n-- Loading resources from /Users/daniel/.sqliterc\nSQLite version 3.37.0 2021-12-09 01:34:53\nEnter \".help\" for usage hints.\n\nsqlite>\n
\nIt's possible to \"bypass\" interactive mode and run SQL statements directly when using the sqlite3
command via the last argument:
sqlite3 mydb \"SELECT * FROM table;\"\n
\nAnd by using CLI flags like -cmd it's possible to shorten certain actions.
sqlite3 -csv -cmd \".import ~/data.csv data\" :memory: \"SELECT * FROM data;\"\n
\nsqlite3 -csv -header mydb \"SELECT * FROM books;\" > ~/books.csv\n
\n","date_published":"2023-07-15T00:00:00.000Z","date_modified":"2023-07-16T00:00:00.000Z","tags":["cli","sqlite"]},{"id":"https://www.danillouz.dev/posts/caddy-ca/","url":"https://www.danillouz.dev/posts/caddy-ca/","title":"Caddy local CA","summary":"Firefox does not recognize Caddy's local Certificate Authority by default.","content_html":"import { Image } from \"astro:assets\"
\nWhen running Caddy locally, it will also generate its own local Certificate Authority (CA). Caddy will use this CA to sign certificates for local HTTPS.
\nThis is pretty cool! But Caddy's local HTTPS does not work in Firefox by default. When running Caddy on localhost, Firefox will show the error code SEC_ERROR_UNKNOWN_ISSUER when visiting https://localhost (other browsers like Safari don't have this issue).
Turns out that Firefox does not recognize Caddy's local CA by default. And you have to manually import Caddy's local root certificate into Firefox.
\n1. Open Firefox and go to about:preferences#privacy.
2. Scroll down to the Security > Certificates section, and click View Certificates.
3. Go to the Authorities tab, and click Import.
4. Select Caddy's local root certificate, which on macOS is located at ~/Library/Application\\ Support/Caddy/pki/authorities/local/root.crt.
5. Check the Trust this CA to identify websites checkbox, and click OK.
Caddy Local Authority should now be listed in the Authorities tab.
\nI recently started using Obsidian and I like it a lot! One thing I was missing though, was a way to quickly save (i.e. \"clip\") a webpage to Obsidian from my browser. So I was happy to find Stephan Ango's Obsidian web clipper, which does just that (thanks Stephan!).
\nStephan's web clipper works pretty well, but I wanted slightly different behavior. And since the web clipper is an open source bookmarklet, it was easy for me to modify.
\nA bookmarklet is a browser bookmark that runs some JavaScript code every time you click it.
\nYou can create a bookmarklet by creating a new bookmark in your browser, but instead of providing a link to a website, you give it a javascript URI. For example:
javascript: alert(\"Go eat ice cream!\")\n
\nSo the bookmarklet above would show a \"Go eat ice cream!\" alert every time you click it (and I highly recommend you install it).
\nMy version of the bookmarklet is based on Stephan Ango's Obsidian web clipper, so it does pretty much the same thing, but with these differences: clippings of entire webpages and clippings of text selections are stored in separate Obsidian folders, named Clippings and Clippings/Quotes respectively.
Drag this link to your bookmarks: Clip to Obsidian.
Visit a webpage:
\na. To clip an entire webpage: click the bookmark.
\nb. To only clip part of a webpage: first select some text (can include images), then click the bookmark.
\nFeel free to remix the code below. And after changing the code, you can turn it into a bookmarklet with Make Bookmarklets.
\n/**\n * Obsidian web clipper (bookmarklet).\n *\n * Based on Stephan Ango's \"Obsidian web clipper\".\n * @see {@link https://stephanango.com/obsidian-web-clipper}\n *\n * Uses jsDelivr to import npm dependencies as ESM modules.\n * @see {@link https://www.jsdelivr.com/?docs=esm}\n *\n * Made into a bookmarklet with \"Make Bookmarklets\".\n * @see {@link https://make-bookmarklets.com/}\n */\nPromise.all([\n // Dependencies.\n import(\"https://cdn.jsdelivr.net/npm/@mozilla/readability/+esm\"),\n import(\"https://cdn.jsdelivr.net/npm/turndown/+esm\"),\n import(\n \"https://cdn.jsdelivr.net/npm/text-fragments-polyfill/dist/fragment-generation-utils.js/+esm\"\n ),\n\n // Config.\n Promise.resolve({\n // Clippings of entire webpages will be stored as separate notes in\n // this Obsidian folder.\n folderName: \"Clippings\",\n\n // Clippings of selections will be stored in this Obsidian folder,\n // where clippings of the same webpage will be appended to the same\n // Obsidian note.\n selectionFolderName: \"Clippings/Quotes\",\n }),\n])\n .then(([readabilityJs, turndownJs, textFragmentsPolyfillJs, config]) => {\n const { Readability } = readabilityJs.default\n const { default: Turndown } = turndownJs\n const { generateFragment } = textFragmentsPolyfillJs\n\n const selection = _getSelection(generateFragment)\n\n /**\n * Readability removes clutter from web pages.\n * It's the same library that's used in Firefox's Reader View.\n * @see {@link https://www.npmjs.com/package/@mozilla/readability}\n */\n const {\n byline: author,\n content,\n excerpt,\n title,\n } = new Readability(window.document.cloneNode(true)).parse()\n\n /**\n * Converts HTML to Markdown.\n * @see {@link https://www.npmjs.com/package/turndown}\n */\n const markdown = new Turndown({\n headingStyle: \"atx\",\n hr: \"---\",\n bulletListMarker: \"-\",\n codeBlockStyle: \"fenced\",\n }).turndown(selection.html || content)\n\n const obsidianContent = _makeObsidianNoteContent({\n author,\n body: markdown,\n excerpt,\n selection,\n title,\n url: window.document.URL,\n })\n\n const obsidianUri = _makeObsidianUri({\n config,\n content: obsidianContent,\n selection,\n title,\n })\n\n window.document.location.href = obsidianUri\n })\n .catch((error) => {\n alert(\n \"Failed to clip to Obsidian\" +\n \"\\n\\n\" +\n error +\n \"\\n\\n\" +\n \"(see the browser developer console for more details)\"\n )\n })\n\nfunction _getSelection(generateFragmentFn) {\n if (typeof window.getSelection === \"undefined\") {\n return {\n hasSelection: false,\n html: \"\",\n textFragment: \"\",\n }\n }\n\n const sel = window.getSelection()\n if (!sel || sel.rangeCount < 1) {\n return {\n hasSelection: false,\n html: \"\",\n textFragment: \"\",\n }\n }\n\n const { status, fragment } = generateFragmentFn(sel)\n const textFragment = _makeTextFragmentDirective(status, fragment)\n const container = window.document.createElement(\"div\")\n for (let i = 0, len = sel.rangeCount; i < len; ++i) {\n container.appendChild(sel.getRangeAt(i).cloneContents())\n }\n const html = container.innerHTML\n return {\n hasSelection: Boolean(html),\n html,\n textFragment,\n }\n}\n\n/**\n * Makes the text fragment directive to highlight a text selection.\n *\n * Only Chromium/Safari browsers support text fragments.\n * @see {@link https://web.dev/text-fragments/}\n *\n * But a browser extension can be installed to polyfill the functionality.\n * @see {@link https://github.com/GoogleChromeLabs/link-to-text-fragment}\n */\nfunction _makeTextFragmentDirective(status, fragment) {\n if (status 
!== 0) {\n /**\n * Non-0 status means error.\n * @see {@link https://github.com/GoogleChromeLabs/link-to-text-fragment/blob/main/fragment-generation-utils.js#L779}\n */\n return \"\"\n }\n\n const prefix = fragment.prefix\n ? `${encodeURIComponent(fragment.prefix)}-,`\n : \"\"\n const suffix = fragment.suffix\n ? `,-${encodeURIComponent(fragment.suffix)}`\n : \"\"\n const start = encodeURIComponent(fragment.textStart)\n const end = fragment.textEnd ? `,${encodeURIComponent(fragment.textEnd)}` : \"\"\n\n return `#:~:text=${prefix}${start}${end}${suffix}`\n}\n\n/**\n * Makes the Obsidian note content.\n *\n * For webpage clippings only (i.e. not webpage selection clippings):\n *\n * Uses YAML front matter to add metadata about the clipping to the note.\n * @see {@link https://help.obsidian.md/Editing+and+formatting/Metadata}\n *\n * Uses a comment to link to a daily note (you can't link to other notes\n * in the front matter).\n * @see {@link https://help.obsidian.md/Editing+and+formatting/Basic+formatting+syntax#Comments}\n */\nfunction _makeObsidianNoteContent({\n author,\n body,\n excerpt,\n selection,\n title,\n url,\n}) {\n // NOTE: I'm stripping the query/hash params because I just need to\n // link back to the clipping source. But it could be that a page is\n // using those params to show specific content, which will be \"lost\"\n // after stripping. So might have to revisit this..\n let cleanUrl = new URL(url)\n cleanUrl.search = \"\"\n cleanUrl.hash = \"\"\n cleanUrl = cleanUrl.toString()\n\n const now = new Date()\n\n if (selection.hasSelection) {\n /**\n * `locales` is set to `undefined` to use the default locale.\n * @see {@link https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date/toLocaleDateString}\n */\n const prettyDate = now.toLocaleDateString(undefined, {\n weekday: \"short\",\n year: \"numeric\",\n month: \"short\",\n day: \"numeric\",\n hour: \"numeric\",\n minute: \"numeric\",\n })\n return `> [!quote] ${prettyDate} • [Source](${cleanUrl}${selection.textFragment})\n\n${body}\n\n---\n\n`\n } else {\n const summary = excerpt !== title ? excerpt : \"\"\n const [yyyy_mm_dd] = now.toISOString().split(\"T\")\n const titleLink = `[${title}](${cleanUrl})`\n return `---\naliases:\nclipping_author: ${author ? author : \"\"}\nclipping_url: ${cleanUrl}\ncreated_at_unix: ${Math.round(Date.now() / 1000)} \nsummary: ${summary}\n---\n\n%%\ndates:: [[${yyyy_mm_dd}]]\nrelated::\n%%\n\n#clipping\n\n> [!info]\n> ${titleLink}${author ? \" by \" + author : \"\"}\n\n${body}\n`\n }\n}\n\n/**\n * Makes the Obsidian URI to create/append to a note.\n * @see {@link https://help.obsidian.md/Advanced+topics/Using+Obsidian+URI#Action+%60new%60}\n */\nfunction _makeObsidianUri({ config, content, selection, title }) {\n const folderName = selection.hasSelection\n ? 
config.selectionFolderName\n : config.folderName\n\n // NOTE: characters \":\", \"/\" and \"\\\" are not allowed in file names.\n const fileName = title\n .replace(/:/g, \"\")\n .replace(/\\//g, \"-\")\n .replace(/\\\\/g, \"-\")\n\n const query = {\n content,\n file: `${folderName}/${fileName}`,\n }\n\n if (selection.hasSelection) {\n // NOTE: \"boolean\" params trigger with any truthy value, like\n // `append=false`.\n query.append = \"true\"\n }\n\n // NOTE: URLSearchParams().toString() encoding leads to unexpected\n // behavior, so use `encodeURIComponent()` instead.\n const queryString = Object.entries(query)\n .map(([k, v]) => `${k}=${encodeURIComponent(v)}`)\n .join(\"&\")\n\n return `obsidian://new?${queryString}`\n}\n
\n","date_published":"2023-06-18T00:00:00.000Z","tags":["bookmarklet","obsidian","web-clipper"]},{"id":"https://www.danillouz.dev/posts/example-com/","url":"https://www.danillouz.dev/posts/example-com/","title":"example.com","summary":"IANA reserved domains.","content_html":"There are a few domains that are reserved by IANA. Reserved means that these domains can't be registered by anyone, and can't be transferred. One of them is example.com.
\nSuch domains are sometimes called special use domain names, and the full list can be found here.
\nThe most notable special TLDs are:
\ntest
example
invalid
localhost
RFC 2606 specifies best practices on how to use the special domains. It recommends the following: use test for testing (e.g. DNS related code and configuration), example for documentation and examples, invalid for constructing domain names that are certain to be invalid, and localhost for the loopback address.
\n\n\nThere's also RFC 6761 with more information, like how DNS servers should handle these domains.
\n
Special domains guarantee deterministic behavior in tests and documentation.
\nLet's say I make up a domain for (local) testing, where I expect certain behavior (e.g. it must resolve, or it must fail). It could happen that at some point the domain becomes available, gets registered, and now my test will behave unexpectedly.
\nThis is for example what happened with the dev TLD! It was sometimes used for testing (locally), but then Google \"bought\" it.
\nMy understanding of DNS was always pretty basic. But since I started working more with hosting infrastructure, I've learned a lot more about it. I think DNS is really cool, but it is complicated. DNS has a lot of moving parts and terminology you need to know about to really get it. So I decided to write a bit about this. Mostly to capture and solidify my learnings, but maybe it can also be useful to others.
\nIn this post I'll cover what problem DNS solves, what DNS is, and how DNS works when you visit a website in your browser. This post gets a bit technical at times, but I try not to assume any prior knowledge, so you can (hopefully) also follow along if you're new to the topic.
\nThe internet is a massive system of interconnected computer networks. And devices connected to this network communicate with each other by sending \"packets of data\". But to make sure that these packets are routed to the correct destination, a protocol must be followed.
\nWhat's a protocol? It's basically a set of rules that need to be followed to achieve \"something\".
\nFor example, to mail a letter, the protocol is that you must:
\nBut if one of these rules is broken (e.g. because phone numbers were used for rule 2 above) the letter will not be delivered.
\nIt's a bit like that on the internet, but the protocol that's used is called the Internet Protocol (IP). And instead of using mail addresses to deliver mail to the correct destination, IP addresses must be used to deliver packets of data to the correct destination[^1].
\n[^1]: The IP protocol is basically the addressing system of the internet, but there's more needed to deliver packets from source to destination. The exact details are out of scope for this post, but there's also a transport protocol needed to define rules how data is sent and received. Ultimately there are multiple protocols needed which are \"layered\" on top of each other, like TCP/IP.
\nIP addresses are unique identifiers. For example, if a device wants to visit this website it must (at the time of this writing) go to the IP address 76.76.21.21[^2].
[^2]: This is an IPv4 address, and IP version 4 has been around since 1983. It works great, but we're running out of unique IPv4 addresses because nowadays even toasters must connect to the internet. This is where IPv6 comes in: IPv6 uses more characters to make sure all toasters are covered! For example, 2606:4700::6810:84e5 is an IPv6 address. But IPv6 is not completely adopted yet, so it's still common to use IPv4.
IP addresses work great for machines and robots because they love numbers. But us humans usually have difficulty remembering them, and we prefer using a more memorable domain name instead.
\nBut on the internet IP addresses must be used, so how can you for example type a domain name in a web browser, and somehow still end up at the correct IP address? Well, this is the main problem that DNS solves: DNS can look up the IP address of a domain name.
\nPractically speaking, DNS is like a phone book[^3] for the internet.
\n[^3]: In case you don't know, a phone book is literally a book of phone numbers. And a long time ago, they were used to find a phone number for a person or business when you only knew their name. (Yes, people would actually call each other!)
\nTechnically speaking, DNS is a distributed naming system that consists of many servers spread across the globe.
\nI like to think about DNS as a very large partitioned database that organizes, stores, and retrieves information about domain names. To do all of this, DNS has the following main components:
\nThe domain name space is a conceptual model that organizes all domain names on the internet, and it can be visualized as a hierarchical structure that looks like a tree.
\nThis hierarchy is reflected in domain names themselves: each part of a domain name separated by a . (dot) is called a label, and every domain name ends with a . (dot), also called the root domain[^4].
[^4]: The root domain is typically not specified. For example, you'd usually type github.com in your browser instead of github.com. (note the trailing dot). But you can absolutely do this! And when you do explicitly provide the root, the domain name is referred to as a Fully Qualified Domain Name (FQDN).
For example, the labels of the domain names:
\nwww.framer.com
github.com
www.danillouz.dev
en.wikipedia.org
Can be visualized in the domain name space like this:
\n\nBy following the tree of the domain name space from top-to-bottom, the labels of a domain name go from most generic (.
) to most specific (e.g. www
). And depending on what \"level\" these labels sit in the tree, they are referred to differently.
When reading a domain name from left-to-right, the labels go from most specific to most generic: the very last label is the top-level domain (TLD)[^5], like com or org, or a country code TLD like uk or nl. The label before the TLD is the second-level domain (2LD), the one before that the third-level domain (3LD), and so on.
[^5]: There are 6 types of TLDs: country code (ccTLD), generic (gTLD), generic restricted (grTLD), infrastructure (ARPA), sponsored (sTLD), and test (tTLD) top-level domains.
\nFor example, for the domain name www.bbc.co.uk:
uk is the TLD (ccTLD).
co is the 2LD.
bbc is the 3LD.
www is the 4LD.
Each label in the domain name space will usually have some information associated with it (e.g. an IP address). This information is stored in text files called resource records (usually called DNS records), and DNS servers that store resource records are called name servers.
\nThere are different kinds of resource records, and I won't cover all of them in this post, but 3 important ones are:
\nName servers are grouped together into DNS zones and each zone has an operator: an organization responsible for managing a specific part of the domain name space.
\nDNS zones usually don't map to domain names or DNS servers exactly, so they can be a bit ambiguous. But they will usually map to level(s) of the domain name space tree (like the root zone, but more on that later). This means that zones (i.e. name servers) only store parts of the information in the domain name space.
\nI like to think about DNS zones as partitions of the entire database. DNS needs to store a lot of information (and make it globally available), so it splits up its database into zones. And to make sure the system as a whole scales and runs reliably, each zone has an operator that's responsible for it.
\nName servers only store part of the domain name space, so how can DNS retrieve information for every name in the domain name space? Well, most name servers just point to other name servers, and it's up to a different kind of DNS server called a resolver (also called a recursor) to follow these \"pointers\" and retrieve resource records.
\nSo far we've covered the main components of DNS, but to understand how it works we first need to explicitly identify the different kind of DNS servers and how they interact with each other.
\nThere are 4 different kind of servers needed to make DNS work:
\n[^6]: There are 13 clusters of hundreds of physical DNS root servers, distributed all over the globe. And you can see them (and their location) on root-servers.org.\n[^7]: But you can change this in the network settings of your operating system and use a different resolver, like Cloudflare's 1.1.1.1 or Google's 8.8.8.8.
\n\n\nI used to be really confused about what authoritative name servers are, and how they differ from other name servers. But an authoritative name server is just the name server that \"knows\" the information being queried by a resolver. So it actually depends on the query type which name server is authoritative. For example, root name servers are authoritative for the root zone, TLD name servers are authoritative for a TLD zone, and when querying the A record for a domain name, the name server that stores the IPv4 address is authoritative.
\n
With that covered, we can finally answer the question below.
\nThe following occurs when a browser uses DNS to look up the IP address of a domain name:
\n\n\nNote that the steps above happen for uncached queries. Since there can be a lot steps needed to look up information for a domain name, resolvers will cache the results of queries. So when a query is made for a domain name that was recently looked up, the resolver can skip (some of) the steps above and return the cached result immediately. Caching can happen at every step above, on the name servers, resolver, on the browser and operating system.
\n
We now know that DNS is basically a very large database that's split up into zones, and that zones are managed by operators. But how do operators work together? How do operators know about changes that occur in the domain name space (like when a new domain name is registered)? And who oversees all of this?
\nICANN (Internet Corporation for Assigned Names and Numbers) and IANA (Internet Assigned Numbers Authority) are 2 organizations that help provide stability and consistency on the internet.
\nICANN helps with administration, oversight and maintenance. But delegates some of this to IANA (which is part of ICANN).
\nFor example, ICANN helps make technical decisions on the internet, coordinates adding new TLDs, and operates 1 of the 13 DNS root name servers. While IANA maintains what protocols are used on the internet, coordinates IP addresses globally, and manages the DNS root zone.
\nBesides the root zone database managed by IANA, there are also organizations that manage a database of all 2LDs with a specific TLD. These organizations are called registry operators[^8]. Strictly speaking, the databases they maintain are called registries, but often the operator itself will also be referred to as the registry.
\n[^8]: Registry operators (or registries) are sometimes also called a Network Information Center (NIC).
\nThis means each TLD has a registry. For example, Verisign is the registry for .com
domain names, and Google Registry is the registry for .dev
domain names. Verisign and Google actually manage multiple TLDs, but there are also registries that manage a single TLD.
So how do these registries know about (new) domain names? Well, some registries allow you to directly register a domain name with them. But most registries will partner with a different organization called a domain name registrar.
\nDomain name registrars are companies that allow you to register domain names by paying them a fee. When you register a domain name, you don't actually buy the domain name. But you will hold the \"right\" to use it for a specific amount of time. You then become the registrant of the domain name and will be considered the \"owner\" of it.
\nRegistries allow registrars to partner with them by entering a Registry-Registrar Agreement. But in order to do so, the registrar must meet the requirements[^9] set by the registry (and ICANN). After the agreement is in place, the registrar may offer their customers to register domain names for the specific TLD(s). And every time a domain name is registered, renewed, transferred, or expires, the registrar will notify[^10] the registry--where for some operations registrars also pay registries (and ICANN) a fee[^11].
\n[^9]: These requirements can differ per registry (and some make them available online). For example, most registries require the registrar to be accredited by ICANN. And sometimes registries even set rules that affect which registrants may register a domain name for their TLD (e.g. only US governments may register a .gov
domain name).\n[^10]: Registrars usually use the Extensible Provisioning Protocol (EPP) to interact with registries.\n[^11]: The registrar must pay fees every time a domain name is registered, renewed or transferred. There's the registry fee (as defined in the Registry-Registrant Agreement). And the $0.18 ICANN fee. But there might also be other fees, like a yearly fee of $4000 when the registry is ICANN accredited.
That's it for now! Everything covered in this post is pretty theoretical, and one way to see DNS in action is to query resource records with dig. But I'll save that for another post.
\nBy the way, these are some resources I used to learn more about DNS:
\n(and let me know if you have any good ones to add!)
\n","date_published":"2023-06-03T00:00:00.000Z","tags":["dns","dns-zones","domain-names","iana","icann","internet","ip","name-servers","registrars","registries","resolvers","resource-records","subdomains","tld"]},{"id":"https://www.danillouz.dev/posts/mastodon-alias/","url":"https://www.danillouz.dev/posts/mastodon-alias/","title":"Aliasing your Mastodon handle","summary":"Using a custom domain to alias your Mastodon handle.","content_html":"import { Image } from \"astro:assets\"
\nI'm not very active on social media, but I recently created a Mastodon account.
\nI'm still learning about the fediverse. So I was reading the docs a bit, and that's when I stumbled upon WebFinger.
\nI'd never heard of it before, but Mastodon uses WebFinger to figure out the location of an account. So it can for example resolve the account danillouz@mastodon.social to the location https://mastodon.social/@danillouz.
This location information is returned by a WebFinger endpoint. And this made me wonder. Could my site, hosted on a custom domain, return this information as well, so that I could use my custom domain as an \"alias\" for my Mastodon handle? Turns out you can! But there are some caveats.
\nWebFinger is a protocol[^1] that allows information about people or entities to be discovered over HTTP. It basically resolves some sort of URI identifier (like an email address, Mastodon account, or phone number) to a location (i.e. a URL), which can be retrieved by making a WebFinger request.
\n[^1]: RFC 7033 describes the WebFinger protocol.
\nA WebFinger request is an HTTP GET request to a resource. The resource is a well-known URI with a query target. And the query target identifies the entity to get the location for, which is specified via the ?resource= query parameter in the request. The endpoint then returns the location information as JSON.
For example, to get WebFinger information for the Mastodon account danillouz@mastodon.social[^2], you need to make the following request:
[^2]: Mastodon uses the acct: URI scheme as described in RFC 7565.
GET /.well-known/webfinger?resource=acct:danillouz@mastodon.social\nHOST: mastodon.social\n
\n200 OK\nContent-Type: application/json\n\n{\n \"subject\": \"acct:danillouz@mastodon.social\",\n \"aliases\": [\n \"https://mastodon.social/@danillouz\",\n \"https://mastodon.social/users/danillouz\"\n ],\n \"links\": [\n {\n \"rel\": \"http://webfinger.net/rel/profile-page\",\n \"type\": \"text/html\",\n \"href\": \"https://mastodon.social/@danillouz\"\n },\n {\n \"rel\": \"self\",\n \"type\": \"application/activity+json\",\n \"href\": \"https://mastodon.social/users/danillouz\"\n },\n {\n \"rel\": \"http://ostatus.org/schema/1.0/subscribe\",\n \"template\": \"https://mastodon.social/authorize_interaction?uri={uri}\"\n }\n ]\n}\n
\nYou can replace the Mastodon domain and username with your own, to get your information instead:
\nhttps://{MASTODON_DOMAIN}/.well-known/webfinger?resource=acct:{MASTODON_USERNAME}\n
\nOn Mastodon, users have accounts on different servers. Like mastodon.social or mas.to. So even though the handles danillouz@mastodon.social and danillouz@mas.to share the same \"local\" username danillouz, they are different accounts.
And from what I understand, Mastodon's internal implementation can't just use the account handle. It requires the location (provided by WebFinger) to convert an account to a user on its server for things like mentions and search to work.
\nThe RFC mentions that WebFinger information is static:
\n<blockquote>\n<p>The information is intended to be static in nature, and, as such, WebFinger is not intended to be used to return dynamic information like the temperature of a CPU or the current toner level in a laser printer.</p>
\n<cite>\n<p>RFC 7033: Introduction</p>\n</cite>\n</blockquote>
\nSo if you can host some static JSON on your custom domain, you can add a WebFinger endpoint.
\nYou can do this by serving your WebFinger information as static JSON, returned whenever a GET request is made to /.well-known/webfinger?resource=acct:{MASTODON_USERNAME} on your custom domain.[^3]
[^3]: The WebFinger RFC mentions that the Content-Type of a WebFinger response should be application/jrd+json. But it looks like using application/json also works.
With that in place, your custom domain can be used to find your Mastodon account.
\nI'm using Astro, so I just added a static file endpoint:
\nimport type { APIRoute } from \"astro\"\n\nconst MASTODON_USERNAME = \"danillouz\"\nconst MASTODON_DOMAIN = \"mastodon.social\"\n\nexport const GET: APIRoute = async function ({ params, request }) {\n return new Response(\n JSON.stringify({\n body: JSON.stringify({\n subject: `acct:${MASTODON_USERNAME}@${MASTODON_DOMAIN}`,\n aliases: [\n `https://${MASTODON_DOMAIN}/@${MASTODON_USERNAME}`,\n `https://${MASTODON_DOMAIN}/users/${MASTODON_USERNAME}`,\n ],\n links: [\n {\n rel: \"http://webfinger.net/rel/profile-page\",\n type: \"text/html\",\n href: `https://${MASTODON_DOMAIN}/@${MASTODON_USERNAME}`,\n },\n {\n rel: \"self\",\n type: \"application/activity+json\",\n href: `https://${MASTODON_DOMAIN}/users/${MASTODON_USERNAME}`,\n },\n {\n rel: \"http://ostatus.org/schema/1.0/subscribe\",\n template: `https://${MASTODON_DOMAIN}/authorize_interaction?uri={uri}`,\n },\n ],\n }),\n })\n )\n}\n
\nNote that the static JSON endpoint I added will only serve the WebFinger information when making the request:
\nGET /.well-known/webfinger.json\nHost: www.danillouz.dev\n
\nBut Mastodon will actually make the following request:
\nGET /.well-known/webfinger?resource=acct:danillouz@mastodon.social\nHost: www.danillouz.dev\n
\nSince I have just one Mastodon account, I chose to just ignore the ?resource= query parameter, and redirect all requests from /.well-known/webfinger to /.well-known/webfinger.json.
I'm using Vercel, which supports redirects. So I can achieve the desired redirect by adding the following rule:
\n{\n \"redirects\": [\n {\n \"source\": \"/.well-known/webfinger\",\n \"destination\": \"/.well-known/webfinger.json\"\n }\n ]\n}\n
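\nAfter deploying, a quick way to check the endpoint is to request the well-known path directly (the resource value doesn't matter much here, since the redirect ignores it):
\ncurl \"https://www.danillouz.dev/.well-known/webfinger?resource=acct:hi@danillouz.dev\"\n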
\nNow that my custom domain has a WebFinger endpoint, I can find my Mastodon account by using my custom domain!
\nFor example, searching for hi@danillouz.dev will now give me a hit.
I'm not sure to be honest.
\nLike mentioned before, Mastodon is a bit different, where an account handle consists of two parts:
\ndanillouz
.mastodon.social
.And the docs mention that you should include the server domain when sharing your handle with other people, because otherwise they won't be able to find you easily:
\n<blockquote>\n<p>Mastodon allows you to skip the second part when addressing people on the same server as you, but you have to keep in mind when sharing your username with other people, you need to include the domain or they won't be able to find you as easily.</p>
\n<cite>\n<p>Mastodon docs: Your username and your domain</p>\n</cite>\n</blockquote>
\nSo in theory, setting up an alias allows you to create a handle that does not change when migrating to a different Mastodon server. And it might make your account easier to find if people know your custom domain.
\nIt's also pretty cool that with the alias you can have a Mastodon handle that includes your custom domain without needing to host your own Mastodon server. But practically speaking, searching for just the local username on different servers also works as far as I can tell.
\nWhen the docs mentioned that you should include the server domain when sharing your handle (because otherwise people won't be able to find you easily) I thought this meant that someone would always have to search for danillouz@mastodon.social on servers other than mastodon.social to find me. But this doesn't appear to be the case. For example, I can search for danillouz on mas.to, and it will find me.
So maybe, aliasing your handle isn't really a good idea?
\nI'm not sure if having an \"extra\" WebFinger endpoint can actually break stuff (can information become stale?). But there are some caveats when using your custom domain as an alias to be aware of.
\nThere might be more, but these are the ones I encountered.
\nI created my account on mastodon.social, and there I can find my account when searching for the alias without problems. But when I tried finding my account using the alias on a different server, I was surprised there was no result.
Turns out that when you're not signed in to a server, the search API will not use WebFinger to resolve the handle!
\nThis is what the search request looks like when I'm signed in:
\nGET /api/v2/search?q=hi@danillouz.dev&resolve=true\nHost: mastodon.social\n
\nAnd this is what the same search request looks like when I'm signed out:
\nGET /api/v2/search?q=hi@danillouz.dev&resolve=false\nHost: mastodon.social\n
\nThe difference is that the query parameter resolve is set to true when signed in, but to false when signed out.
And checking the v2 search API docs, we can see that resolve
controls if a WebFinger lookup should happen or not:
<blockquote>\n<p>Boolean. Attempt WebFinger lookup? Defaults to false.</p>
\n<cite>\n<p>Mastodon docs: Perform a search</p>\n</cite>\n</blockquote>
\nSince I'm redirecting WebFinger requests, I'm returning the same response for all acct: queries. So any[^4] local username can be provided together with my custom domain.
[^4]: Sadly, using emoji doesn't work though.
\nFor example, these all work:
\nhey@danillouz.dev
737@danillouz.dev
lol@danillouz.dev
It was fun to learn a bit more about Mastodon's internals. And while searching around how Mastodon uses WebFinger, I saw others had the same idea to alias their handle. Like Maarten Balliauw and Lindsay Wardell.
\nI think the latter post is pretty cool, where Lindsay is fetching Mastodon posts via RSS to show them on their site. I learned that you can postfix any account or tag with .rss, and Mastodon will give you the RSS feed for it! This reminded me of the Reddit API. So I tried postfixing with .json, and that also works[^5].
For example:
\n[^5]: But as far as I can tell, you won't get the posts in JSON for an account.
\nI especially like the RSS feed functionality, since that allows me to subscribe to accounts and tags from my favourite RSS reader[^6]!
\n[^6]: If you're not familiar with RSS, have a look at aboutfeeds.
\n","date_published":"2023-01-03T00:00:00.000Z","date_modified":"2023-02-03T00:00:00.000Z","tags":["astro","mastodon","vercel","webfinger"]},{"id":"https://www.danillouz.dev/posts/go-handlers/","url":"https://www.danillouz.dev/posts/go-handlers/","title":"Handling Go handlers","summary":"Learning about the HTTP request multiplexer, handlers and middleware in Go.","content_html":"I recently had to hook-up some middleware in a Go service. And while looking into the Go standard library net/http package, I got a bit confused by all the different (but similarly named) types and functions that deal with HTTP handlers.
\nFor example, the http.Handler and http.HandlerFunc types. The http.Handle() and http.HandleFunc() functions. And the http.ServeMux type that also defines Handle() and HandleFunc() methods.
At first I didn't really get the difference. And I didn't understand why middleware in Go is typically a function that accepts and returns an http.Handler. But after some (re)reading and experimentation, it all made sense!
This is what I learned.
\nIn a web server we'd typically have handlers that respond to HTTP requests. And routers that map URL patterns to handlers. But how are these exposed via the standard library?
\nThe net/http package exposes the http.Handler interface:
type Handler interface {\n\tServeHTTP(ResponseWriter, *Request)\n}\n
\nAnd any type that satisfies the http.Handler interface can be used as a handler. Or in other words, any type that implements the ServeHTTP(ResponseWriter, *Request) method can be used to respond to HTTP requests.
As far as I know, the standard library doesn't use the term \"router\". It uses the term HTTP request multiplexer instead. But they are essentially the same thing.
\nThe multiplexer matches the URL path of an incoming request against registered patterns, and calls the handler for the pattern that most closely matches the URL. The standard library exposes http.ServeMux for this purpose.
\nSo if we implement an http.Handler and use it together with an http.ServeMux[^1], we can use Handle() to respond to HTTP requests:
[^1]: http.ServeMux also satisfies the http.Handler interface, as it implements a ServeHTTP(ResponseWriter, *Request) method.
package main\n\nimport (\n \"log\"\n \"net/http\"\n)\n\ntype HomeHandler struct {}\nfunc (h HomeHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {\n w.Write([]byte(\"Home\"))\n}\n\nfunc main() {\n mux := http.NewServeMux()\n handler := HomeHandler{}\n mux.Handle(\"/\", handler)\n if err := http.ListenAndServe(\":8888\", mux); err != nil {\n log.Fatal(err)\n }\n}\n
\nIn the example above we used the Handle() method to respond to requests. But http.ServeMux also has the HandleFunc() method. So what's the difference?
At first glance it looks like both accept a pattern and a handler. But Handle() requires a handler that satisfies the http.Handler interface. While HandleFunc() accepts any function that defines http.ResponseWriter and *http.Request parameters:
Handle(pattern string, handler Handler)
HandleFunc(pattern string, handler func(ResponseWriter, *Request))
So we can achieve the exact same thing as in the example above with the following:
\npackage main\n\nimport (\n \"log\"\n \"net/http\"\n)\n\nfunc main() {\n mux := http.NewServeMux()\n mux.HandleFunc(\"/\", func (w http.ResponseWriter, r *http.Request) {\n w.Write([]byte(\"Home\"))\n })\n if err := http.ListenAndServe(\":8888\", mux); err != nil {\n log.Fatal(err)\n }\n}\n
\nWe saw in the above examples that http.ServeMux
exposes the Handle()
and HandleFunc()
methods. But it turns out that instead of first creating a multiplexer with http.NewServeMux()
, it's also possible to just use http.Handle() or http.HandleFunc().
For example:
\npackage main\n\nimport (\n \"log\"\n \"net/http\"\n)\n\nfunc main() {\n http.HandleFunc(\"/\", func (w http.ResponseWriter, r *http.Request) {\n w.Write([]byte(\"Home\"))\n })\n if err := http.ListenAndServe(\":8888\", nil); err != nil {\n log.Fatal(err)\n }\n}\n
\nUsing these functions will actually make use of a \"default\" http.ServeMux under the hood. This default multiplexer is defined by the standard library, and named DefaultServeMux[^2].
[^2]: DefaultServeMux is just a ServeMux.
Turns out that a very useful type to know about when working with handlers is http.HandlerFunc.
\nThis type allows us to convert a \"plain\" handler function (i.e. func(ResponseWriter, *Request)) into a \"real\" http.Handler. Which is great, because this makes it more convenient to work with handlers.
So the following won't compile:
\nhandler := func (w http.ResponseWriter, r *http.Request) {\n w.Write([]byte(\"Home\"))\n}\nhttp.Handle(\"/\", handler) // ❌ Does not compile\n
\nBut this will compile:
\nhandler := func (w http.ResponseWriter, r *http.Request) {\n w.Write([]byte(\"Home\"))\n}\nhttp.Handle(\"/\", http.HandlerFunc(handler)) // ✅ Compiles\n
\nNote that http.HandlerFunc(handler) does not invoke http.HandlerFunc (it's a type, not a function!). Rather, it's doing a type conversion[^3], which converts handler with type func(ResponseWriter, *Request) into type http.HandlerFunc.
[^3]: A type conversion is not the same thing as a type assertion.
\nMiddleware are typically small functions which take a request, do something with it, and then pass it to another middleware or the (final) handler.
\nIn Go, middleware will sit \"between\" the multiplexer and the handler responding to the HTTP requests.
\nA few examples of typical middleware use cases are:
\nGenerally speaking, in Go, functions that accept and return an http.Handler
are considered middleware:
func(next http.Handler) http.Handler\n
\nFor example:
\nfunc someMiddleware(next http.Handler) http.Handler {\n return http.HandlerFunc(func (w http.ResponseWriter, r *http.Request) {\n // Do something with `r`.\n\n next.ServeHTTP(w, r)\n })\n}\n
\nWhy does middleware accept and return an http.Handler? This allows us to create a \"chain\" of handlers:
http.Handle(\"/\", middlewareA(middlewareB(middlewareC(handler))))\n
\nBut this can get a bit unreadable. And that's why third-party libraries typically offer a Use() function.
For example, this is how you'd use it with chi:
\nr := chi.NewRouter()\nr.Use(middlewareA, middlewareB, middlewareC)\nr.Get(\"/\", handler)\n
\nTo wrap up, I want to highlight some (sometimes unexpected) behavior I learned about while reading the docs and playing with http.ServeMux.
When registering a handler for a pattern with http.ServeMux, the pattern can either name fixed paths, or subtree paths.
Fixed paths do not have a trailing slash (e.g. /blog or /blog/create). And they are only matched when the URL exactly matches the pattern.
Subtree paths do have a trailing slash (e.g. /, /blog/ or /blog/create/). And they match all paths not matched by other registered patterns. So subtree paths kind of work like \"catch all\" patterns:
mux.HandleFunc(\"/\", homeHandler) // Subtree path\n
\nRequest path | \nCalls homeHandler | \n
---|---|
/ | \n✅ Yes | \n
/blog | \n✅ Yes | \n
/blog/ | \n✅ Yes | \n
/blog/create | \n✅ Yes | \n
/notfound | \n✅ Yes | \n
Note that subtree path patterns will match when not matched by other registered (fixed path) patterns:
\nmux.HandleFunc(\"/\", homeHandler) // Subtree path\nmux.HandleFunc(\"/blog\", blogHandler) // Fixed path\n
\nRequest path | \nCalls homeHandler | \nCalls blogHandler | \n
---|---|---|
/ | \n✅ Yes | \n❌ No | \n
/blog | \n❌ No | \n✅ Yes | \n
/blog/ | \n✅ Yes | \n❌ No | \n
/blog/create | \n✅ Yes | \n❌ No | \n
/notfound | \n✅ Yes | \n❌ No | \n
So, for example, to let handlers match /blog/* URL patterns, a subtree path must be used instead of a fixed path:
mux.HandleFunc(\"/\", homeHandler) // Subtree path\nmux.HandleFunc(\"/blog/\", blogHandler) // Subtree path\n
\nRequest path | \nCalls homeHandler | \nCalls blogHandler | \n
---|---|---|
/ | \n✅ Yes | \n❌ No | \n
/blog | \n❌ No | \n✅ Yes | \n
/blog/ | \n❌ No | \n✅ Yes | \n
/blog/create | \n❌ No | \n✅ Yes | \n
/notfound | \n✅ Yes | \n❌ No | \n
Also note that longer registered path patterns take precedence over shorter ones:
\nmux.HandleFunc(\"/blog/\", blogHandler) // Subtree path\nmux.HandleFunc(\"/blog/create/\", blogCreateHandler) // Subtree path\n
\nRequest path | \nCalls blogHandler | \nCalls blogCreateHandler | \n
---|---|---|
/ | \n❌ No | \n❌ No | \n
/blog | \n✅ Yes | \n❌ No | \n
/blog/ | \n✅ Yes | \n❌ No | \n
/blog/1 | \n✅ Yes | \n❌ No | \n
/blog/create | \n❌ No | \n✅ Yes | \n
/blog/create/1 | \n❌ No | \n✅ Yes | \n
/notfound | \n❌ No | \n❌ No | \n
If a subtree path pattern has been registered with http.ServeMux, and it receives a request path without a trailing slash, it will redirect the request to the \"subtree root\" (i.e. redirect to the request path with the trailing slash).
To prevent this from happening you need to register the pattern for the path without the trailing slash.
\nFor example, when registering /blog/, requests to /blog will redirect to /blog/, unless /blog is also registered.
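\nA quick way to see this behavior (assuming a server like the earlier examples is running on port 8888, with only /blog/ registered) is to request the path without the trailing slash; the response will look roughly like this:
\ncurl -i http://localhost:8888/blog\n\nHTTP/1.1 301 Moved Permanently\nLocation: /blog/\n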
http.ServeMux will \"sanitize\" the URL request path and the Host header.
It will strip the port number and redirect any request containing . or .. elements, or repeated slashes, to a similar but cleaner URL.
http.ServeMux only supports basic prefix matching. So it does not have support for more advanced routing features, like matching on the HTTP request method or path variables.
For such features, you either need to implement that yourself (e.g. check the request method in a handler). Or use a third-party library like chi or gin.
\nI mostly wrote this as a reference for my future self. But perhaps it can be useful to others as well!
\n","date_published":"2022-12-22T00:00:00.000Z","date_modified":"2023-01-29T00:00:00.000Z","tags":["golang","http","middleware","multiplexer","request-handler","router"]},{"id":"https://www.danillouz.dev/posts/audio-transcoding-lambda/","url":"https://www.danillouz.dev/posts/audio-transcoding-lambda/","title":"Audio transcoding with AWS Lambda","summary":"Transcoding short audio files with Amazon Elastic Transcoder or FFmpeg.","content_html":"import { Image } from \"astro:assets\"
\nFor a side project I'm converting WebM audio files to MP3. I initially started doing this with Amazon Elastic Transcoder. But after doing the same with FFmpeg and Lambda Layers, my initial testing showed that the latter is around 10 times cheaper and 2 times faster for short audio recordings (~3 minute / ~3 MB files).
\nIf you just want to read the code, have a look at github.com/upstandfm/audio-transcoder.
\nMy side project is a web app that allows users to record their voice so others can listen to it. In the app I use the MediaStream Recording API (aka Media Recording API) to easily record audio from the user's input device. It works really well, and you don't have to use any external libraries!
\nThere's one catch though. At the time of this writing it only works in Firefox, Chrome and Opera. And it \"sort of\" works in Safari[^1]. Even though that's a bit disappointing, I'm okay with that for my use case.
\n[^1]: In Safari the Media Recording API is hidden behind a feature flag. And not all events are supported.
\nSo after I had built something functional that allowed me to record my voice, it turned out that the audio file I ended up with had to be transcoded if I wanted to listen to it across a wide range of browsers and devices.
\nBefore I can answer that, we need to explore what an audio file is.
\nWe can think of an audio file like a stream of data elements wrapped in a container. This container is formally called a media container format. And it's basically a file format (think file type) that can store different types of data elements (i.e. bits).
\nThe container describes how this data \"coexists\" in a file. Some container formats only support audio, like WAVE (usually referred to as WAV). And others support both audio and video, like WebM.
\nSo a container \"wraps\" data to store it in a file, but information can be stored in different ways. And we'll also want to compress the data to optimize for storage and/or bandwidth by encoding it (i.e. converting it from one \"form\" to another).
\nThis is where a codec (coder/decoder) comes into play. It handles all the processing that's required to encode (compress) and decode (decompress) the audio data.
\nTherefore, in order to define the format of an audio file (or a video file) we need both a container and a codec. For example, when the MPEG-1 Audio Layer 3 codec is used to store only audio data in an MPEG-1 container[^2], we get an MP3 file (even though it's technically still an MPEG format file).
\n[^2]: A container is not always required. WebRTC does not use a container at all. Instead, it streams the encoded audio and video tracks directly from one peer to another, using MediaStreamTrack objects to represent each track.
So what does transcoding mean? It's the process of converting one encoding into another. And if we convert one container format into another, this process is called transmuxing.
\nThere are a lot of codecs available. And each codec will have a different effect on the quality, size and/or compatibility of the audio file[^3].
\n[^3]: If you'd like to learn more about audio codecs, I recommend reading the Mozilla web audio codec guide.
\nYou might be wondering (like I was), if we can record audio directly in the browser and immediately use the result in our app, why do we even have to transcode it?
\nThe answer is: to optimize for compatibility. Because the Media Recording API can not record audio in all media formats.
\nFor example, MP3 has good compatibility across browsers and devices for playback, but is not supported by the Media Recording API. Which formats are supported depends on the browser's specific implementation of said API.
\nWe can use the isTypeSupported method to figure out if we can record in a specific media type by calling it with a MIME type. Run the following code in the web console (e.g. in Firefox) to see it in action:
\nMediaRecorder.isTypeSupported(\"audio/mpeg\") // false\n
\nOkay, MP3 isn't supported. Which format can we use to record in then? It looks like WebM is a good choice:
\nMediaRecorder.isTypeSupported(\"audio/webm\") // true\n
\nAlso note that you can specify the codec in addition to the container:
\nMediaRecorder.isTypeSupported(\"audio/webm;codecs=opus\") // true\n
\nSo if we want to end up with MP3 files of the recordings, we need to transcode (and technically also transmux) the WebM audio recordings.
\nWe'll explore two implementations that both convert a WebM audio file to MP3:
\n\nFor both implementations we'll use the Serverless Framework and Node.js to write the code for the Lambda function that converts an audio file.
\nBefore we get started, make sure you have Node.js installed. And then use npm to install the Serverless Framework globally:
\nnpm i -g serverless\n
\nAdditionally, we'll need two S3 buckets to process and store the converted audio files: an input bucket for the uploaded recordings, and an output bucket for the converted MP3 files.
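\nFor example, a minimal sketch with the AWS CLI, using the bucket names from the IAM policy later in this post (bucket names are globally unique, so you may need to pick different ones):
\naws s3 mb s3://raw.recordings --region eu-west-1\naws s3 mb s3://transcoded.recordings --region eu-west-1\n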
\nAmazon Elastic Transcoder is a fully managed and highly scalable AWS service that can be used to transcode audio and video files.
\nWe can use this service to schedule a transcoding job in a pipeline. The pipeline knows from which bucket to read a file that needs to be converted, and to which bucket the converted file should be written. Whereas the job contains instructions on which file to transcode, and to what format it should be converted.
\nWe'll create a Lambda function that will \"listen\" to the S3 input bucket. And whenever a new object is created in that bucket, Lambda will schedule a transcoder job to create the MP3 file.
\nSo the flow will be like this:
\n\n\nAt the time of this writing AWS CloudFormation has no support for Amazon Elastic Transcoder. So you'll have to use the AWS web console to create and configure your pipeline(s).
\n
We'll go through the following steps to get it up and running:
\nNavigate to the Elastic Transcoder service in the AWS web console. Select a region (we'll use eu-west-1
), and click on \"Create New Pipeline\".
Create the pipeline and take note of the ARN and Pipeline ID. We'll need both to configure the Lambda function later on.
\n\nThe pipeline we created in the previous step requires a preset to work. Presets contain settings we want to be applied during the transcoding process. And lucky for us, AWS already has system presets to convert to MP3 files.
\nIn the web console, click on \"Presets\" and filter on the keyword \"MP3\". Select one and take note of its ARN and Preset ID. We'll also need these to configure the Lambda function.
\n\nAWS will already have created am IAM Role named Elastic_Transcoder_Default_Role
. But in order for the pipeline to read objects from the input bucket and write objects to the output bucket, we need to make sure the role has the required permissions to do so.
Create a new IAM Policy with the following configuration:
\n{\n \"Version\": \"2012-10-17\",\n \"Statement\": [\n {\n \"Effect\": \"Allow\",\n \"Action\": \"s3:GetObject\",\n \"Resource\": \"arn:aws:s3:::raw.recordings/*\"\n },\n {\n \"Effect\": \"Allow\",\n \"Action\": \"s3:PutObject\",\n \"Resource\": \"arn:aws:s3:::transcoded.recordings/*\"\n },\n {\n \"Effect\": \"Allow\",\n \"Action\": \"s3:ListBucket\",\n \"Resource\": \"arn:aws:s3:::transcoded.recordings\"\n }\n ]\n}\n
\nMake sure the resource ARNs of your input and output buckets are named correctly. And after the Policy has been created, attach it to Elastic_Transcoder_Default_Role
.
Create a new project named \"audio-transcoder\". Move into this directory and create a Serverless manifest in the project root:
\nservice: audio-transcoder\n\nprovider:\n name: aws\n runtime: nodejs10.x\n\npackage:\n exclude:\n - ./*\n - ./**/*.test.js\n include:\n - node_modules\n - src\n
\nAdd the Elastic Transcoder Pipeline ID, MP3 Preset ID and region (from step 1 and step 2) as environment variables:
\nservice: audio-transcoder\n\nprovider:\n name: aws\n runtime: nodejs10.x\n environment:\n TRANSCODE_AUDIO_PIPELINE_ID: \"1572538082044-xmgzaa\"\n TRANSCODER_MP3_PRESET_ID: \"1351620000001-300040\"\n ELASTIC_TRANSCODER_REGION: \"eu-west-1\"\n\npackage:\n exclude:\n - ./*\n - ./**/*.test.js\n include:\n - node_modules\n - src\n
\nUse the Elastic Transcoder Pipeline ARN and MP3 Preset ARN (from step 1 and step 2) to configure the Lambda with the required IAM permissions, so it can create transcoder jobs:
\nservice: audio-transcoder\n\nprovider:\n name: aws\n runtime: nodejs10.x\n environment:\n TRANSCODE_AUDIO_PIPELINE_ID: \"1572538082044-xmgzaa\"\n TRANSCODER_MP3_PRESET_ID: \"1351620000001-300040\"\n ELASTIC_TRANSCODER_REGION: \"eu-west-1\"\n iamRoleStatements:\n - Effect: Allow\n Action:\n - elastictranscoder:CreateJob\n Resource:\n - YOUR_PIPELINE_ARN # Replace this with the ARN from step 1\n - YOUR_PRESET_ARN # Replace this with the ARN from step 2\n\npackage:\n exclude:\n - ./*\n - ./**/*.test.js\n include:\n - node_modules\n - src\n
\nAnd finally, add the Lambda function definition. This Lambda will be executed whenever an object is created in the input bucket:
\nservice: audio-transcoder\n\nprovider:\n name: aws\n runtime: nodejs10.x\n environment:\n TRANSCODE_AUDIO_PIPELINE_ID: \"1572538082044-xmgzaa\"\n TRANSCODER_MP3_PRESET_ID: \"1351620000001-300040\"\n ELASTIC_TRANSCODER_REGION: \"eu-west-1\"\n iamRoleStatements:\n - Effect: Allow\n Action:\n - elastictranscoder:CreateJob\n Resource:\n - YOUR_PIPELINE_ARN # Replace this with the ARN from step 1\n - YOUR_PRESET_ARN # Replace this with the ARN from step 2\n\npackage:\n exclude:\n - ./*\n - ./**/*.test.js\n include:\n - node_modules\n - src\n\nfunctions:\n transcodeToMp3:\n handler: src/handler.transcodeToMp3\n description: Transcode an audio file to MP3\n events:\n - s3:\n bucket: \"raw.recordings\"\n event: \"s3:ObjectCreated:*\"\n existing: true\n
\nIn order to match the Lambda function definition in the Serverless manifest, create a file named handler.js
in src
. And export a method named transcodeToMp3
:
\"use strict\"\n\nmodule.exports.transcodeToMp3 = async () => {\n try {\n // Implementation goes here.\n } catch (err) {\n console.log(\"Transcoder Error: \", err)\n }\n}\n
\nIn the previous step we configured the Lambda to be executed whenever an object is created in the input bucket. This means that AWS will call the Lambda with an event
message that contains a list of Records
. And each Record
will contain an s3
object with information about the s3:ObjectCreated
event:
// \"event\" object:\n{\n \"Records\":[\n // \"Record\" object:\n {\n \"s3\":{\n // Contains information about the \"s3:ObjectCreated\" event.\n }\n }\n ]\n}\n
\nThe s3
object will contain a property called key
, which is the \"name\" of the file that was created in the input bucket. For example, if we upload a file named test.webm
to the S3 bucket, the value of key
will be the (URL encoded!) string test.webm
.
You can see the entire event message structure in the AWS S3 docs.
\nAlso be aware that you can get more than one Record
. So always process all of them:
\"use strict\"\n\nmodule.exports.transcodeToMp3 = async (event) => {\n try {\n for (const Record of event.Records) {\n const { s3 } = Record\n if (!s3) {\n continue\n }\n\n const { object: s3Object = {} } = s3\n const { key } = s3Object\n if (!key) {\n continue\n }\n\n const decodedKey = decodeURIComponent(key)\n // TODO: use \"decodedKey\" to schedule transcoder job.\n }\n } catch (err) {\n console.log(\"Transcoder Error: \", err)\n }\n}\n
\nFinally, initialize the transcoder client. And schedule a transcoder job for every created object in the input bucket:
\n\"use strict\"\n\nconst ElasticTranscoder = require(\"aws-sdk/clients/elastictranscoder\")\n\nconst {\n ELASTIC_TRANSCODER_REGION,\n TRANSCODE_AUDIO_PIPELINE_ID,\n TRANSCODER_MP3_PRESET_ID,\n} = process.env\n\nconst transcoderClient = new ElasticTranscoder({\n region: ELASTIC_TRANSCODER_REGION,\n})\n\nmodule.exports.transcodeToMp3 = async (event) => {\n try {\n for (const Record of event.Records) {\n const { s3 } = Record\n if (!s3) {\n continue\n }\n\n const { object: s3Object = {} } = s3\n const { key } = s3Object\n if (!key) {\n continue\n }\n\n const decodedKey = decodeURIComponent(key)\n await transcoderClient\n .createJob({\n PipelineId: TRANSCODE_AUDIO_PIPELINE_ID,\n Input: {\n Key: decodedKey,\n },\n Outputs: [\n {\n Key: decodedKey.replace(\"webm\", \"mp3\"),\n PresetId: TRANSCODER_MP3_PRESET_ID,\n },\n ],\n })\n .promise()\n }\n } catch (err) {\n console.log(\"Transcoder Error: \", err)\n }\n}\n
\nYou can read more about the createJob
API in the AWS JavaScript SDK docs.
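\nBefore deploying, you could smoke test the handler locally with a hand-rolled event. This is an ad-hoc sketch (the file name is made up; it assumes the aws-sdk package is installed locally, your AWS credentials are configured and the environment variables from the manifest are exported, and it will schedule a real job):
\n// smoke-test.js (run with \"node smoke-test.js\" from the project root)\nconst { transcodeToMp3 } = require(\"./src/handler\")\n\ntranscodeToMp3({\n  Records: [{ s3: { object: { key: \"test.webm\" } } }],\n}).then(() => console.log(\"done\"))\n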
In order to upload the Lambda to AWS, make sure you have your credentials configured. And then run the following command from the project root to release the Lambda:
\nsls deploy --region eu-west-1 --stage prod\n
\nWith everything up and running, we can now upload a WebM audio file to the input bucket to schedule a transcoder job. Navigate to the S3 service in the AWS web console:
\nThis action will trigger an s3:ObjectCreated
event. AWS will execute the Lambda function we deployed in the previous step, and it will schedule a transcoder job.
To get more information about a scheduled job, navigate to the Elastic Transcoder service in the AWS web console. Click on \"Jobs\", select your pipeline and click \"Search\". Here you can select a job to get more details about it.
\n\nIf it has status \"Complete\", there should be a file named test.mp3
in the output bucket!
FFmpeg is a cross-platform solution that can be used to convert audio and video files. And since it's a binary, we'll use a Lambda Layer to execute it from the Lambda function.
\nLambda Layers allow us to \"pull in\" extra dependencies into Lambda functions. A layer is basically a ZIP archive that contains some code. And in order to use a layer we first must create and publish one.
\nAfter we publish a layer we can configure any Lambda function to use it[^4]. AWS will then extract the layer to a special directory called /opt
. And the Lambda function runtime will be able to execute it.
[^4]: At the time of this writing a Lambda function can use up to 5 layers at a time.
\nWe're basically \"swapping out\" Amazon Elastic Transcoder with FFmpeg. Other than that the flow is still the same.
\nSo since we're still converting a WebM audio file to MP3 whenever it's uploaded to the input bucket, we can reuse the Lambda from the previous implementation by making these changes:
\nWe'll apply these changes by going through the following steps:
\nThe Serverless Framework makes it very easy to work with layers. To get started create a new project named \"lambda-layers\". Move into this directory and create a Serverless manifest in the project root:
\nservice: lambda-layers\n\nprovider:\n name: aws\n runtime: nodejs10.x\n\npackage:\n exclude:\n - ./*\n include:\n - layers\n\nlayers:\n ffmpeg:\n path: layers\n description: FFmpeg binary\n compatibleRuntimes:\n - nodejs10.x\n licenseInfo: GPL v2+, for more info see https://github.com/FFmpeg/FFmpeg/blob/master/LICENSE.md\n
\nThe layer is named ffmpeg
and the path
property dictates that the layer code will reside in a directory named layers
. Match this structure in the project by creating that directory first.
Move into the layers
directory and download a static build of FFmpeg from johnvansickle.com/ffmpeg[^5].
[^5]: These FFmpeg builds are all compatible with Amazon Linux 2. This is the operating system on which Lambda runs when the Node.js
runtime is used.
Use the recommended ffmpeg-git-amd64-static.tar.xz
master build:
curl -O https://johnvansickle.com/ffmpeg/builds/ffmpeg-git-amd64-static.tar.xz\n
\nExtract the files from the downloaded archive:
\ntar -xvf ffmpeg-git-amd64-static.tar.xz\n
\nRemove the downloaded archive:
\nrm ffmpeg-git-amd64-static.tar.xz\n
\nAnd rename the extracted directory to ffmpeg
, so it matches the configured layer name in the Serverless manifest. For example:
mv ffmpeg-git-20191029-amd64-static ffmpeg\n
\nYou should now have the following files and folder structure:
\nlambda-layers\n ├── layers\n │ └── ffmpeg\n │ ├── GPLv3.txt\n │ ├── ffmpeg\n │ ├── ffprobe\n │ ├── manpages\n │ ├── model\n │ ├── qt-faststart\n │ └── readme.txt\n └── serverless.yml\n
\nPublish the layer by running the following command from the project root:
\nsls deploy --region eu-west-1 --stage prod\n
\nWhen Serverless finishes deploying, navigate to the Lambda service in the AWS web console and click on \"Layers\". Here you should see the published layer. Click on it and take note of the ARN. We'll need it in the next step.
\n\nWe'll now be modifying the manifest file of the audio-transcoder
project.
First change the environment variables, and add the names of your input and output buckets. Then change the IAM permissions so the Lambda function can read from the input bucket and write to the output bucket. And finally, change the Lambda function to use the FFmpeg layer with the ARN from the previous step:
\nservice: audio-transcoder\n\nprovider:\n name: aws\n runtime: nodejs10.x\n environment:\n S3_INPUT_BUCKET_NAME: \"raw.recordings\"\n S3_OUTPUT_BUCKET_NAME: \"transcoded.recordings\"\n iamRoleStatements:\n - Effect: Allow\n Action:\n - s3:GetObject\n Resource: arn:aws:s3:::raw.recordings/*\n - Effect: Allow\n Action:\n - s3:PutObject\n Resource: arn:aws:s3:::transcoded.recordings/*\n\npackage:\n exclude:\n - ./*\n - ./**/*.test.js\n include:\n - node_modules\n - src\n\nfunctions:\n transcodeToMp3:\n handler: src/handler.transcodeToMp3\n description: Transcode an audio file to MP3\n events:\n - s3:\n bucket: \"raw.recordings\"\n event: \"s3:ObjectCreated:*\"\n existing: true\n layers:\n - YOUR_FFMPEG_LAYER_ARN # Replace this with the ARN from step 1\n
\nSince we have to read from the input bucket and write to the output bucket, replace the Elastic Transcoder client with the S3 client. And use the decodedKey
to get the WebM recording from the input bucket:
\"use strict\"\n\nconst S3 = require(\"aws-sdk/clients/s3\")\nconst { S3_INPUT_BUCKET_NAME, S3_OUTPUT_BUCKET_NAME } = process.env\nconst s3Client = new S3()\n\nmodule.exports.transcodeToMp3 = async (event) => {\n try {\n for (const Record of event.Records) {\n const { s3 } = Record\n if (!s3) {\n continue\n }\n\n const { object: s3Object = {} } = s3\n const { key } = s3Object\n if (!key) {\n continue\n }\n\n const decodedKey = decodeURIComponent(key)\n const webmRecording = await s3Client\n .getObject({\n Bucket: S3_INPUT_BUCKET_NAME,\n Key: decodedKey,\n })\n .promise()\n }\n } catch (err) {\n console.log(\"Transcoder Error: \", err)\n }\n}\n
\nThe S3 client returns an object that contains a Body
property. The value of Body
 is a blob (a Buffer in Node.js), which we'll feed to the FFmpeg layer to convert it to MP3.
We'll do this via a helper function that will spawn a synchronous child process which allows us to execute the ffmpeg
\"command\" (provided by the FFmpeg layer):
\"use strict\"\n\nconst { spawnSync } = require(\"child_process\")\n\nmodule.exports = {\n convertWebmToMp3(webmBlob) {\n spawnSync(\n \"/opt/ffmpeg/ffmpeg\", // \"/opt/:LAYER_NAME/:BINARY_NAME\"\n [\n // FFmpeg command arguments go here.\n ],\n { stdio: \"inherit\" }\n )\n\n // Rest of the implementation goes here.\n },\n}\n
\nThe ffmpeg
command requires the file system to do its magic. And we'll use a \"special\" directory called /tmp
[^6] for this.
[^6]: At the time of this writing the /tmp
directory allows you to temporarily store up to 512 MB.
First write the WebM blob to /tmp
so FFmpeg can read it. And then tell it to write the produced MP3 file back to the same directory:
\"use strict\"\n\nconst { spawnSync } = require(\"child_process\")\nconst { writeFileSync } = require(\"fs\")\n\nmodule.exports = {\n convertWebmToMp3(webmBlob) {\n const now = Date.now()\n const input = `/tmp/${now}.webm`\n const output = `/tmp/${now}.mp3`\n\n writeFileSync(input, webmBlob)\n\n spawnSync(\"/opt/ffmpeg/ffmpeg\", [\"-i\", input, output], {\n stdio: \"inherit\",\n })\n\n // TODO: cleanup and return MP3 blob.\n },\n}\n
\nNow read the produced MP3 file from disk, clean /tmp
, and return the MP3 blob:
\"use strict\"\n\nconst { spawnSync } = require(\"child_process\")\nconst { readFileSync, writeFileSync, unlinkSync } = require(\"fs\")\n\nmodule.exports = {\n convertWebmToMp3(webmBlob) {\n const now = Date.now()\n const input = `/tmp/${now}.webm`\n const output = `/tmp/${now}.mp3`\n\n writeFileSync(input, webmBlob)\n\n spawnSync(\"/opt/ffmpeg/ffmpeg\", [\"-i\", input, output], {\n stdio: \"inherit\",\n })\n\n const mp3Blob = readFileSync(output)\n\n unlinkSync(input)\n unlinkSync(output)\n\n return mp3Blob\n },\n}\n
\nFinally, use the MP3 blob in the handler to write it to the output bucket:
\n\"use strict\"\n\nconst S3 = require(\"aws-sdk/clients/s3\")\nconst ffmpeg = require(\"./ffmpeg\")\nconst { S3_INPUT_BUCKET_NAME, S3_OUTPUT_BUCKET_NAME } = process.env\nconst s3Client = new S3()\n\nmodule.exports.transcodeToMp3 = async (event) => {\n try {\n for (const Record of event.Records) {\n const { s3 } = Record\n if (!s3) {\n continue\n }\n\n const { object: s3Object = {} } = s3\n const { key } = s3Object\n if (!key) {\n continue\n }\n\n const decodedKey = decodeURIComponent(key)\n const webmRecording = await s3Client\n .getObject({\n Bucket: S3_INPUT_BUCKET_NAME,\n Key: decodedKey,\n })\n .promise()\n\n const mp3Blob = ffmpeg.convertWebmToMp3(webmRecording.Body)\n await s3Client\n .putObject({\n Bucket: S3_OUTPUT_BUCKET_NAME,\n Key: decodedKey.replace(\"webm\", \"mp3\"),\n ContentType: \"audio/mpeg\",\n Body: mp3Blob,\n })\n .promise()\n }\n } catch (err) {\n console.log(\"Transcoder Error: \", err)\n }\n}\n
\nRun the same command as before from the project root to release the Lambda:
\nsls deploy --region eu-west-1 --stage prod\n
\nWhen Serverless is done deploying, upload another WebM audio file to the input bucket.
\nBut nothing happens... Where's the MP3 file?
\nLet's find out why by checking the Lambda function's logs in the AWS web console. Open the logs of the audio-transcoder-prod-transcodeToMp3 function.
\nHere you should see the log output of the Lambda function.
\n\nThe logs tell us that FFmpeg is executing (hooray!) but that it doesn't complete (boo!).
\nIn the middle of the transcoding process the logs just say END
. And on the last line we see that the Lambda had a duration of 6006.17 ms
.
What's happening? The Lambda function takes \"too long\" to finish executing. By default Lambda has a timeout of 6 seconds[^7]. And after 6 seconds the Lambda function is still not done transcoding, so AWS terminates it.
\n[^7]: At the time of this writing the maximum timeout is 900 seconds.
\nHow do we solve this? By optimizing the Lambda function!
\nFirst let's just set the timeout to a larger value. For example 180 seconds. This way we can see how long it would actually take to complete the transcoding process:
\nfunctions:\n transcodeToMp3:\n timeout: 180\n
\nDeploy again. When Serverless is done, upload another WebM audio file, and check the logs.
\n\nThis time we see FFmpeg completes the transcoding process and that the Lambda had a duration of 7221.95 ms
. If we check the output bucket now, we'll see the MP3 file!
Transcoding the audio file in ~7 seconds isn't bad. Actually, it's very similar to Amazon Elastic Transcoder. But we can do better.
\nSomething that's very important when working with Lambda is to always performance tune your functions. In other words, always make sure that a Lambda function has the optimal memory size configured.
\nThis is important because when you choose a higher memory setting, AWS will also give you an equivalent resource boost (like CPU). And this will usually positively impact the Lambda function's runtime duration. Which means you'll pay less money.
\nBy default a Lambda function has a memory setting of 128 MB. So let's increase it and compare results. A good strategy is usually to keep doubling the memory and measuring the duration. But for the sake of brevity, I'm jumping ahead to 2048 MB:
\nfunctions:\n transcodeToMp3:\n memorySize: 2048\n
\nDeploy again. And when Serverless is done, upload another WebM audio file and check the logs.
\n\nGreat, it's even faster now! Does this mean we can just keep increasing the memory and reap the benefits? Sadly, no. There's a tipping point where increasing the memory wont make it run faster.
\nFor example, increasing the memory to 3008 MB (the maximum memory limit at the time of this writing) will result in a similar runtime duration:

| Test run | Duration | Billed Duration | Cold Start Duration |
| --- | --- | --- | --- |
| 1 | 3775,63 ms | 3800 ms | 392,59 ms |
| 2 | 3604,71 ms | 3700 ms | - |
| 3 | 3682,62 ms | 3700 ms | - |
| 4 | 3677,14 ms | 3700 ms | - |
| 5 | 3725,77 ms | 3800 ms | - |

| Test run | Duration | Billed Duration | Cold Start Duration |
| --- | --- | --- | --- |
| 1 | 4125,12 ms | 4200 ms | 407,92 ms |
| 2 | 3767,79 ms | 3800 ms | - |
| 3 | 3736,06 ms | 3800 ms | - |
| 4 | 3662,68 ms | 3700 ms | - |
| 5 | 3717,01 ms | 3800 ms | - |
When done optimizing, make sure to apply a sensible value for the Lambda timeout. In this case, the default of 6 seconds would be a good one.
\nTo compare costs between both implementations, I did a couple of test runs converting a 3 minute (2,8 MB) WebM audio file to MP3.
\nThe following comparison is by no means very extensive, and your mileage may vary. But I think it's good enough to get a decent impression of the cost range.
\nThe pricing page tells us we pay per minute (with 20 free minutes every month). And when we only transcode audio in region eu-west-1, we'll currently pay $0,00522 per minute of transcoding time.
These are the timing results of the test runs:
\nTest run | \nTranscoding Time | \n
---|---|
1 | \n7638 ms | \n
2 | \n6663 ms | \n
3 | \n7729 ms | \n
4 | \n6595 ms | \n
5 | \n8752 ms | \n
6 | \n7216 ms | \n
7 | \n7167 ms | \n
8 | \n6605 ms | \n
9 | \n6718 ms | \n
10 | \n8700 ms | \n
So the average transcoding time of the audio file would be:
\n7638 + 6663 + 7729 + 6595 + 8752 + 7216 + 7167 + 6605 + 6718 + 8700 = 73 783 ms\n\n73783 / 10 = 7378,3 ms\n\n7378,3 / 1000 = 7,3783 sec\n
\nLet's say we're transcoding 100 000
of these audio files per month, that would amount to a total transcoding time of:
7,3783 * 100 000 = 737 830 sec\n\n737 830 / 60 = 12 297,166 666 667 min\n
\nSince we pay $0,00522
per minute, the costs without free tier would be:
12 297,166 666 667 * 0,00522 = $64,191 21\n
\nAnd with free tier it would cost:
\n(12 297,166 666 667 - 20) * 0,00522 = $64,086 81\n
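\nAs a sanity check, here's a small Node.js sketch (not part of the project code) that reproduces these numbers with the per-minute price and free minutes quoted above:
\nconst PRICE_PER_MINUTE = 0.00522 // eu-west-1 audio transcoding price used in this post\nconst FREE_MINUTES = 20\n\nfunction elasticTranscoderCost(avgSeconds, jobsPerMonth) {\n  const minutes = (avgSeconds * jobsPerMonth) / 60\n  return Math.max(minutes - FREE_MINUTES, 0) * PRICE_PER_MINUTE\n}\n\nconsole.log(elasticTranscoderCost(7.3783, 100000)) // ~64.09 (the 20 free minutes subtracted)\n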
\nWe're using Lambda to schedule Amazon Elastic Transcoder jobs. So we also have to calculate those (minor if not negligible) costs.
\nThe Lambda pricing page tells us we pay for the number of requests and the duration (which depends on memory setting).
\nWe get 1 million requests for free every month, and after that you pay $0,20 per 1 million requests. Since we're only doing 1/10th of that in this example, I'm not including request costs in the calculations. I'm only focusing on duration costs here.
These are the Lambda durations (with 128 MB memory) for the accompanying transcoder test runs:
\nTest run | \nDuration | \nBilled Duration | \nCold Start Duration | \n
---|---|---|---|
1 | \n494,08 ms | \n500 ms | \n401,61 ms | \n
2 | \n185,01 ms | \n200 ms | \n- | \n
3 | \n168,29 ms | \n200 ms | \n- | \n
4 | \n165,29 ms | \n200 ms | \n- | \n
5 | \n184,89 ms | \n200 ms | \n- | \n
6 | \n210,19 ms | \n300 ms | \n- | \n
7 | \n162,64 ms | \n200 ms | \n- | \n
8 | \n178,79 ms | \n200 ms | \n- | \n
9 | \n318,84 ms | \n400 ms | \n- | \n
10 | \n206,18 ms | \n300 ms | \n- | \n
The average billed duration would be:
\n500 + 200 + 200 + 200 + 200 + 300 + 200 + 200 + 400 + 300 = 2700 ms\n\n2700 / 10 = 270 ms\n\n270 / 1000 = 0,27 sec\n
\nIn region eu-west-1, we'll currently pay $0,000 016 6667 for every gigabyte-second (GB-s), i.e. the function's memory in GB multiplied by its billed duration in seconds. That means we first have to calculate "how much" memory the Lambda function uses for its runtime duration.
For 100 000
transcoding jobs per month (with 128 MB memory) that would be:
100 000 * 0,27 = 27000 sec\n\n(128 / 1024) * 27000 = 3375 GB/sec\n
\nCurrently you get 400 000 GB-s for free every month, so depending on your scale you may or may not have to include it in your calculations. But without the free tier it would cost:
3375 * 0,000 016 6667 = $0,056 250 113\n
\nThese are the Lambda durations (with 2048 MB memory) of the test runs:

| Test run | Duration | Billed Duration | Cold Start Duration |
| --- | --- | --- | --- |
| 1 | 4068,56 ms | 4100 ms | 408,17 ms |
| 2 | 3880,55 ms | 3900 ms | - |
| 3 | 3910,52 ms | 4000 ms | - |
| 4 | 3794,20 ms | 3800 ms | - |
| 5 | 3856,73 ms | 3900 ms | - |
| 6 | 3859,06 ms | 3900 ms | - |
| 7 | 3810,93 ms | 3900 ms | - |
| 8 | 3799,19 ms | 3800 ms | - |
| 9 | 3858,49 ms | 3900 ms | - |
| 10 | 3866,53 ms | 3900 ms | - |
The average billed duration would be:
\n4100 + 3900 + 4000 + 3800 + 3900 + 3900 + 3900 + 3800 + 3900 + 3900 = 39100 ms\n\n39100 / 10 = 3910 ms\n\n3910 / 1000 = 3,91 sec\n
\nIn region eu-west-1, we'll currently pay $0,000 016 6667 for every GB-s. For 100 000 transcoding jobs (with 2048 MB memory) that would be:
\n100 000 * 3,91 = 391 000 sec\n\n(2048 / 1024) * 391 000 = 782 000 GB-s\n
\nWithout free tier it would cost:
\n782 000 * 0,000 016 6667 = $13,033 3594\n
\nWith free tier it would cost:
\n(782 000 - 400 000) * 0,000 016 6667 = $6,366 6794\n
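\nAgain as a sanity check, here's a small Node.js sketch (not part of the project code) that reproduces the Lambda duration costs above:
\nconst PRICE_PER_GB_SECOND = 0.0000166667\n\nfunction lambdaDurationCost({ memoryMb, billedSeconds, invocations, freeTierGbSeconds = 0 }) {\n  const gbSeconds = (memoryMb / 1024) * billedSeconds * invocations\n  return Math.max(gbSeconds - freeTierGbSeconds, 0) * PRICE_PER_GB_SECOND\n}\n\n// 128 MB with an average billed duration of 0,27 sec: ~$0,056\nconsole.log(lambdaDurationCost({ memoryMb: 128, billedSeconds: 0.27, invocations: 100000 }))\n\n// 2048 MB with an average billed duration of 3,91 sec and the free tier applied: ~$6,37\nconsole.log(\n  lambdaDurationCost({ memoryMb: 2048, billedSeconds: 3.91, invocations: 100000, freeTierGbSeconds: 400000 })\n)\n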
\n<blockquote>\n<p>Data transferred between S3, Glacier, DynamoDB, SES, SQS, Kinesis, ECR, SNS, or SimpleDB and Lambda functions in the same AWS Region is free.</p>
\n<cite>\n<p>AWS Lambda: Pricing</p>\n</cite>\n</blockquote>
\nOtherwise, data transferred into and out of Lambda functions will be charged at the EC2 data transfer rates as listed under the “Data transfer” section.
\nCosts of transcoding 100 000 3 minute (2,8 MB) WebM audio files to MP3 per month:

| Implementation | Cost without free tier | Cost with free tier |
| --- | --- | --- |
| Amazon Elastic Transcoder | ~ $64 | ~ $64 |
| FFmpeg and Lambda Layers | ~ $13 | ~ $6 |
That's a wrap! The post turned out a bit longer than expected, but hopefully it will prove useful in your transcoding adventures.
\nHappy transcoding!
\n","date_published":"2019-10-27T00:00:00.000Z","date_modified":"2023-02-03T00:00:00.000Z","tags":["audio","aws","elastic-transcoder","ffmpeg","lambda","lambda-layers","nodejs","serverless-framework","transcoding","tutorial"]},{"id":"https://www.danillouz.dev/posts/serverless-auth/","url":"https://www.danillouz.dev/posts/serverless-auth/","title":"Serverless auth","summary":"Protecting AWS API Gateway endpoints with AWS Lambda and Auth0.","content_html":"import { Image } from \"astro:assets\"
\nAuth is complicated. It can be difficult to reason about and can be hard to work with. The terminology can be complex as well, and terms are sometimes used interchangeably or can be ambiguous. Like saying \"auth\" to refer both to authentication (who are you?) and authorization (I know who you are, but what are you allowed to do?).
\nOn top of that it can also be challenging to know when to use what. Depending on what you're building and for whom, different auth protocols and strategies might be more suitable or required.
\nIn this post I won't be exploring these protocols and strategies in depth. Instead, I want to show that implementing something as complex as auth doesn't have to be too difficult. In order to do that I'll focus on a specific (but common) use case, and show a way to implement it.
\nIf you just want to read the code, have a look at github.com/danillouz/serverless-auth.
\nHow can we secure an HTTP API with a token based authentication strategy, so only authenticated and authorized clients can access it?
\nMore specifically:
\nI'll be using Auth0 as a third-party auth provider. This means that I'm choosing not to build (nor operate!) my own \"auth server\". So before we get started, I think it's important to explain the motivation behind this decision.
\nIn order to build an auth server you could use:
\nHeader
, Payload
and Signature
which are Base64 encoded and separated by a period. In effect, a JWT can be used as a bearer token[^1].[^1]: You can see how a JWT looks like by visiting jwt.io.
\nAnd with perhaps the help of some other tools and libraries you might be confident enough to build an auth server yourself. But I think that in most cases you shouldn't go down this route[^2]. Why not? Because it will cost a lot of time, energy and money to build, operate and maintain it.
\n[^2]: However, building an auth service yourself is a great learning experience. I think it's quite fun and challenging. And more importantly, you'll get a deeper understanding of the subject, which will be very helpful when you're navigating the \"documentation jungle\" of your favorite auth provider.
\nIf you do have a valid use case, plus enough resources, time and knowledge to build your own auth server, it might make sense for you. But I think that in most cases you should use a third party auth provider instead. Like AWS Cognito or Auth0.
\nThird-party auth providers give you all the fancy tooling, scalable infrastructure and resources you will need to provide a secure, reliable, performant and usable solution. Sure, you'll have to pay for it. But I think the pricing is typically fair. And it will most likely be a small fraction of what it would cost when you'd roll your own solution.
\nAnother sometimes overlooked benefit of choosing \"buy over build\", is that you'll get access to the domain expert's knowledge. Where they can advise and help you choose the best auth strategy for your use case.
\nAnd last but not least. By having someone else deal with the complexities and challenges of auth, you can focus on building your product!
\nOkay, let's get started.
\nWe'll build an Account API with a single endpoint that returns some profile information for a user.
\nRequirements and constraints are:
\nGET /profile
.name
with value Daniël
.200
.Authorization
request header.Authorization
request header value must have the format: Bearer <TOKEN>
.This API isn't very useful, but gives us something to work with in order to implement auth.
\nGET /profile\nAuthorization: Bearer eyJ...lKw\n
\n200 OK\nContent-Type: application/json\n\n{\n \"name\": \"Daniël\"\n}\n
\nWhen the Account API receives a request with the bearer token, it will have to verify the token with the help of Auth0. In order to do that, we first have to register our API with them:
\nAccount API
and https://api.danillouz.dev/account
[^3].RS256
as the signing algorithm (more on that later).[^3]: The \"Identifier\" doesn't have to be a \"real\" endpoint.
\n\nNow that our API is registered, we need to take note of the following (public) properties, to later on configure our Lambda Authorizer:
\nhttps://TENANT_NAME.REGION.auth0.com
. For example https://danillouz.eu.auth0.com
.https://TENANT_NAME.REGION.auth0.com/.well-known/jwks.json
. For example https://danillouz.eu.auth0.com/.well-known/jwks.json
.https://api.danillouz.dev/account
.You can also find these values under the \"Quick Start\" tab of the API details screen (you were redirected there after registering the API). For example, click on the \"Node.js\" tab and look for these properties:
\nissuer
jwksUri
audience
I haven't explained what a Lambda Authorizer is yet. In short, it's a feature of APIG to control access to an API.
\n<blockquote>\n<p>A Lambda authorizer is useful if you want to implement a custom authorization scheme that uses a bearer token authentication strategy such as OAuth.</p>
\n<cite>\n<p>AWS docs: Use API Gateway Lambda authorizers</p>\n</cite>\n</blockquote>
\nThere are actually two types of Lambda Authorizers: token based authorizers (type TOKEN) and request parameter based authorizers (type REQUEST).
\nWe'll be using the token based authorizer, because that supports bearer tokens.
\nWhen a Lambda Authorizer is configured, and a client makes a request to APIG, AWS will invoke the Lambda Authorizer first (i.e. before the Lambda handler). The Lambda Authorizer must then extract the bearer token from the Authorization
request header and validate it by:
[^4]: We get the JWKS URI, issuer and audience values from the Lambda Authorizer configuration.
\nOnly when the token passes these checks should the Lambda Authorizer return an IAM Policy document with \"Effect\"
set to \"Allow\"
:
{\n \"Version\": \"2012-10-17\",\n \"Statement\": [\n {\n \"Action\": \"execute-api:Invoke\",\n \"Effect\": \"Allow\",\n \"Resource\": \"ARN_OF_LAMBDA_HANDLER\"\n }\n ]\n}\n
\nIt's this policy that tells APIG it's allowed to invoke our downstream Lambda handler. In our case that will be the Lambda handler that returns the profile data.
\nAlternatively, the Lambda authorizer may deny invoking the downstream handler by setting \"Effect\"
to \"Deny\"
:
{\n \"Version\": \"2012-10-17\",\n \"Statement\": [\n {\n \"Action\": \"execute-api:Invoke\",\n \"Effect\": \"Deny\",\n \"Resource\": \"ARN_OF_LAMBDA_HANDLER\"\n }\n ]\n}\n
\nThis will make APIG respond with 403 Forbidden
. To make APIG respond with 401 Unauthorized
, return an Unauthorized
error from the Lambda Authorizer. We'll see this in action when implementing the Lambda Authorizer.
I found it good practice to only authenticate the caller from the Lambda Authorizer and apply authorization logic downstream (i.e. in the Lambda handlers).
\nThis may not be feasible in all use cases, but doing this keeps the Lambda Authorizer simple. So I think that ideally the Lambda Authorizer is only responsible for:
\nThe downstream Lambda handler can then use the authorization information to decide if it should execute its business logic for the specific caller or not.
\nFollowing this design also leads to a nice decoupling between the authentication and authorization logic (i.e. between the Lambda Authorizer and Lambda handlers).
\nWhen using OAuth 2.0, scopes can be used to apply authorization logic. In our case we could have a get:profile
scope. And a Lambda handler can check if the caller has been authorized to perform the action that is represented by the scope. If the scope is not present, the Lambda handler can return a 403 Forbidden
response back to the caller.
You can configure scope in the Auth0 dashboard by adding permissions to the registered API. Navigate to the \"Permissions\" tab of the API details screen and add get:profile
as a scope.
We'll use this scope when implementing the Account API. And you can read more about scopes in the Auth0 docs.
\nYou can propagate authorization information (like scopes) downstream by returning a context
object in the Lambda Authorizer's response:
\"use strict\"\n\nmodule.exports.authorizer = (event) => {\n const authResponse = {\n principalId: \"UNIQUE_ID\",\n policyDocument: {\n Version: \"2012-10-17\",\n Statement: [\n {\n Action: \"execute-api:Invoke\",\n Effect: \"Allow\",\n Resource: event.methodArn,\n },\n ],\n },\n context: {\n scope: \"get:profile\",\n },\n }\n\n return authResponse\n}\n
\nBut there's a caveat here. You can not set a JSON serializable object or array as a valid value of any key in the context
object. It can only be a String
, Number
or Boolean
:
context: {\n a: 'value', // ✅ OK\n b: 1, // ✅ OK\n c: true, // ✅ OK\n d: [9, 8, 7], // ❌ Will NOT be serialized\n e: { x: 'value', y: 99, z: false } // ❌ Will NOT be serialized\n}\n
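\nIf you do need to pass structured data downstream, a common workaround (plain JSON serialization, not a special API feature) is to stringify it in the Lambda Authorizer and parse it again in the handler:
\n// In the Lambda Authorizer response:\ncontext: {\n  scopes: JSON.stringify([\"get:profile\", \"update:profile\"]), // ✅ serialized to a String\n}\n\n// And in the downstream Lambda handler:\nconst scopes = JSON.parse(event.requestContext.authorizer.scopes)\n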
\nAny \"valid\" properties passed to the context
object will be made available to downstream Lambda handlers via the event
object:
\"use strict\"\n\nmodule.exports.handler = (event) => {\n const { authorizer } = event.requestContext\n console.log(authorizer.scope) // \"get:profile\"\n}\n
\nWith that covered, we're ready to build the Lambda Authorizer and the Account API. But before we do, let's take a step back and solidify our mental model first.
\nTo summarize, we need the following components to protect our API:
\nGET /profile
endpoint to return the profile data.curl
as the client to send HTTP requests to the API with a token.We can visualize how these components will interact with each other like this.
\n\ncurl
will send an HTTP request to the GET /profile
endpoint with a token via the Authorization
request header.
When the HTTP request reaches APIG, it will check if a Lambda Authorizer is configured for the called endpoint. If so, APIG will invoke the Lambda Authorizer.
\nThe Lambda Authorizer will then:
\nAuthorization
request header.If the token is verified, the Lambda Authorizer will return an IAM Policy document with Effect
set to Allow
.
APIG will now evaluate the IAM Policy and if the Effect
is set to Allow
, it will invoke the specified Lambda handler.
The Lambda handler will execute and when the get:profile
scope is present, it will return the profile data back to the client.
Now for the easy part, writing the code!
\nWe'll do this by:
\nCreate a new directory for the code:
\nmkdir lambda-authorizers\n
\nMove to this directory and initialize a new npm project with:
\nnpm init -y\n
\nThis creates a package.json
file. Now you can install the following required npm dependencies:
npm i jsonwebtoken jwks-rsa\n
\nThe jsonwebtoken library will help us decode the bearer token (a JWT) and verify its signature, issuer and audience claims. The jwks-rsa library will help us fetch the JWKS from Auth0.
\nWe'll use the Serverless Framework to configure and upload the Lambda to AWS, so install it as a dev dependency:
\nnpm i -D serverless\n
\nCreate a Serverless manifest:
\nservice: lambda-authorizers\n\nprovider:\n name: aws\n runtime: nodejs8.10\n stage: ${opt:stage, 'prod'}\n region: ${opt:region, 'eu-central-1'}\n memorySize: 128\n timeout: 3\n\npackage:\n exclude:\n - ./*\n - ./**/*.test.js\n include:\n - node_modules\n - src\n
\nAdd the properties we got from the Lambda Authorizer configuration as environment variables. For example:
\nservice: lambda-authorizers\n\nprovider:\n name: aws\n runtime: nodejs8.10\n stage: ${opt:stage, 'prod'}\n region: ${opt:region, 'eu-central-1'}\n memorySize: 128\n timeout: 3\n environment:\n JWKS_URI: \"https://danillouz.eu.auth0.com/.well-known/jwks.json\"\n TOKEN_ISSUER: \"https://danillouz.eu.auth0.com/\"\n AUDIENCE: \"https://api.danillouz.dev/account\"\n\npackage:\n exclude:\n - ./*\n - ./**/*.test.js\n include:\n - node_modules\n - src\n
\nAnd add the Lambda function definition:
\nservice: lambda-authorizers\n\nprovider:\n name: aws\n runtime: nodejs8.10\n stage: ${opt:stage, 'prod'}\n region: ${opt:region, 'eu-central-1'}\n memorySize: 128\n timeout: 3\n environment:\n JWKS_URI: \"https://danillouz.eu.auth0.com/.well-known/jwks.json\"\n TOKEN_ISSUER: \"https://danillouz.eu.auth0.com/\"\n AUDIENCE: \"https://api.danillouz.dev/account\"\n\npackage:\n exclude:\n - ./*\n - ./**/*.test.js\n include:\n - node_modules\n - src\n\nfunctions:\n auth0VerifyBearer:\n handler: src/auth0.verifyBearer\n description: Verifies the bearer token with the help of Auth0\n
\nIn order to match the Lambda function definition in the Serverless manifest, create a file named auth0.js
in src
. And in that file export a method named verifyBearer
:
\"use strict\"\n\nmodule.exports.verifyBearer = async () => {\n try {\n // Lambda Authorizer implementation goes here.\n } catch (err) {\n console.log(\"Authorizer Error: \", err)\n throw new Error(\"Unauthorized\")\n }\n}\n
\nIf something goes wrong in the Lambda, we'll log the error and throw a new Unauthorized
error. This will make APIG return a 401 Unauthorized
response back to the caller[^5].
[^5]: The thrown error message must match the string \"Unauthorized\"
exactly for this to work.
The Lambda will first have to get the bearer token from the Authorization
request header. Create a helper function for that in src/get-token.js
. And in that file export a function named getToken
:
\"use strict\"\n\nmodule.exports = function getToken(event) {\n if (event.type !== \"TOKEN\") {\n throw new Error('Authorizer must be of type \"TOKEN\"')\n }\n\n const { authorizationToken: bearer } = event\n if (!bearer) {\n throw new Error('Authorization header with \"Bearer TOKEN\" must be provided')\n }\n\n const [, token] = bearer.match(/^Bearer (.*)$/) || []\n if (!token) {\n throw new Error(\"Invalid bearer token\")\n }\n\n return token\n}\n
\nHere we're only interested in TOKEN
events because we're implementing a token based authorizer. And we can access the value of the Authorization
request header via the event.authorizationToken
property.
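\nFor reference, the input event for a token based authorizer looks roughly like this (the values below are illustrative):
\n// Example \"TOKEN\" authorizer event:\nconst event = {\n  type: \"TOKEN\",\n  authorizationToken: \"Bearer eyJ...lKw\",\n  methodArn: \"arn:aws:execute-api:eu-central-1:123456789012:abcdef1234/prod/GET/profile\",\n}\n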
Then require
and call the helper in the Lambda with the APIG HTTP input event as an argument:
\"use strict\"\n\nconst getToken = require(\"./get-token\")\n\nmodule.exports.verifyBearer = async (event) => {\n try {\n const token = getToken(event)\n } catch (err) {\n console.log(\"Authorizer Error: \", err)\n throw new Error(\"Unauthorized\")\n }\n}\n
\nNow we have the token, we need to verify it by:
\nWe'll use another helper function for this. Create one in src/verify-token.js
, and export a function named verifyToken
:
\"use strict\"\n\nmodule.exports = async function verifyToken(\n token,\n decodeJwt,\n getSigningKey,\n verifyJwt,\n issuer,\n audience\n) {\n // Step 1.\n const decoded = decodeJwt(token, { complete: true })\n\n if (!decoded || !decoded.header || !decoded.header.kid) {\n throw new Error(\"Invalid JWT\")\n }\n\n // Step 2.\n const { publicKey, rsaPublicKey } = await getSigningKey(decoded.header.kid)\n const signingKey = publicKey || rsaPublicKey\n\n // Step 3.\n return verifyJwt(token, signingKey, {\n issuer,\n audience,\n })\n}\n
\nAfter we decode the token with the option { complete: true }
, we can access the JWT header
data. And by using the kid JWT claim, we can find out which key was used to sign the token.
When we registered the API with Auth0 we chose the RS256
signing algorithm. This algorithm generates an asymmetric signature. Which basically means that Auth0 uses a private key to sign a JWT when it issues one. And we can use a public key (fetched via the JWKS URI) to verify the authenticity of the token.
First require the helper in the Lambda and pass the token
as the first argument when calling it:
\"use strict\"\n\nconst getToken = require(\"./get-token\")\nconst verifyToken = require(\"./verify-token\")\n\nmodule.exports.verifyBearer = async (event) => {\n try {\n const token = getToken(event)\n const verifiedData = await verifyToken(token)\n } catch (err) {\n console.log(\"Authorizer Error: \", err)\n throw new Error(\"Unauthorized\")\n }\n}\n
\nTo decode the token in the helper (step 1), we'll use the jsonwebtoken
library. It exposes a decode
method. Pass this method as the second argument when calling the helper:
\"use strict\"\n\nconst jwt = require(\"jsonwebtoken\")\n\nconst getToken = require(\"./get-token\")\nconst verifyToken = require(\"./verify-token\")\n\nmodule.exports.verifyBearer = async (event) => {\n try {\n const token = getToken(event)\n const verifiedData = await verifyToken(token, jwt.decode)\n } catch (err) {\n console.log(\"Authorizer Error: \", err)\n throw new Error(\"Unauthorized\")\n }\n}\n
\nTo fetch the public key from Auth0 (step 2) we'll use the jwks-rsa
library. It exposes a client with getSigningKey
 method to fetch the key. Pass a "promisified" version of this method as the third argument when calling the helper:
\"use strict\"\n\nconst util = require(\"util\")\nconst jwt = require(\"jsonwebtoken\")\nconst jwksRSA = require(\"jwks-rsa\")\n\nconst getToken = require(\"./get-token\")\nconst verifyToken = require(\"./verify-token\")\n\nconst { JWKS_URI } = process.env\n\nconst jwksClient = jwksRSA({\n cache: true,\n rateLimit: true,\n jwksUri: JWKS_URI,\n})\nconst getSigningKey = util.promisify(jwksClient.getSigningKey)\n\nmodule.exports.verifyBearer = async (event) => {\n try {\n const token = getToken(event)\n const verifiedData = await verifyToken(token, jwt.decode, getSigningKey)\n } catch (err) {\n console.log(\"Authorizer Error: \", err)\n throw new Error(\"Unauthorized\")\n }\n}\n
\nFinally, to verify the token signature, issuer and audience claims (step 3) we'll use the jsonwebtoken
library again. It exposes a verify
method. Pass a \"promisified\" version of this method together with the TOKEN_ISSUER
and AUDIENCE
as the final arguments when calling the helper:
\"use strict\"\n\nconst util = require(\"util\")\nconst jwt = require(\"jsonwebtoken\")\nconst jwksRSA = require(\"jwks-rsa\")\n\nconst getToken = require(\"./get-token\")\nconst verifyToken = require(\"./verify-token\")\n\nconst { JWKS_URI, TOKEN_ISSUER, AUDIENCE } = process.env\n\nconst jwksClient = jwksRSA({\n cache: true,\n rateLimit: true,\n jwksUri: JWKS_URI,\n})\nconst getSigningKey = util.promisify(jwksClient.getSigningKey)\nconst verifyJwt = util.promisify(jwt.verify)\n\nmodule.exports.verifyBearer = async (event) => {\n try {\n const token = getToken(event)\n const verifiedData = await verifyToken(\n token,\n jwt.decode,\n getSigningKey,\n verifyJwt,\n TOKEN_ISSUER,\n AUDIENCE\n )\n } catch (err) {\n console.log(\"Authorizer Error: \", err)\n throw new Error(\"Unauthorized\")\n }\n}\n
\nWhen the helper verifies the token, it will return the JWT payload data (with all claims) as verifiedData
. For example:
{\n \"iss\": \"https://danillouz.eu.auth0.com/\",\n \"sub\": \"FHgLVARPk8oXjsP5utP8wYAnZePPAkw1@clients\",\n \"aud\": \"https://api.danillouz.dev/account\",\n \"iat\": 1560762850,\n \"exp\": 1560849250,\n \"azp\": \"FHgLVARPk8oXjsP5utP8wYAnZePPAkw1\",\n \"gty\": \"client-credentials\"\n}\n
\nWe'll use verifiedData
to create the authResponse
:
\"use strict\"\n\nconst util = require(\"util\")\nconst jwt = require(\"jsonwebtoken\")\nconst jwksRSA = require(\"jwks-rsa\")\n\nconst getToken = require(\"./get-token\")\nconst verifyToken = require(\"./verify-token\")\n\nconst { JWKS_URI, TOKEN_ISSUER, AUDIENCE } = process.env\n\nconst jwksClient = jwksRSA({\n cache: true,\n rateLimit: true,\n jwksUri: JWKS_URI,\n})\nconst getSigningKey = util.promisify(jwksClient.getSigningKey)\nconst verifyJwt = util.promisify(jwt.verify)\n\nmodule.exports.verifyBearer = async (event) => {\n try {\n const token = getToken(event)\n const verifiedData = await verifyToken(\n token,\n jwt.decode,\n getSigningKey,\n verifyJwt,\n TOKEN_ISSUER,\n AUDIENCE\n )\n const authResponse = {\n principalId: verifiedData.sub,\n policyDocument: {\n Version: \"2012-10-17\",\n Statement: [\n {\n Action: \"execute-api:Invoke\",\n Effect: \"Allow\",\n Resource: event.methodArn,\n },\n ],\n },\n }\n return authResponse\n } catch (err) {\n console.log(\"Authorizer Error: \", err)\n throw new Error(\"Unauthorized\")\n }\n}\n
\nThe authResponse.principalId
property must represent a unique (user) identifier associated with the token sent by the client. Auth0 provides this via the sub
claim and ours has the value:
{\n \"iss\": \"https://danillouz.eu.auth0.com/\",\n \"sub\": \"FHgLVARPk8oXjsP5utP8wYAnZePPAkw1@clients\", // Principal ID\n \"aud\": \"https://api.danillouz.dev/account\",\n \"iat\": 1560762850,\n \"exp\": 1560849250,\n \"azp\": \"FHgLVARPk8oXjsP5utP8wYAnZePPAkw1\",\n \"gty\": \"client-credentials\"\n}\n
\nNote that if you use an Auth0 test token (like we'll do in a bit), the sub
claim will be postfixed with @clients
. This is because Auth0 automatically created a \"Test Application\" for us when we registered the Account API with them. And it's via this application that we obtain the test token, obtained via the client credentials grant (specified by the gty
claim).
In this case the test application represents a \"machine\" and not a user. But that's okay because the machine has a unique identifier the same way a user would have (by means of a client ID). This means that this implementation will also work when using \"user centric\" auth flows like the implicit grant.
\nYou can find the test application in the Auth0 dashboard by navigating to \"Applications\" and selecting \"Account API (Test Application)\".
\n\nThe ARN of the Lambda handler associated with the called endpoint can be obtained from event.methodArn
. APIG will use this ARN to invoke said Lambda handler. In our case this will be the Lambda handler that gets the profile data.
Like mentioned when discussing scopes, Auth0 can provide scopes as authorization information. In order for Auth0 to do this, we need to \"grant\" our client the get:profile
scope. In our case, the client is the \"Test Application\" that has been created for us.
Navigate to the \"APIs\" tab in the \"Test Application\" details and click on the \"right pointing chevron\" (circled in red) to the right of \"Account API\".
\n\nThen check the get:profile
scope, click \"Update\" and click \"Continue\".
Now the configured scope will be a claim on issued test tokens, and part of the verifiedData
:
{\n \"iss\": \"https://danillouz.eu.auth0.com/\",\n \"sub\": \"FHgLVARPk8oXjsP5utP8wYAnZePPAkw1@clients\",\n \"aud\": \"https://api.danillouz.dev/account\",\n \"iat\": 1560762850,\n \"exp\": 1560849250,\n \"azp\": \"FHgLVARPk8oXjsP5utP8wYAnZePPAkw1\",\n \"scope\": \"get:profile\", // Scope is now a claim\n \"gty\": \"client-credentials\"\n}\n
\nSo we can propagate it to downstream Lambda handlers like this:
\n\"use strict\"\n\nconst util = require(\"util\")\nconst jwt = require(\"jsonwebtoken\")\nconst jwksRSA = require(\"jwks-rsa\")\n\nconst getToken = require(\"./get-token\")\nconst verifyToken = require(\"./verify-token\")\n\nconst { JWKS_URI, TOKEN_ISSUER, AUDIENCE } = process.env\n\nconst jwksClient = jwksRSA({\n cache: true,\n rateLimit: true,\n jwksUri: JWKS_URI,\n})\nconst getSigningKey = util.promisify(jwksClient.getSigningKey)\nconst verifyJwt = util.promisify(jwt.verify)\n\nmodule.exports.verifyBearer = async (event) => {\n try {\n const token = getToken(event)\n const verifiedData = await verifyToken(\n token,\n jwt.decode,\n getSigningKey,\n verifyJwt,\n TOKEN_ISSUER,\n AUDIENCE\n )\n const authResponse = {\n principalId: verifiedData.sub,\n policyDocument: {\n Version: \"2012-10-17\",\n Statement: [\n {\n Action: \"execute-api:Invoke\",\n Effect: \"Allow\",\n Resource: event.methodArn,\n },\n ],\n },\n context: {\n scope: verifiedData.scope, // Propagate scope downstream\n },\n }\n return authResponse\n } catch (err) {\n console.log(\"Authorizer Error: \", err)\n throw new Error(\"Unauthorized\")\n }\n}\n
\nFinally, add a release command to the package.json
:
{\n \"scripts\": {\n \"test\": \"echo \\\"Error: no test specified\\\" && exit 1\",\n \"release\": \"serverless deploy --stage prod\"\n },\n \"dependencies\": {\n \"jsonwebtoken\": \"^8.5.1\",\n \"jwks-rsa\": \"^1.5.1\"\n },\n \"devDependencies\": {\n \"serverless\": \"^1.45.1\"\n }\n}\n
\nAnd to upload the Lambda to AWS, sign up and make sure you have your credentials configured. Then release the Lambda by running npm run release
:
Serverless: Packaging service...\nServerless: Excluding development dependencies...\nServerless: Creating Stack...\nServerless: Checking Stack create progress...\nServerless: Stack create finished...\nServerless: Uploading CloudFormation file to S3...\nServerless: Uploading artifacts...\nServerless: Uploading service lambda-authorizers.zip file to S3...\nServerless: Validating template...\nServerless: Updating Stack...\nServerless: Checking Stack update progress...\nServerless: Stack update finished...\nService Information\n\nservice: lambda-authorizers\nstage: prod\nregion: eu-central-1\nstack: lambda-authorizers-prod\nresources: 5\napi keys:\n None\nendpoints:\n None\nfunctions:\n auth0VerifyBearer: lambda-authorizers-prod-auth0VerifyBearer\nlayers:\n None\n
\nNow go to the AWS Console and visit the \"Lambda\" service. Find lambda-authorizers-prod-auth0VerifyBearer
under \"Functions\" and take note of the ARN in the top right corner.
We'll need this to configure the Account API in the next part.
\nWe'll do this by:
\nSimilar to the Lambda Authorizer, create a new directory for the code:
\nmkdir account-api\n
\nMove to this directory and initialize a new npm project with:
\nnpm init -y\n
\nThis creates a package.json
file. Again, we'll use the Serverless Framework to configure and upload the Lambda to AWS, so install it as a dev dependency:
npm i -D serverless\n
\nCreate a Serverless manifest, and add the Lambda function definition for the GET /profile
endpoint handler:
service: account-api\n\nprovider:\n name: aws\n runtime: nodejs8.10\n stage: ${opt:stage, 'prod'}\n region: ${opt:region, 'eu-central-1'}\n memorySize: 128\n timeout: 3\n\npackage:\n exclude:\n - ./*\n - ./**/*.test.js\n include:\n - node_modules\n - src\n\nfunctions:\n getProfile:\n handler: src/handler.getProfile\n description: Gets the user profile data\n events:\n - http:\n path: /profile\n method: get\n
\nIn order to match the Lambda function definition in the Serverless manifest, create a file named handler.js
in src
. And in that file export a method named getProfile
:
\"use strict\"\n\nmodule.exports.getProfile = async () => {\n try {\n // Lambda handler implementation goes here.\n } catch (err) {\n const statusCode = err.code || 500\n return {\n statusCode,\n body: JSON.stringify({\n message: err.message,\n info: err.info,\n }),\n }\n }\n}\n
\nIf something goes wrong in the Lambda, we'll return an error response as HTTP output back to the caller.
\nOtherwise we'll return the profile data:
\n\"use strict\"\n\nmodule.exports.getProfile = async () => {\n try {\n const profileData = {\n name: \"Daniël\",\n }\n return {\n statusCode: 200,\n body: JSON.stringify(profileData),\n }\n } catch (err) {\n const statusCode = err.code || 500\n return {\n statusCode,\n body: JSON.stringify({\n message: err.message,\n info: err.info,\n }),\n }\n }\n}\n
\nBefore we enable auth, let's first release the API to see if we can call the endpoint.
\nAdd a release command to the package.json
:
{\n \"scripts\": {\n \"test\": \"echo \\\"Error: no test specified\\\" && exit 1\",\n \"release\": \"serverless deploy --stage prod\"\n },\n \"devDependencies\": {\n \"serverless\": \"^1.45.1\"\n }\n}\n
\nThen release the Lambda by running npm run release
:
Serverless: Packaging service...\nServerless: Excluding development dependencies...\nServerless: Creating Stack...\nServerless: Checking Stack create progress...\nServerless: Stack create finished...\nServerless: Uploading CloudFormation file to S3...\nServerless: Uploading artifacts...\nServerless: Uploading service account-api.zip file to S3...\nServerless: Validating template...\nServerless: Updating Stack...\nServerless: Checking Stack update progress...\nServerless: Stack update finished...\nService Information\n\nservice: account-api\nstage: prod\nregion: eu-central-1\nstack: account-api-prod\nresources: 10\napi keys:\n None\nendpoints:\n GET - https://9jwh.execute-api.eu-central-1.amazonaws.com/prod/profile\nfunctions:\n getProfile: account-api-prod-getProfile\nlayers:\n None\n
\nNow try to call the endpoint that has been created for you. For example:
\ncurl https://9jwh.execute-api.eu-central-1.amazonaws.com/prod/profile\n
\nIt should return:
\n200 OK\nContent-Type: application/json\n\n{\n \"name\": \"Daniël\"\n}\n
\nNow we know the endpoint is working, we'll protect it by adding a custom authorizer
property in the serverless.yaml
manifest:
service: account-api\n\ncustom:\n authorizer:\n arn: LAMBDA_AUTHORIZER_ARN\n resultTtlInSeconds: 0\n identitySource: method.request.header.Authorization\n identityValidationExpression: '^Bearer [-0-9a-zA-z\\.]*$'\n type: token\n\nprovider:\n name: aws\n runtime: nodejs8.10\n stage: ${opt:stage, 'prod'}\n region: ${opt:region, 'eu-central-1'}\n memorySize: 128\n timeout: 3\n profile: danillouz\n\npackage:\n exclude:\n - ./*\n - ./**/*.test.js\n include:\n - node_modules\n - src\n\nfunctions:\n getProfile:\n handler: src/handler.getProfile\n description: Gets the user profile\n events:\n - http:\n path: /profile\n method: get\n authorizer: ${self:custom.authorizer}\n
\nLet's go over the authorizer
properties:
arn
: must be the value of the Lambda Authorizer ARN we released before.resultTtlInSeconds
: used to cache the IAM Policy document returned from the Lambda Authorizer[^6].identitySource
: where APIG should \"look\" for the bearer token.identityValidationExpression
: the expression used to extract the token from the identitySource
.[^6]: Caching is disabled when set to 0
. When caching is enabled and a policy document has been cached, the Lambda Authorizer will not be executed. According to the AWS docs the default value is 300
seconds and the max value is 3600
seconds.
Now the Lambda Authorizer is configured and we also propagate the get:profile
scope from the Lambda Authorizer, we can check if a caller has been granted the required scope. If not, we'll return a 403 Forbidden
response back to the caller:
\"use strict\"\n\nconst REQUIRED_SCOPE = \"get:profile\"\n\nmodule.exports.getProfile = async (event) => {\n try {\n const { authorizer = {} } = event.requestContext\n const { scope = \"\" } = authorizer\n const hasScope = scope.split(\" \").includes(REQUIRED_SCOPE)\n if (!hasScope) {\n const err = new Error(\"Forbidden\")\n err.code = 403\n err.info = 'scope \"get:profile\" is required'\n throw err\n }\n\n const profileData = {\n name: \"Daniël\",\n }\n return {\n statusCode: 200,\n body: JSON.stringify(profileData),\n }\n } catch (err) {\n const statusCode = err.code || 500\n return {\n statusCode,\n body: JSON.stringify({\n message: err.message,\n info: err.info,\n }),\n }\n }\n}\n
\nNote that the authorizer.scope
is a string and that it may contain more than one scope value. When multiple scopes are configured, they will be space separated like this:
\"get:profile update:profile\"\n
\nDo another release by running npm run release
. And after Serverless finishes, go to the AWS Console and visit the \"API Gateway\" service. Navigate to \"prod-account-api\" and click on the \"GET\" resource under \"/profile\". You should now see that the \"Method Request\" tile has a property \"Auth\" set to auth0VerifyBearer
.
This means our GET /profile
endpoint is properly configured with a Lambda Authorizer. And we now require a bearer token to get the profile data. Let's verify this by making the same curl
request as before (without a token):
curl https://9jwh.execute-api.eu-central-1.amazonaws.com/prod/profile\n
\nIt should return:
\n401 Unauthorized\nContent-Type: application/json\n\n{\n \"message\": \"Unauthorized\"\n}\n
\nWe can get a test token from the Auth0 dashboard by navigating to the \"Test\" tab in the API details screen.
\n\nIf you scroll to the bottom, you'll see a curl
command displayed with a ready-to-use test token:
curl --request GET \\\n --url http://path_to_your_api/ \\\n --header 'authorization: Bearer eyJ...lKw'\n
\nPretty cool, right? Use this, but set the URL to your profile endpoint. For example:
\ncurl --request GET \\\n --url https://9jwh.execute-api.eu-central-1.amazonaws.com/prod/profile \\\n --header 'authorization: Bearer eyJ...lKw'\n
\nThis should return the profile data again:
\n200 OK\nContent-Type: application/json\n\n{\n \"name\": \"Daniël\"\n}\n
\nAlso, sending a token without the required scope should return a 403
:
403 Forbidden\nContent-Type: application/json\n\n{\n \"message\": \"Error: Forbidden\",\n \"info\": \"scope \\\"get:profile\\\" is required\"\n}\n
\nAwesome! We successfully secured our API with a token-based authentication strategy. So only authenticated and authorized clients can access it now!
\nOn a final note, when your API needs to return CORS headers, make sure to add a custom APIG Response as well:
\nservice: account-api\n\ncustom:\n authorizer:\n arn: LAMBDA_AUTHORIZER_ARN\n resultTtlInSeconds: 0\n identitySource: method.request.header.Authorization\n identityValidationExpression: '^Bearer [-0-9a-zA-z\\.]*$'\n type: token\n\nprovider:\n name: aws\n runtime: nodejs8.10\n stage: ${opt:stage, 'prod'}\n region: ${opt:region, 'eu-central-1'}\n memorySize: 128\n timeout: 3\n\npackage:\n exclude:\n - ./*\n - ./**/*.test.js\n include:\n - node_modules\n - src\n\nfunctions:\n getProfile:\n handler: src/handler.getProfile\n description: Gets the user profile\n events:\n - http:\n path: /profile\n method: get\n authorizer: ${self:custom.authorizer}\n\nresources:\n Resources:\n GatewayResponseDefault4XX:\n Type: \"AWS::ApiGateway::GatewayResponse\"\n Properties:\n ResponseParameters:\n gatewayresponse.header.Access-Control-Allow-Origin: \"'*'\"\n gatewayresponse.header.Access-Control-Allow-Headers: \"'*'\"\n ResponseType: DEFAULT_4XX\n RestApiId:\n Ref: \"ApiGatewayRestApi\"\n GatewayResponseDefault5XX:\n Type: \"AWS::ApiGateway::GatewayResponse\"\n Properties:\n ResponseParameters:\n gatewayresponse.header.Access-Control-Allow-Origin: \"'*'\"\n gatewayresponse.header.Access-Control-Allow-Headers: \"'*'\"\n ResponseType: DEFAULT_5XX\n RestApiId:\n Ref: \"ApiGatewayRestApi\"\n
\nWhen the Lambda Authorizer throws an error or returns a \"Deny\" policy, APIG will not execute any Lambda handlers. This means that the CORS settings you added to the Lambda handler won't be applied. That's why we must define additional APIG response resources, to make sure we always return the proper CORS headers.
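\nFor completeness, \"CORS settings in the Lambda handler\" here means returning the headers from your own handler code. A minimal sketch of what that looks like (the wildcard origin is just an example):
\"use strict\"\n\nmodule.exports.getProfile = async () => {\n  return {\n    statusCode: 200,\n    // only returned when APIG actually invokes the handler,\n    // i.e. when the Lambda Authorizer allows the request\n    headers: {\n      \"Access-Control-Allow-Origin\": \"*\",\n    },\n    body: JSON.stringify({ name: \"Daniël\" }),\n  }\n}\n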
\nIn this post I showed a way to implement \"serverless auth\" using a machine client. But you could also use something like Auth0 Lock and implement a user-centric auth flow. This would allow users to sign up and log in to (for example) a web app, and get a token from Auth0. The web app can then use the token to send requests (on behalf of a user) to a protected API.
\nYou can find all code at github.com/danillouz/serverless-auth.
\n","date_published":"2019-06-19T00:00:00.000Z","date_modified":"2023-02-03T00:00:00.000Z","tags":["auth","auth0","api-gateway","aws","jwk","jwt","lambda","nodejs","serverless-framework","tutorial"]},{"id":"https://www.danillouz.dev/posts/lambda-nodejs-event-loop/","url":"https://www.danillouz.dev/posts/lambda-nodejs-event-loop/","title":"AWS Lambda and the Node.js event loop","summary":"Lambda can freeze and thaw its execution context, which can impact Node.js event loop behavior.","content_html":"import { Image } from \"astro:assets\"
\nOne of the more surprising things I learned recently while working with AWS Lambda is how it interacts with the Node.js event loop.
\nLambda is powered by a virtualization technology. And to optimize performance it can freeze and thaw the execution context of your code so it can be reused.
\nThis will make your code run faster, but can impact the \"expected\" event loop behavior. We'll explore this in detail, but before we dive in, let's quickly refresh the Node.js concurrency model.
\nIf you're already familiar with the event loop, you can jump straight to the AWS Lambda section.
\nNode.js is single threaded and the event loop is the concurrency model that allows non-blocking I/O operations to be performed[^1].
\n[^1]: The event loop is what allows Node.js to perform non-blocking I/O operations (despite the fact that JavaScript is single-threaded) by offloading operations to the system kernel whenever possible.
\nHow? Well, we'll have to discuss the call stack and the task queue first.
\nFunction calls form a stack of frames, where each frame represents a single function call.
\nEvery time a function is called, it's pushed onto the stack (i.e. added to the stack). And when the function is done executing, it's popped off the stack (i.e. removed from the stack).
\nThe frames in a stack are popped off in <abbr title=\"Last In First Out\">LIFO</abbr> order.
\n\nEach frame stores information about the invoked function, like the arguments the function was called with and any variables defined inside the called function's body.
\nWhen we execute the following code:
\n\"use strict\"\n\nfunction work() {\n console.log(\"do work\")\n}\n\nfunction main() {\n console.log(\"main start\")\n work()\n console.log(\"main end\")\n}\n\nmain()\n
\nWe can visualize the call stack over time like this.
\n\nWhen the script starts executing, the call stack is empty.
\nmain()
is called, and pushed onto the call stack:
\"use strict\"\n\nfunction work() {\n console.log(\"do work\")\n}\n\nfunction main() {\n console.log(\"main start\")\n work()\n console.log(\"main end\")\n}\n\nmain()\n
\nmain
, console.log(\"main start\")
is called, and pushed onto the call stack:\"use strict\"\n\nfunction work() {\n console.log(\"do work\")\n}\n\nfunction main() {\n console.log(\"main start\")\n work()\n console.log(\"main end\")\n}\n\nmain()\n
\n\nconsole.log
executes, prints main start
, and is popped off the call stack.
main
continues executing, calls work()
, and is pushed onto the call stack:
\"use strict\"\n\nfunction work() {\n console.log(\"do work\")\n}\n\nfunction main() {\n console.log(\"main start\")\n work()\n console.log(\"main end\")\n}\n\nmain()\n
\nwork
, console.log(\"do work\")
is called, and pushed onto the call stack:\"use strict\"\n\nfunction work() {\n console.log(\"do work\")\n}\n\nfunction main() {\n console.log(\"main start\")\n work()\n console.log(\"main end\")\n}\n\nmain()\n
\n\nconsole.log
executes, prints do work
, and is popped off the call stack.
work
finishes executing, and is popped off the call stack.
main
continues executing, calls console.log(\"main end\")
and is pushed onto the call stack:
\"use strict\"\n\nfunction work() {\n console.log(\"do work\")\n}\n\nfunction main() {\n console.log(\"main start\")\n work()\n console.log(\"main end\")\n}\n\nmain()\n
\n\nconsole.log
executes, prints main end
, and is popped off the call stack.
main
finishes executing, and is popped off the call stack. The call stack is empty again and the script finishes executing.
This code didn't interact with any asynchronous (internal) APIs. But when it does (like when calling setTimeout(callback)
) it makes use of the task queue.
Any asynchronous work in the runtime is represented as a task in a queue, or in other words, a message queue.
\nEach message can be thought of as a function that will be called in <abbr title=\"First In First Out\">FIFO</abbr> order to handle said work. For example, the callback provided to the setTimeout
or Promise
API.
Additionally, each message is processed completely before any other message is processed. This means that whenever a function runs it can't be interrupted. This behavior is called run-to-completion and makes it easier to reason about our JavaScript programs.
\nMessages get enqueued (i.e. added to the queue) and at some point messages will be dequeued (i.e. removed from the queue).
\nWhen? How? This is handled by the Event Loop.
\nThe event loop can literally be thought of as a loop that runs forever, where every cycle is referred to as a tick.
\nOn every tick the event loop will check if there's any work in the task queue. If there is, it will execute the task (i.e. call a function), but only if the call stack is empty.
\nThe event loop can be described with the following pseudo code[^2]:
\n[^2]: Taken from MDN.
\nwhile (queue.waitForMessage()) {\n queue.processNextMessage()\n}\n
\nTo summarize:
\nsetTimeout
or Promise
) the corresponding callbacks are eventually added to the task queue.
\nWith that covered, we can explore how the AWS Lambda execution environment interacts with the Node.js event loop.
\nAWS Lambda invokes a Lambda function via an exported handler function, e.g. exports.handler
. When Lambda invokes this handler it calls it with 3 arguments:
handler(event, context, callback)\n
\nThe callback
argument may be used to return information to the caller and to signal that the handler function has completed, so Lambda may end it. For that reason you don't have to call it explicitly. Meaning, if you don't call it Lambda will call it for you[^3].
[^3]: When using Node.js version 8.10
or above, you may also return a Promise
instead of using the callback function. In that case you can also make your handler async
, because async
functions return a Promise
.
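\nIn other words, these two handlers are equivalent ways to complete an invocation (the handler names are just for illustration):
\"use strict\"\n\n// callback style: signal completion explicitly\nexports.handlerWithCallback = (event, context, callback) => {\n  callback(null, { statusCode: 200, body: \"ok\" })\n}\n\n// async style (Node.js 8.10 or above): the returned Promise signals completion\nexports.handlerWithPromise = async (event) => {\n  return { statusCode: 200, body: \"ok\" }\n}\n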
From here on we'll use a simple script as a \"baseline\" to reason about the event loop behavior. Create a file called timeout.js
with the following contents:
\"use strict\"\n\nfunction timeout(ms) {\n console.log(\"timeout start\")\n\n return new Promise((resolve) => {\n setTimeout(() => {\n console.log(`timeout cb fired after ${ms} ms`)\n resolve()\n }, ms)\n })\n}\n\nasync function main() {\n console.log(\"main start\")\n timeout(5e3)\n console.log(\"main end\")\n}\n\nmain()\n
\nWhen we execute this script locally (not via Lambda) with node timeout.js
, the following will print:
main start\ntimeout start\nmain end\ntimeout cb fired after 5000 ms\n
\nThe last message takes 5 seconds to print, but the script does not stop executing before it does.
\nNow let's modify the code from timeout.js
so it's compatible with Lambda:
\"use strict\"\n\nfunction timeout(ms) {\n console.log(\"timeout start\")\n\n return new Promise((resolve) => {\n setTimeout(() => {\n console.log(`timeout cb fired after ${ms} ms`)\n resolve()\n }, ms)\n })\n}\n\nasync function main() {\n console.log(\"main start\")\n timeout(5e3)\n console.log(\"main end\")\n}\n\nexports.handler = main\n
\nYou can create a new function in the AWS Lambda console and paste in the code from above. Run it, sit back and enjoy.
\n\nWait, what? Lambda just ended the handler function without printing the last message timeout cb fired after 5000 ms
. Let's run it again.
It now prints timeout cb fired after 5000 ms
first and then the other ones! So what's going on here?
AWS Lambda takes care of provisioning and managing resources needed to run your functions. When a Lambda function is invoked, an execution context is created for you based on the configuration you provide. The execution context is a temporary runtime environment that initializes any external dependencies of your Lambda function.
\nAfter a Lambda function is called, Lambda maintains the execution context for some time in anticipation of another invocation of the Lambda function (for performance benefits). It freezes the execution context after a Lambda function completes and may choose to reuse (thaw) the same execution context when the Lambda function is called again (but it doesn't have to).
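\nOne way to observe this reuse yourself is with a variable in module scope (a small sketch, separate from the timeout example below):
\"use strict\"\n\n// module scope: initialized once per execution context, not on every invocation\nlet invocationCount = 0\n\nexports.handler = async () => {\n  invocationCount += 1\n  // on a reused (thawed) execution context this number keeps increasing\n  return { statusCode: 200, body: JSON.stringify({ invocationCount }) }\n}\n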
\nIn the AWS docs we can find the following regarding this subject:
\n<blockquote>\n<p> Background processes or callbacks initiated by your Lambda function that did not complete when the function ended resume if AWS Lambda chooses to reuse the Execution Context.</p>
\n<cite>\n<p>AWS docs: Lambda execution environment</p>\n</cite>\n</blockquote>
\nAs well as this somewhat hidden message:
\n<blockquote>\n<p>When the callback is called (explicitly or implicitly), AWS Lambda continues the Lambda function invocation until the event loop is empty.</p>
\n<cite>\n<p>AWS docs: Lambda function handler in Node.js</p>\n</cite>\n</blockquote>
\nLooking further, there's some documentation about the context object. Specifically about a property called callbackWaitsForEmptyEventLoop
. This is what it does:
<blockquote>\n<p>The default value is true
. This property is useful only to modify the default behavior of the callback. By default, the callback will wait until the event loop is empty before freezing the process and returning the results to the caller.</p>
<cite>\n<p>AWS docs: Lambda context object in Node.js</p>\n</cite>\n</blockquote>
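\nSo if you explicitly don't want Lambda to wait for the event loop to drain, you can flip this property in your handler. A minimal sketch:
\"use strict\"\n\nexports.handler = (event, context, callback) => {\n  // return to the caller as soon as the callback is called,\n  // instead of waiting for the event loop to be empty\n  context.callbackWaitsForEmptyEventLoop = false\n\n  setTimeout(() => {\n    // this may only run later, if the execution context is reused\n    console.log(\"background work\")\n  }, 5e3)\n\n  callback(null, \"done\")\n}\n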
\nOkay, so with this information we can make sense of what happened when we executed the code in timeout.js
before. Let's break it down and go over it step by step.
timeout.js
. The call stack is empty.main
is called, and pushed onto to the call stack:\"use strict\"\n\nfunction timeout(ms) {\n console.log(\"timeout start\")\n\n return new Promise((resolve) => {\n setTimeout(() => {\n console.log(`timeout cb fired after ${ms} ms`)\n resolve()\n }, ms)\n })\n}\n\nasync function main() {\n console.log(\"main start\")\n timeout(5e3)\n console.log(\"main end\")\n}\n\nexports.handler = main\n
\n\nmain
, console.log(\"main start\")
is called, and pushed onto the call stack:\"use strict\"\n\nfunction timeout(ms) {\n console.log(\"timeout start\")\n\n return new Promise((resolve) => {\n setTimeout(() => {\n console.log(`timeout cb fired after ${ms} ms`)\n resolve()\n }, ms)\n })\n}\n\nasync function main() {\n console.log(\"main start\")\n timeout(5e3)\n console.log(\"main end\")\n}\n\nexports.handler = main\n
\n\nconsole.log
executes, prints main start
, and is popped off the call stack.main
continues executing, calls timeout(5e3)
, and is pushed onto the call stack:\"use strict\"\n\nfunction timeout(ms) {\n console.log(\"timeout start\")\n\n return new Promise((resolve) => {\n setTimeout(() => {\n console.log(`timeout cb fired after ${ms} ms`)\n resolve()\n }, ms)\n })\n}\n\nasync function main() {\n console.log(\"main start\")\n timeout(5e3)\n console.log(\"main end\")\n}\n\nexports.handler = main\n
\n\ntimeout
, console.log(\"timeout start\")
is called, and pushed onto the call stack:\"use strict\"\n\nfunction timeout(ms) {\n console.log(\"timeout start\")\n\n return new Promise((resolve) => {\n setTimeout(() => {\n console.log(`timeout cb fired after ${ms} ms`)\n resolve()\n }, ms)\n })\n}\n\nasync function main() {\n console.log(\"main start\")\n timeout(5e3)\n console.log(\"main end\")\n}\n\nexports.handler = main\n
\n\nconsole.log
executes, prints timeout start
, and is popped off the call stack.timeout
continues executing, calls new Promise(callback)
on line 6, and is pushed onto the call stack:\"use strict\"\n\nfunction timeout(ms) {\n console.log(\"timeout start\")\n\n return new Promise((resolve) => {\n setTimeout(() => {\n console.log(`timeout cb fired after ${ms} ms`)\n resolve()\n }, ms)\n })\n}\n\nasync function main() {\n console.log(\"main start\")\n timeout(5e3)\n console.log(\"main end\")\n}\n\nexports.handler = main\n
\n\nnew Promise(callback)
executes, it interacts with the Promise
API and passes the provided callback to it. The Promise
API sends the callback to the task queue and now must wait until the call stack is empty before it can execute.new Promise
finishes executing, and is popped of the call stack.timeout
finishes executing, and is popped off the call stack.main
continues executing, calls console.log(\"main end\")
, and is pushed onto the call stack:\"use strict\"\n\nfunction timeout(ms) {\n console.log(\"timeout start\")\n\n return new Promise((resolve) => {\n setTimeout(() => {\n console.log(`timeout cb fired after ${ms} ms`)\n resolve()\n }, ms)\n })\n}\n\nasync function main() {\n console.log(\"main start\")\n timeout(5e3)\n console.log(\"main end\")\n}\n\nexports.handler = main\n
\n\nconsole.log
executes, prints main end
, and is popped off the call stack.main
finishes executing, and is popped off the call stack. The call stack is empty.Promise
callback (step 9) can now be scheduled by the event loop, and is pushed onto the call stack.Promise
callback executes, calls setTimeout(callback, timeout)
on line 7, and is pushed onto the call stack:\"use strict\"\n\nfunction timeout(ms) {\n console.log(\"timeout start\")\n\n return new Promise((resolve) => {\n setTimeout(() => {\n console.log(`timeout cb fired after ${ms} ms`)\n resolve()\n }, ms)\n })\n}\n\nasync function main() {\n console.log(\"main start\")\n timeout(5e3)\n console.log(\"main end\")\n}\n\nexports.handler = main\n
\n\nsetTimeout(callback, timeout)
executes, it interacts with the setTimeout
API and passes the corresponding callback and timeout to it.setTimeout(callback, timeout)
finishes executing and is popped of the call stack. At the same time the setTimeout
API starts counting down the timeout, to schedule the callback function in the future.At this point the call stack and task queue are both empty. At the same time a timeout is counting down (5 seconds), but the corresponding timeout callback has not been scheduled yet. As far as Lambda is concerned, the event loop is empty. So it will freeze the process and return results to the caller!
\nThe interesting part here is that Lambda doesn't immediately destroy its execution context. If we wait more than 5 seconds and run the Lambda again (like in the second run), we see the console message printed from the setTimeout
callback first.
This happens because after the Lambda stopped executing, the execution context was still around. And after waiting for +5 seconds, the setTimeout
API sent the corresponding callback to the task queue:
When we execute the Lambda again (second run), the call stack is empty with a message in the task queue, which can immediately be scheduled by the event loop:
\n\nThis results in timeout cb fired after 5000 ms
being printed first, because it executed before any of the code in our Lambda function:
Obviously this is undesired behavior and you should not write your code in the same way we wrote the code in timeout.js
.
Like stated in the AWS docs, we need to make sure to complete processing all callbacks before our handler exits:
\n<blockquote>\n<p>You should make sure any background processes or callbacks (in case of Node.js) in your code are complete before the code exits.</p>
\n<cite>\n<p>AWS docs: Lambda execution environment</p>\n</cite>\n</blockquote>
\nTherefore we'll make the following change to the code in timeout.js
:
- timeout(5e3)\n+ await timeout(5e3)\n
\nThis change makes sure the handler function does not stop executing until the timeout
function finishes:
\"use strict\"\n\nfunction timeout(ms) {\n console.log(\"timeout start\")\n\n return new Promise((resolve) => {\n setTimeout(() => {\n console.log(`timeout cb fired after ${ms} ms`)\n resolve()\n }, ms)\n })\n}\n\nasync function main() {\n console.log(\"main start\")\n await timeout(5e3)\n console.log(\"main end\")\n}\n\nexports.handler = main\n
\nWhen we run our code with this change, all is well now.
\n\nI intentionally left out some details about the the task queue. There are actually two task queues! One for macrotasks (e.g. setTimeout
) and one for microtasks (e.g. Promise
).
According to the spec, one macrotask should get processed per tick. And after it finishes, all microtasks will be processed within the same tick. While these microtasks are processed they can enqueue more microtasks, which will all be executed in the same tick.
\nFor more information see this article from RisingStack where they go more into detail.
\nThis post was originally published on Medium.
\n","date_published":"2019-05-30T00:00:00.000Z","date_modified":"2023-02-03T00:00:00.000Z","tags":["aws","event-loop","lambda","nodejs"]}]}}