Building your own
Learn how to define custom decoders using .refine() and .transform().
When defining a new decoder, it's important to understand the difference between which
values it accepts and which values it returns. In many cases these are the same, which
can make it hard to notice that there even is a difference. For example, the
string decoder accepts strings, and also returns (those same) strings.
This isn't always the case, though. A few examples:
| This decoder... | ...accepts | ...but returns | ...so its type is |
|---|---|---|---|
| string | strings | strings | Decoder<string> |
| email | strings | strings | Decoder<string> |
| number | numbers | numbers | Decoder<number> |
| integer | numbers | numbers | Decoder<number> |
| isoDate | strings | Date instances | Decoder<Date> |
| url | strings | URL instances | Decoder<URL> |
| truthy | anything! | booleans | Decoder<boolean> |
From the type definition, you can always tell what the decoder will return. You cannot tell from the type what input values it will accept. You'll need to read the documentation or look at the implementation to know which values are going to get accepted or rejected.
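To make the accepts/returns distinction concrete, here is a small self-contained sketch. It is a toy model and not the library's implementation: the `ToyDecoder` type and the `toyString`, `toyTruthy`, and `toyDate` names are invented for illustration, and `toyDate` is far more lenient than a real ISO 8601 decoder would be. The point is only that the type parameter describes the output, while the accepted inputs are hidden inside the function body.

```typescript
// Toy model (assumption): a decoder is a function from unknown input to a
// typed output that throws when the input is rejected.
type ToyDecoder<T> = (input: unknown) => T;

// Accepts strings, returns those same strings.
const toyString: ToyDecoder<string> = (input) => {
  if (typeof input !== "string") throw new Error("Must be string");
  return input;
};

// Accepts anything, returns a boolean.
const toyTruthy: ToyDecoder<boolean> = (input) => Boolean(input);

// Accepts strings that `new Date()` can parse, returns Date instances.
const toyDate: ToyDecoder<Date> = (input) => {
  const date = new Date(toyString(input));
  if (Number.isNaN(date.getTime())) throw new Error("Must be a date string");
  return date;
};

console.log(toyString("hi")); // hi
console.log(toyTruthy(0), toyTruthy("hi")); // false true
console.log(toyDate("2022-02-01T08:12:29Z").toISOString()); // 2022-02-01T08:12:29.000Z
```

Nothing in `ToyDecoder<Date>` reveals that only strings are accepted; you have to read the body (or the documentation) to learn that.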
Defining a new decoder
The easiest way to define a new decoder is to define it in terms of an existing one that already accepts (at least) all of the values you want your new decoder to accept, and then narrow it down.
The tl;dr is:
- Start from an existing decoder that already accepts (at least) all inputs that you want to accept
- Optionally, narrow down what will get accepted by adding extra criteria with .refine()
- Optionally, to change what your custom decoder returns, use .transform()
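The three steps above can be sketched with plain functions. This is a toy model under stated assumptions, not the library's code: `ToyDecoder`, `toyNumber`, and the free-standing `refine()` and `transform()` helpers are hypothetical stand-ins for the library's decoders and its `.refine()` and `.transform()` methods, and the percentage example is made up.

```typescript
// Toy model (assumption): a decoder is a function that returns a typed
// value or throws.
type ToyDecoder<T> = (input: unknown) => T;

// Step 1: start from a decoder that accepts at least everything we need.
const toyNumber: ToyDecoder<number> = (input) => {
  if (typeof input !== "number" || !Number.isFinite(input)) {
    throw new Error("Must be number");
  }
  return input;
};

// Step 2: narrow what is accepted. The output type stays the same.
function refine<T>(
  decoder: ToyDecoder<T>,
  predicate: (value: T) => boolean,
  message: string,
): ToyDecoder<T> {
  return (input) => {
    const value = decoder(input);
    if (!predicate(value)) throw new Error(message);
    return value;
  };
}

// Step 3: change what is returned. The accepted inputs stay the same.
function transform<T, V>(decoder: ToyDecoder<T>, fn: (value: T) => V): ToyDecoder<V> {
  return (input) => fn(decoder(input));
}

// Accepts numbers from 0 to 100, returns them as a fraction from 0 to 1.
const percentage = refine(toyNumber, (n) => n >= 0 && n <= 100, "Must be between 0 and 100");
const fraction = transform(percentage, (n) => n / 100);

console.log(fraction(50)); // 0.5
```

Refining never changes the output type, and transforming never changes which inputs pass, which is why the two steps can be reasoned about separately.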
Now, let's build a few custom decoders to illustrate the above.
Example 1: a "max length" string
To build a decoder accepting strings up to a maximum number of characters, use the
built-in sized() helper. With a maximum of 280 characters, it treats inputs as follows:
| Input | Result |
|---|---|
| "Hello world!" | Accepted |
| "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx..." [truncated] | Rejected: Too long, must be at most 280 chars |
| 123 | Rejected: Must be string |
The sized() helper also supports min, or an exact size.
Example 2: a truncating "max length" string
The example above will reject strings that are too long. Suppose you don't want to
reject those strings, but instead just chop them off at the given max length. In that
case, you don't want to change which values it accepts, only which values it returns. You
do that by adding a .transform() function. The results then look like this:
| Input | Result |
|---|---|
| "hello" | Accepted |
| "long strings will trunc" | Accepted |
| 123 | Rejected: Must be string |
Compared to example 1, you can see how a string that is too long now gets accepted (but returned in truncated form).
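As a rough illustration of "same accepted inputs, different return value", here is a self-contained toy version of a truncating string decoder. The `ToyDecoder` type, `toyString`, and `truncated()` are hypothetical names, and the limit of 10 characters is an arbitrary choice for the sketch, not the limit used in the table above.

```typescript
// Toy model (assumption): a decoder is a function that returns a typed
// value or throws.
type ToyDecoder<T> = (input: unknown) => T;

const toyString: ToyDecoder<string> = (input) => {
  if (typeof input !== "string") throw new Error("Must be string");
  return input;
};

// Accepts every string that `toyString` accepts, but returns at most
// `max` characters of it. Nothing is rejected for being too long.
const truncated = (max: number): ToyDecoder<string> =>
  (input) => toyString(input).slice(0, max);

const short = truncated(10); // 10 is an arbitrary limit for this sketch

console.log(short("hello")); // hello
console.log(short("long strings will be cut")); // long strin
```

Non-strings are still rejected by the underlying string check; only the returned value has changed.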
Example 3: Accepting Wordle words
Suppose you want to build your own Wordle clone. At the boundary of your program, you'll want to enforce that incoming words are 5-letter words containing only alphabetical letters.
We want to build this as a decoder that:
- Accepts strings containing exactly 5 alphabetical characters
- Returns those in uppercased form (even when not inputted as such)
To define the acceptance criteria, we'll use a regex() decoder. This will enforce
that only alphabetical chars are used, and that there are exactly 5 of them (it's
important that the regex pattern is anchored using ^ and $ for that). Also, we'll
accept case-insensitive input with the i flag.
Then, we'll transform any accepted words to uppercase automatically.
| Input | Result |
|---|---|
| "SWEET" | Accepted |
| "SPACE" | Accepted |
| "Sp@cE" | Rejected: Must be 5-letter word |
| "hi" | Rejected: Must be 5-letter word |
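The acceptance rule and the uppercasing step can be tried out in isolation with a toy decoder. This is a sketch under assumptions, not the library's API: `ToyDecoder` and `wordle` are invented names, and the regex test plus the `toUpperCase()` call stand in for the library's regex() decoder and .transform() method. The last two lines show why anchoring with ^ and $ matters.

```typescript
// Toy model (assumption): a decoder is a function that returns a typed
// value or throws.
type ToyDecoder<T> = (input: unknown) => T;

// Exactly 5 letters, anchored at both ends, case-insensitive.
const FIVE_LETTERS = /^[a-z]{5}$/i;

const wordle: ToyDecoder<string> = (input) => {
  // Acceptance criteria: a string matching the anchored pattern.
  if (typeof input !== "string" || !FIVE_LETTERS.test(input)) {
    throw new Error("Must be 5-letter word");
  }
  // Return value: always uppercased, whatever the input casing was.
  return input.toUpperCase();
};

console.log(wordle("sweet")); // SWEET
console.log(wordle("SpAcE")); // SPACE

// Without anchors, a longer input passes because a 5-letter run exists inside it.
console.log(/[a-z]{5}/i.test("sweets!")); // true
console.log(FIVE_LETTERS.test("sweets!")); // false
```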
Example 4: Making a transformation reusable
The Wordle example above will uppercase the output before returning it. Suppose that you
want to do the same for other string decoders as well. Do you just stick a
.transform() after each of those?
You can instead define this as a higher-order decoder which works for any string decoder (aka
Decoder<string>). Simply define it as a function to make writing these
easier:
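// Accepts whatever `decoder` accepts, but returns the value lowercased
function lower(decoder: Decoder<string>): Decoder<string> {
  return decoder.transform((s) => s.toLowerCase());
}
// Accepts whatever `decoder` accepts, but returns the value uppercased
function upper(decoder: Decoder<string>): Decoder<string> {
  return decoder.transform((s) => s.toUpperCase());
}
Then you can reuse them anywhere in your decoders.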
| Input | Result |
|---|---|
| { "email": "user@example.org", "labels": [ "EASY" ] } | Accepted |
| { "email": "invalid@email", "labels": [ "ok" ] } | Rejected: Must be email |
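To see how such higher-order helpers compose, here is a self-contained toy version. Everything in it is a stand-in written for this sketch: `ToyDecoder`, `toyString`, `toyArray`, and especially `toyEmail`, whose pattern is deliberately simplistic and is not how the library validates email addresses. The `lower()` and `upper()` helpers mirror the functions above, adapted to the function-based toy model.

```typescript
// Toy model (assumption): a decoder is a function that returns a typed
// value or throws.
type ToyDecoder<T> = (input: unknown) => T;

const toyString: ToyDecoder<string> = (input) => {
  if (typeof input !== "string") throw new Error("Must be string");
  return input;
};

// Simplistic stand-in for an email decoder (assumption, not the real rule).
const toyEmail: ToyDecoder<string> = (input) => {
  const s = toyString(input);
  if (!/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(s)) throw new Error("Must be email");
  return s;
};

// Decodes every item of an array with the given item decoder.
const toyArray = <T>(item: ToyDecoder<T>): ToyDecoder<T[]> => (input) => {
  if (!Array.isArray(input)) throw new Error("Must be an array");
  return input.map((element) => item(element));
};

// Higher-order helpers: work on top of any decoder that returns a string.
const lower = (decoder: ToyDecoder<string>): ToyDecoder<string> =>
  (input) => decoder(input).toLowerCase();
const upper = (decoder: ToyDecoder<string>): ToyDecoder<string> =>
  (input) => decoder(input).toUpperCase();

const normalizedEmail = lower(toyEmail);
const labels = toyArray(upper(toyString));

console.log(normalizedEmail("User@Example.ORG")); // user@example.org
console.log(labels(["easy"])); // [ 'EASY' ]
```

Because `lower()` and `upper()` only depend on the output type, they can wrap the email decoder and the plain string decoder alike.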
Example 5: Sanitizing messy inputs
While I would not recommend going overboard with this, you can perform light parsing to clean up messy inputs. For example, if you have to handle messy data from an incoming webhook that you have no control over, you can use a decoder at the boundary to not only validate those inputs, but also to tidy things up in the same pass.
For example, suppose you have an incoming webhook that looks like this:
{
  "events": [
    { "id": 1, "created_at": "2022-02-01T08:12:29Z", "labels": "urgent, delayed" },
    { "id": 2, "created_at": null, "labels": "" },
    { "id": 3, "labels": null }
  ]
}
Suppose we want to clean up some data on the way in:
- We store all ids as strings internally, so we'll want to transform those numeric IDs
- The created_at field (an ISO8601-formatted string) is sometimes null and sometimes missing completely. When it's missing, we'll want to treat it as if it was null
- The labels field represents a list of tags we'll want to treat as structured data, so we'll want to convert these values to an array of strings
Handling the id field
We'll want to look at the id field as containing an "ID" data type, not a number. We can
define an id decoder for this:
const id: Decoder<string> = either(
  string,
  positiveInteger.transform((n) => String(n)),
);
This decoder will keep working if the vendor ever switches to string IDs in the
future.
Handling the created_at field
We'll want to look at the created_at field as a Date | null value. So let's use the
following decoder:
nullish(isoDate, null);
Wait, why not use nullable(isoDate) here?! The reason is the third event in the example.
Because the field can legally be missing, we'll have to explicitly accept both undefined
and null inputs. That's what the nullish() decoder does! Its second argument is a
convenience default value that nullish values will get normalized to.
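The difference between the two wrappers comes down to how a missing key is read: it yields undefined, not null. The sketch below models that with toy functions. `ToyDecoder`, `toyString`, `toyNullable()`, and `toyNullish()` are hypothetical names that only imitate the behavior described above, and a plain string decoder stands in for the date decoder to keep the example short.

```typescript
// Toy model (assumption): a decoder is a function that returns a typed
// value or throws.
type ToyDecoder<T> = (input: unknown) => T;

const toyString: ToyDecoder<string> = (input) => {
  if (typeof input !== "string") throw new Error("Must be string");
  return input;
};

// Accepts null, in addition to whatever `decoder` accepts.
function toyNullable<T>(decoder: ToyDecoder<T>): ToyDecoder<T | null> {
  return (input) => (input === null ? null : decoder(input));
}

// Accepts null and undefined, and normalizes both to `fallback`.
function toyNullish<T, D>(decoder: ToyDecoder<T>, fallback: D): ToyDecoder<T | D> {
  return (input) =>
    input === null || input === undefined ? fallback : decoder(input);
}

// The third event has no created_at key at all, so reading it gives undefined.
const event: Record<string, unknown> = { id: 3, labels: null };

console.log(toyNullish(toyString, null)(event["created_at"])); // null

try {
  toyNullable(toyString)(event["created_at"]);
} catch (e) {
  console.log(e instanceof Error ? e.message : e); // Must be string
}
```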
Handling the labels field
We'll want to look at the labels field as an array of strings, but we're given a string.
(A potentially null or empty string, even.)
We can build a comma-separated helper decoder like so:
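// Split on commas, trim surrounding whitespace, and drop empty entries
const commaSeparated = string.transform((s) => s.split(',').map((part) => part.trim()).filter(Boolean));
Putting it all together
Using the helper decoders defined above, we can put it all together this way: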
import { array, isoDate, nullable, nullish, object } from 'decoders';
// `id` and `commaSeparated` are the helper decoders defined above
const eventsDecoder = object({
  events: array(
    object({
      id,
      created_at: nullish(isoDate, null),
      labels: nullable(commaSeparated, []),
    }),
  ),
});
eventsDecoder.verify(/* JSON example from above */);
// => {
//   events: [
//     {
//       id: '1',
//       created_at: new Date('2022-02-01T08:12:29Z'),
//       labels: ['urgent', 'delayed'],
//     },
//     { id: '2', created_at: null, labels: [] },
//     { id: '3', created_at: null, labels: [] },
//   ],
// }
A note on naming
To make decoders maximally useful, refrain from naming a decoder after the field it's
used for. Think of a decoder as the description of a data type, and name it
accordingly. That's why the decoder in the example above is
called commaSeparated and not labelsDecoder or something like that! The id decoder
is named id because it makes sense to think of it as an ID data type. That the field
also happens to be named id is a coincidence.
Keep edge cases outside your decoders
Try to keep nulls, undefineds, or other edge cases outside of the decoder as much as
possible, and instead wrap them in nullable()s where you use them in your call
sites - typically in those big object() decoders.
Take the example above. It would be easy to let the
commaSeparated decoder be infected with the null-case and handle it too. But this is
less composable. Keeping the null-case outside of that decoder makes it a smaller, and
thus a more reusable, building block. It's cheap to wrap it in a nullable() where
you put it all together.
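A compact way to see the benefit is to keep the null handling in a generic wrapper and apply it only where it is needed. The sketch below is a toy model with invented names (`ToyDecoder`, `nullOr()`, `requiredTags`); `nullOr()` only imitates the role that nullable() with a default value plays in the example above, and the trimming in `commaSeparated` is an assumption made so that "urgent, delayed" yields clean labels.

```typescript
// Toy model (assumption): a decoder is a function that returns a typed
// value or throws.
type ToyDecoder<T> = (input: unknown) => T;

// Small, null-free building block: it only knows about comma-separated strings.
const commaSeparated: ToyDecoder<string[]> = (input) => {
  if (typeof input !== "string") throw new Error("Must be string");
  return input.split(",").map((part) => part.trim()).filter(Boolean);
};

// The null case lives in a generic wrapper that is applied at the call site.
function nullOr<T>(decoder: ToyDecoder<T>, fallback: T): ToyDecoder<T> {
  return (input) => (input === null ? fallback : decoder(input));
}

// Call site 1: null labels become an empty list.
const labels = nullOr<string[]>(commaSeparated, []);

// Call site 2 (hypothetical): the same building block, but null is an error.
const requiredTags = commaSeparated;

console.log(labels(null)); // []
console.log(labels("urgent, delayed")); // [ 'urgent', 'delayed' ]
```

Had the null case been baked into `commaSeparated` itself, the second call site could not reuse it without silently accepting null.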