While this would seem to create value for the author of the text by saving them time in authorship, what is the impact, and the cost, on the readers of the text? The following questions jump to mind:
What is the cost of the additional time spent reading?
For example: a multi-paragraph email that might have been one or two sentences without GenAI augmentation.
How do we avoid implicitly punishing people who take the time to write meaningful texts "by hand"?
Imagine that it takes thirty minutes to write one detailed, short, and meaningful document, but you can generate a similar but longer and less relevant update using an LLM in just five minutes. Assuming we judge people by "productivity," this penalizes the person who takes the time to do a good job and rewards the one who sends an email filled with junk an LLM barfed up.
How do we ensure important human-written "needles" aren't lost as the "haystack" of text grows?
LLMs will surely increase the volume of text we are expected to read, but the time we can devote to reading is fixed. This is likely to result in more "skimming" and more skipping, i.e. simply not reading things. This increases the risk of readers missing important pieces of information.
How do we ensure we don't use LLMs to create problems we then need LLMs to solve?
"LLMs can summarize email threads and documents for you!" If the reason I need the document summarized is because the author created an overly-long document using an LLM, then GenAI's contribution to that interaction will have been all cost, no value.¹

As long as we keep the bar for quality high, insist that communications be meaningful and relevant, and have a feedback mechanism to let people know if their communications need improvement so they can correct course, then there's absolutely no problem with using LLMs. At the moment, however, it's not clear to me that we apply these standards of quality & "high signal-to-noise ratio" to internal communications–there's generally no consequence to writing excessively long emails or documents, because why would you criticize someone for taking the time to write things out? However, this calculus changes when it no longer requires time or effort to produce reams of text.
In my view, what's needed to keep LLM-spam from flooding our lives is raising the bar for written communication and cracking down on long-winded, irrelevant fluff in emails and other documents. Ultimately, as long as people are producing high quality work it doesn't matter where it came from.
If you have a plan for how we can avoid drowning in LLM slurry, please shoot me a note and I'll include it below!
¹ Personally, I am prejudiced against reading prose generated by an LLM. If it wasn't worth your time to write the email, why the heck should I waste my time reading it? ⤴
How much can running a Kubernetes cluster cost? As a single example, a friend at a mid-sized company I spoke with recently was spending 250,000 US dollars per month running applications in Kubernetes (not including storage & network costs!). A company like this has 250,000 reasons (per month) to pay attention to resource usage and work on reducing it. But how?
How do you write about optimizing Kubernetes clusters without getting into the weeds? The whole thing is just weeds.
Basically, every sentence in this post has one or more caveats. I have chosen to omit the "except in the following cases..." and "as long as the following is true..." statements. To get a more realistic picture, close your eyes and imagine a Kubernetes expert saying, "actually it's a bit more complicated than that" after every sentence.
But fear not! There are some "rules of thumb" that can help you realize significant savings without having to become a Kubernetes expert.
ℹ️ Skip this section if you're already familiar with how requests, limits, and HPAs work.
In a Kubernetes Cluster, each "workload" (for example, a web server application) runs in one or more pods (generally a wrapper around a single docker container). In a Kubernetes "deployment," we tell Kubernetes how many instances (pods) of our application to run, and what sort of resources we expect each of those pods to need (requests). We can also tell Kubernetes not to let a pod exceed a certain resource usage limit (limits) and to shut down any pod that does.
For example, we can tell Kubernetes, "Hey Kubernetes, run 3 Nginx instances; I expect each will use 1 CPU and 512Mi of RAM, but do not let it use more than 1Gi of RAM."
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 3                # ⭐️ Run three instances
  template:
    spec:
      containers:
      - name: my-nginx
        image: nginx:1.14.2
        resources:
          requests:          # ⭐️ request = "This is how much my application typically uses"
            cpu: 1000m       # 1000 "millicpus" = 1 CPU
            memory: 512Mi
          limits:            # ⭐️ limit = "Don't let the pod use more than this"
            memory: 1Gi
Telling Kubernetes an exact number of pods to run isn't very auto-scale-y, though, is it? We want more pods when we need them and fewer when we don't. We can do this by using a Horizontal Pod Autoscaler (HPA), which automatically increases and decreases the number of pods a deployment has.
How does the HPA know when to add or remove pods? Good question! The simplest way is for it to look at the average CPU usage across the pods in the deployment. When average CPU usage gets too high, add pods. When it gets too low, shut some pods down.
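Under the hood, the HPA uses a simple ratio to decide how many pods to run: desired = ceil(current replicas × current utilization ÷ target utilization). This is the formula from the Kubernetes documentation; the numbers below are made up for illustration:

```shell
# HPA scaling formula: desired = ceil(current_replicas * current_utilization / target)
# (illustrative: 10 pods averaging 120% CPU against a 90% target)
current_replicas=10
current_utilization=120
target_utilization=90
desired=$(( (current_replicas * current_utilization + target_utilization - 1) / target_utilization ))
echo "$desired"   # 14 -> the HPA scales from 10 up to 14 pods
```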
"Hey Kubernetes, run an HPA to scale the deployment we just made above. If the average CPU usage is well above 90%, bring more pods online. If it's much lower, shut some down. Also, don't let the total number of pods go below 2 or above 10."
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-nginx
spec:
  # ⭐️ Scale up or down to keep the pods running at ~90% CPU utilization
  targetCPUUtilizationPercentage: 90
  maxReplicas: 10            # ⭐️ No more than 10 pods
  minReplicas: 2             # ⭐️ No fewer than 2 pods
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-nginx
The key to optimizing our resource consumption is to run our pods as "hot" as possible, i.e. use all available CPU & memory. In an ideal world, this would mean allocating exactly as much memory as needed and no more, and consuming approximately 100% of the CPU requested.
So why not tell the HPA to do this ("scale up to infinity, scale down to zero")? We could try:

targetCPUUtilizationPercentage: 100
minReplicas: 1
maxReplicas: 100000
requests.cpu: 1m       # "one millicpu," i.e. 1/1000th of a CPU
requests.memory: 1Mi

We've done it! Automatic scaling from 0 to infinity with 100% efficiency: problem solved!
Unfortunately, It's Not That Simple...
There are a couple reasons:
The machines in a Kubernetes cluster are called "Nodes."
The requests you set for your containers tell Kubernetes how much CPU/RAM your application claims it needs to operate; Kubernetes uses this information to "schedule" your container to a node that can meet those requirements. It finds a node (physical or virtual machine) with the resources available to meet your container's requests and starts the container in that node.
If you only request cpu: 1m (one millicpu) for your application and Kubernetes has a node with 1 CPU (one thousand millicpu), Kubernetes will say, "aha! This node can safely fit one thousand instances of your application!" When your 1000 pods start up on that node and each one actually wants 100m rather than 1m, the node will not be able to meet the pods' demands. The pods will run really slowly or crash repeatedly, and Kubernetes will do nothing about it.
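The arithmetic of that overcommitment, sketched in shell (numbers from the example above):

```shell
# The node advertises 1000m; each pod requests 1m but actually uses ~100m.
node_cpu_m=1000
request_m=1
actual_m=100
pods=$(( node_cpu_m / request_m ))
echo "$pods pods scheduled"   # 1000 pods scheduled
echo "$(( pods * actual_m ))m CPU demanded on a ${node_cpu_m}m node"   # 100x oversubscribed
```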
For this reason, it's important to tell Kubernetes roughly how much CPU & RAM your application needs so it can "set aside" an adequate amount.
Assume each instance of your application can serve 1000 requests per second (rps). Your traffic is steady around 10,000rps, and you have exactly 10 pods handling it, all of them running at 100% CPU (maximum efficiency, baby!!).
All of a sudden traffic increases to 15,000rps. No problem, bring another 5 pods online.
Please wait 2-3 minutes to bring additional pods online
Two to three minutes?! Uh-oh! What are you going to do with that extra load for two to three minutes? Latency goes up, overloaded pods start to crash, and you've got an incident on your hands.
To avoid this problem, your deployments should have enough buffer (extra resources allocated) to handle spikes in traffic for two or three minutes. This means setting your HPA's target CPU utilization far enough below 100% to be able to absorb spikes in traffic long enough for Kubernetes to bring more pods online.
A brief review of what requests and limits mean:
requests = "please make sure this much schedulable RAM/CPU is available on a node before putting this pod on that node" (how much your application typically uses). This is a target (e.g. p50 behavior), not a maximum.
limits = "don't let my pod use beyond this amount."
For a more detailed description, check out this video from Google, especially the first half.
With those basics out of the way, let's look at the settings you can change to improve your efficiency:
CPU (requests.cpu and limits.cpu)
Very High Level/General Goal: looking at a Grafana chart of pod resource usage for a specific deployment such as the example below, you want the "used" line (blue) to sit at or above the "requested" line (green). (NB: the particulars of this chart are specific to one cluster, but they are generally derived from cadvisor.)

Used should be at or above requested?! That doesn't make sense! How do I use more than I requested?
It's complicated, but basically there's usually extra CPU available on the node beyond the total amount of CPU requested. As explained to me by a smart Kubernetes expert:
Remember, a pod is not a virtual machine with a fixed amount of physical CPU and memory; it is a group of "containerized processes" that run on a shared virtual machine with other pods.
Anyway just try to make the blue line sit at or above the green line.
Other Rules of thumb:
Don't set the request above 1 CPU, aka "1000m" (unless you have an application specifically designed to make use of multiple cores); more info under "CPU".
Don't set the request below 1 CPU either: assuming your workload is CPU bound (a single process can scale up 'til it runs out of CPU), it probably doesn't make sense to put this below 1 CPU. Instead, consider increasing the targetCPUUtilizationPercentage on the HPA (see below).
Set the limit to 1.5 times whatever the request is (e.g. requests.cpu: 1; limits.cpu: 1.5).

Memory (requests.memory and limits.memory)
Memory you must be a bit more careful with. If a container doesn't have quite enough CPU, it may run slowly. If it requires more memory than allowed by its limit, it will be killed.
Very High Level/General Goal: looking at the graph of RAM usage like the one below, you want the blue line (used) to sit at the green line (requested). (It's OK for "used" to go over "requested" for short periods of time but not all the time.)

Rules of thumb:
Set requests at or just slightly above what you typically observe a container to use.
Set limits above the request value to give your pod some room to handle periods of time where it needs a bit more memory, while protecting the node from any one process attempting to use all of the available memory on the node. Without a RAM limit, the Kubernetes controller will not prevent a container from using 100% of the RAM on the node, which would cause all other pods on the node to crash.

Minimum Replicas: "How many pods do I need running during overnight hours?"
Set the minimum number of pods (the minReplicas value) high enough to handle any unexpected load.

Maximum Replicas: "Beyond what point of scaling is my deployment obviously malfunctioning and running out of control?"
The maxReplicas setting allows you to throttle your deployment's horizontal scaling in order to control resource consumption. If yours is a business-to-consumer company, you probably don't want to do this with customer-facing services. If you were to run a TV ad that spiked traffic and your website deployment wants to jump from 250 pods up to 650, you probably want to let it do that. Availability (Goal 2) is typically much more important than platform cost (Goal 1) when it comes to serving customers.
For this reason, maxReplicas should be set to the highest number of pods you've observed your deployment needing, plus 50 to 100%. Increasing this number doesn't cost anything directly.
Another consideration is resource constraints of service(s) your service connects to. For example, if your database can only handle a maximum of 1000 connections you probably want to set your max application replicas so that you do not exceed that capacity.
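As a sketch of that connection-budget calculation (all numbers hypothetical):

```shell
# Hypothetical: the database allows 1000 connections, each pod opens a pool
# of 20, and we hold back 100 connections for admin/migration sessions.
db_max_connections=1000
connections_per_pod=20
reserved=100
echo "maxReplicas: $(( (db_max_connections - reserved) / connections_per_pod ))"   # maxReplicas: 45
```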
Target CPU Utilization (targetCPUUtilizationPercentage)
This is one of the most important settings, especially for large clusters. You want this value to be as close to 100% as possible: the "hotter" your pods run, the less idle CPU you're paying for.
On the other hand, if you have _no_ idle CPU, you are unlikely to be able to handle a spike in traffic (Goal 3) while waiting for the deployment to scale up.
The Interplay Between requests.cpu and targetCPUUtilizationPercentage
Earlier we said "look at CPU efficiency and reduce requests.cpu if you are requesting more than you're using." That advice can be misleading when you're using an HPA that scales your cluster up and down based on average CPU utilization.
Example: If it looks like you're only running at 70% CPU efficiency, you may think "requests.cpu is 30% higher than needed, I should turn it down." But if targetCPUUtilizationPercentage is set to 70, your CPU isn't overprovisioned; the deployment is just getting scaled up every time the average CPU across pods in the deployment goes much over 70%, so the average usage always hovers around 70%! If your pods each request 1 CPU, the deployment will scale so they each use approximately 70% of 1 CPU. If your pods each request 600m, the deployment will scale so they each use approximately 70% of 600m.
Turn down requests.cpu all you want; utilization will continue to hover at 70%. In this case, you would need to increase targetCPUUtilizationPercentage, not decrease requests.cpu, in order to increase efficiency.
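A quick sketch of why the equilibrium follows the target rather than the request (total demand and request sizes are illustrative):

```shell
# Illustrative: total demand of 7000m CPU, HPA target of 70%.
# pods = ceil(demand / (request * target/100)); per-pod usage lands near the target either way.
demand_m=7000
target=70
for request_m in 1000 600; do
  per_pod_budget_m=$(( request_m * target / 100 ))
  pods=$(( (demand_m + per_pod_budget_m - 1) / per_pod_budget_m ))
  echo "request=${request_m}m -> ${pods} pods, each using ~${per_pod_budget_m}m"
done
# request=1000m -> 10 pods, each using ~700m
# request=600m -> 17 pods, each using ~420m
```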
That's the $64 question. The formula for this is basically:
100% - (however much buffer you need to handle traffic spikes for the time it takes additional pods to come online and be ready to serve).
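To make the formula concrete, a tiny sketch (the 25% spike figure is an assumption for illustration):

```shell
# If pods take ~3 minutes to become ready and traffic can spike ~25% in that
# window (an assumed figure), you need ~25% headroom:
spike_headroom_pct=25
echo "targetCPUUtilizationPercentage: $(( 100 - spike_headroom_pct ))"   # 75
```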
For more information on how to set this, see two places in this post:
As mentioned above, the interplay between requests.cpu and hpa.targetCpuUtilizationPercentage can be hard to grok, but you should make sure you consider both settings when setting either. As a rule of thumb, increase targetCpuUtilizationPercentage as much as possible before reducing requests.cpu. It's hard to know how much CPU your container can make use of if it's being aggressively throttled by a low targetCpuUtilizationPercentage.
You'll know you've set targetCpuUtilizationPercentage too high if response latency starts to climb when you get traffic spikes (or if your pods start crashing and you have an outage 😄).
There's no need to make big infrastructure changes all at once, and in many ways it's inadvisable! Instead of turning targetCpuUtilizationPercentage from 70 to 95, consider taking a stepwise approach: step it up to 80 and observe for a few days or a week, then try 90, observe a while, rinse and repeat. This is a safe and easy way to get started if you're not sure what settings are best!
Hopefully these "rules of thumb" give you enough information to get started right-sizing your deployments. There is of course more fine-tuning you can do, but following these rules should get you at least 50% of the way there–there's lots of low-hanging fruit!
If you want to read more about this fascinating topic, please see the links below.
1.3gb for a web app?! The size of your Docker image is getting out of control!
Uh-oh... The infrastructure team is calling you out for your Docker image size! Larger images mean...
All of these are small problems but they add up! So your image is too big–don't panic! Following a few simple steps, you can cut your Docker image down to size in next to no time.
*this post assumes you are running Docker images in Kubernetes.
How big is your image? Assuming you've run docker build to build your image locally, this is easy to check with docker images:
➜ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
gcr.io/ns-1/toodle-app d82c28d e4f0fd00de6d 4 months ago 1.32GB
gcr.io/ns-2/go-af v0.12.1 d665db43eb95 4 months ago 911MB
Our toodle-app image is 1.32 GB. But why is it so big? To figure that out, we'll use a handy tool called dive to analyze the image layer by layer.
➜ dive gcr.io/ns-1/toodle-app:d82c28d
Image Source: docker://gcr.io/ns-1/toodle-app:d82c28d
Fetching image... (this can take a while for large images)
When it completes it will show a view like this:

There's a lot going on here!
Use the arrow keys to navigate up and down in the currently selected pane. Use tab to switch from the Layers pane to Current Layer Contents and back. Here I've pressed the down arrow several times to get to the 309 MB RUN make build/bin/server layer, then used tab to switch focus to the Current Layer Contents panel:

By default, the Current Layer Contents shows you a full tree of the filesystem up to and including the selected layer. What's typically more useful when analyzing your image size by layer is to see what files were added by that layer. Use ctrl+u (see "^U Unmodified" in the bottom right of the screenshot) to toggle that option off, which hides files unmodified by the current layer. This leaves visible only files that were Added, Removed, or Modified by this layer:

Hello, what's this? This layer (which runs go build to build the actual toodle-app binary) adds 309MB, but 237MB of that is go mod cache, which we do not need after the binary has been built!
Now we know why this layer is larger than it should be and we can see about cleaning it up (we'll do this below). Repeat the process for other large layers, or just poke around and see what each layer is adding or modifying.
Now that we know how to figure out why it's big, let's look at some strategies to cut down an image's size...
When we build a project inside a docker image, each of the things we pull or copy into that image falls into one of two categories:
Some of the things we add to our toodle-app image, above:
make: needed to build the application
gcc: needed to build the application
nginx: needed to run the application
./build/client/strings: needed to run the application
the build/bin/server binary we create: needed to run the application

The stuff we need only at build time (make, gcc, etc.) does not need to be shipped as part of the image because it is not needed at runtime. We could uninstall make, gcc, etc. after running the build, but there is an even cleaner way: create one image just for building the application and one image just for running the application.
This has become a common pattern, and there are two ways to do this:
With this approach you have one "builder" image and a separate "runtime" image. From a high level:
A Dockerfile.builder defines your "builder" image. This builds an image based on....
Your Dockerfile contains only runtime dependencies.
Your CI step (e.g. on Google Cloud Build) loads the "builder" image and runs docker inside that image to produce your runtime image.
Multi-Stage Builds vastly simplify this process! A multi-stage docker file has multiple FROM commands, the first one for the "builder" and the second one for the "runtime." Basically you install all the build dependencies in your builder, run your build, then in the runtime build you COPY the build artifact into your runtime image which you can then deploy.
# Base image for our "builder" contains the go binary which we
# do NOT need at runtime (only to build the server application binary)
FROM golang:1.7.3 AS sequoiasbuilder
WORKDIR /tmp/foo
COPY src/main.go . # copy from host into builder
# build our go binary
RUN go build -o my-application ./main.go
# The second FROM is a new image!
# (our "runtime" image)
FROM alpine:latest # using a stripped down linux (no go!)
WORKDIR /root/
# This has _nothing_ from the builder unless we copy it in
COPY --from=sequoiasbuilder /tmp/foo/my-application .
CMD ["./my-application"]
Now only those things necessary for runtime will be shipped to kubernetes, and the go binary (and all the go modules that go build pulled in) etc. are discarded! Read this short article for more.
The main reason to use the "multiple dockerfiles" approach is that the underlying "builder" image can be built once and reused across many builds. But Docker caches image layers by default, so why would you need this? You would need it if your (CI) build environment discards Docker image layers after each build, as Google Cloud Platform does by default. Discarding docker images after each build means building from scratch each time.
There is a simple fix for this, however: the Kaniko builder allows layers to be stored, cached, and reused.
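For illustration, a minimal Cloud Build step using the Kaniko executor with layer caching might look like this (the image path and cache TTL are placeholders):

```yaml
steps:
- name: 'gcr.io/kaniko-project/executor:latest'
  args:
  - --destination=gcr.io/$PROJECT_ID/toodle-app:$SHORT_SHA
  - --cache=true       # push layer cache to the registry and reuse it on the next build
  - --cache-ttl=24h
```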
❗️ On GCB, using Kaniko is recommended for both builder and multi-stage patterns. Read more.
Assuming you don't go the Multi-Stage route (above), or even if you did, you may be able to reduce your image size by removing stuff you don't actually need.
Did you start building your dockerfile by copying an existing one? If so, perhaps you have a command like this near the top
RUN apk add --no-cache make git curl bash nginx pkgconfig zeromq-dev \
gcc musl-dev autoconf automake build-base libtool python
Check that you actually need all these things! Some may be cruft from another project, or the dependency may have been replaced. This is especially important if you're building off a shared "base" image file. When using a shared base image, it's very likely that there's stuff in there you don't need. Easy money!
As we saw above using dive, the toodle-app go build was downloading and caching 237 MB of go modules, which were needed during the build but not after:
│ Current Layer Contents ├──────────────────────────────────────
Permission UID:GID Size Filetree
drwxr-xr-x 0:0 72 MB ├── mosmos
drwxr-xr-x 0:0 72 MB │ └── toodle-app
drwxr-xr-x 0:0 72 MB │ └── build
drwxr-xr-x 0:0 72 MB │ └── bin
-rwxr-xr-x 0:0 72 MB │ └── server
drwx------ 0:0 237 MB └── root
drwxr-xr-x 0:0 237 MB └── .cache
drwxr-xr-x 0:0 237 MB └── go-build
The following change fixed this problem in toodle-app:
- RUN make build/bin/server
+ RUN make build/bin/server && go clean -cache
Other examples of this are removing gcc/make/webpack or removing dev-dependencies for a JavaScript project.
You may have static assets in your image that rarely change and are not actually needed within the application. For example, the toodle-app image contains various reports and media assets:
-rw-r--r-- 0:0 12 MB ├── MarketReport.pdf
-rw-r--r-- 0:0 12 MB ├── EconReport.pdf
-rw-r--r-- 0:0 34 MB ├── Toodle-MediaKit.zip
drwxr-xr-x 0:0 4.3 MB ├── press-releases
It's not huge, but this is 62MB that gets pulled by the Kubernetes controller for every deployment and copied into every container (the image upon which this post is based was running on 268 containers at the time of writing), all of which need garbage collection... it adds up!!
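The back-of-the-envelope math on those static assets:

```shell
# 62MB of assets copied into each of 268 running containers:
echo "$(( 62 * 268 )) MB"   # 16616 MB (~16.6 GB) of redundant copies
```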
Making your images smaller is easy, improves infrastructure performance, and saves money. What's not to like? If you've got more tips for shaving bits off your image size, drop me a line & I'll add them below!
jq is a command line tool for parsing and modifying JSON. It is useful for extracting relevant bits of information from tools that output JSON, or REST APIs that return JSON. Mac users can install jq using homebrew (brew install jq); see here for more install options.
In this post we'll examine a couple "real world" examples of using jq, but let's start with...
jq Basics
The most basic use is just tidying & pretty-printing your JSON:
$ USERX='{"name":"duchess","city":"Toronto","orders":[{"id":"x","qty":10},{"id":"y","qty":15}]}'
$ echo $USERX | jq '.'
outputs
{
"name": "duchess",
"city": "Toronto",
"orders": [
{
"id": "x",
"qty": 10
},
{
"id": "y",
"qty": 15
}
]
}
I like this pretty-printing/formatting capability so much, I have an alias that formats JSON I've copied (in my OS "clipboard") & puts it back in my clipboard:
alias jsontidy="pbpaste | jq '.' | pbcopy"
The '.' in the jq '.' command above is the simplest jq "filter." The dot takes the input JSON and outputs it as is. You can read more about filters here, but the bare minimum to know is that .keyname will filter the result to a property matching that key, and [index] will match an array value at that index:
$ echo $USERX | jq '.name'
"duchess"
$ echo $USERX | jq '.orders[0]'
{
"id": "x",
"qty": 10
}
And [] will match each item in an array:
echo $USERX | jq '.orders[].id'
"x"
"y"
Filtering output by value is also handy! Here we use | to output the result of one filter into the input of another filter and select(.qty>10) to select only orders with qty value greater than 10:
echo $USERX | jq '.orders[]|select(.qty>10)'
{
"id": "y",
"qty": 15
}
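Filters compose further. For example, using the same USERX, we can collect the order quantities into an array and sum them with jq's built-in add:

```shell
USERX='{"name":"duchess","city":"Toronto","orders":[{"id":"x","qty":10},{"id":"y","qty":15}]}'
echo "$USERX" | jq '[.orders[].qty] | add'   # 25
```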
One more trick: filtering by key name rather than value:
$ ORDER='{"user_id":123,"user_name":"duchess","order_id":456,"order_status":"sent","vendor_id":789,"vendor_name":"Abe Books"}'
$ echo $ORDER | jq '.'
{
"user_id": 123,
"user_name": "duchess",
"order_id": 456,
"order_status": "sent",
"vendor_id": 789,
"vendor_name": "Abe Books"
}
$ echo $ORDER | jq 'with_entries(select(.key|match("order_")))'
{
"order_id": 456,
"order_status": "sent"
}
(cheat sheet version: with_entries(select(.key|match("KEY FILTER VALUE"))))
Check out more resources below to learn about other stuff jq can do!
I have a prometheus metric showing up locally that doesn't look quite right:
async_task_total{task_name="/Users/duchess/charmoffensive/toodle-app/pkg/web/page/globals.go(189):(*GlobalsPopulator).Populate"} 6
The fact that the task_name value is a filename is a red flag–it's bad to have labels with high cardinality and I'm not sure how many of these there are. I want to find out:
What do the task_name labels look like in production?
At my company there is a CLI tool we'll call pquery that allows prometheus metrics to be queried from the command line, and it outputs JSON. How convenient! I use this tool in the following examples. You don't have this tool, but fear not: this wonderful post explains how to query prometheus using curl, which is essentially what pquery does.
Using pquery we can view prometheus metrics from our various clusters. But even if we filter for this exact metric name, it's more data than we can easily look at. We'll use wc -l (wordcount: count lines) to get a rough idea of how much data we're working with:
$ pquery 'async_task_total' | wc -l
316117
316,117 lines of JSON! Oof! We want to iterate over the metrics. But what jq filter do we need to access the array of metrics? I find head useful for figuring out what the top level keys are for a large json structure:
$ pquery 'async_task_total' | head -n 20
{
"data": {
"result": [
{
"metric": {
"__name__": "async_task_total",
"app": "toodle-app-alpha",
"instance": "10.55.55.55:9393",
"job": "toodle-app-alpha",
"kubernetes_pod_name": "toodle-app-b446b7ccd-6mls6",
"namespace": "noweb",
"netpol": "toodle-app",
"node_name": "gke-production-04-3455c6df-j526",
"release": "toodle-app",
"task_name": "/charmoffensive/toodle-app/pkg/core/user/user.go(67):GetAccountDetails"
},
"value": [
1600981630.344,
"2"
You can also use jq 'keys' if you just want the key names:
$ pquery 'async_task_total' | jq 'keys'
[
"data",
"status"
]
Anyway we can see from above that .data.result is the "filter" path for the metrics themselves. Let's get the first result ([0]) of this array so we can see what one metric looks like:
$ pquery 'async_task_total' | jq '.data.result[0]'
{
"metric": {
"__name__": "async_task_total",
"app": "toodle-app-alpha",
"instance": "10.55.55.55:9393",
"job": "toodle-app-alpha",
"kubernetes_pod_name": "toodle-app-b446b7ccd-6mls6",
"namespace": "noweb",
"netpol": "toodle-app",
"node_name": "gke-production-04-3455c6df-j526",
"release": "toodle-app",
"task_name": "/charmoffensive/toodle-app/pkg/core/user/user.go(67):GetAccountDetails"
},
"value": [
1600981906.069,
"2"
]
}
Oops! That app value (toodle-app-alpha) indicates a mistake: I'm only interested in results from the toodle-app app, not from other apps that may also emit this metric (such as the alpha deployment we see here). We could select for this using jq, but promql already lets us filter by label values, so we'll do that instead: pquery 'async_task_total{app="toodle-app"}'.
We're interested in the task_name value in the metric object, so let's pluck that from each item in the array above:
$ pquery 'async_task_total{app="toodle-app"}' \
| jq '.data.result[].metric.task_name'
"/charmoffensive/toodle-app/pkg/core/guides/guides.go(411):generateGuideFromDefinition"
"/charmoffensive/toodle-app/pkg/core/place/place.go(122):FetchPlaceDetailForCollection"
"/charmoffensive/toodle-app/pkg/core/place/place.go(132):FetchPlaceDetailForCollection"
"/charmoffensive/toodle-app/pkg/core/user/user.go(67):GetAccountDetails"
"/charmoffensive/toodle-app/pkg/core/user/user.go(73):GetAccountDetails"
"/charmoffensive/toodle-app/pkg/web/page/area.go(160):(*areaView).fetchData"
"/charmoffensive/toodle-app/pkg/web/page/area.go(166):(*areaView).fetchData"
"/charmoffensive/toodle-app/pkg/web/page/area.go(172):(*areaView).fetchData"
"/charmoffensive/toodle-app/pkg/web/page/area_category.go(140):(*areaCategoryView).fetchData"
"/charmoffensive/toodle-app/pkg/web/page/area_category.go(146):(*areaCategoryView).fetchData"
{... + 18009 more lines}
📝 Update: It was pointed out to me that as this is a post about jq, not about promql, a jq solution is more appropriate here. I'd originally used promql because it's more efficient to filter on the server when possible. Here's the jq version, which uses the select filter:

$ pquery 'async_task_total' \
  | jq '.data.result[].metric | select(.app == "toodle-app").task_name'

Back to the post...
Eighteen thousand values for that label!? That's bad!! But wait a tic–if other labels are varying, some of these may actually be duplicates. Let's sort them and see:
$ pquery 'async_task_total{app="toodle-app"}' \
| jq '.data.result[].metric.task_name' | sort | head -n10
"/charmoffensive/toodle-app/pkg/core/collection/resolvers/query.go(221):(*queryResolver).Verticals"
"/charmoffensive/toodle-app/pkg/core/collection/resolvers/query.go(221):(*queryResolver).Verticals"
"/charmoffensive/toodle-app/pkg/core/collection/resolvers/query.go(221):(*queryResolver).Verticals"
"/charmoffensive/toodle-app/pkg/core/collection/resolvers/query.go(221):(*queryResolver).Verticals"
"/charmoffensive/toodle-app/pkg/core/collection/resolvers/query.go(221):(*queryResolver).Verticals"
"/charmoffensive/toodle-app/pkg/core/collection/resolvers/query.go(221):(*queryResolver).Verticals"
"/charmoffensive/toodle-app/pkg/core/collection/resolvers/query.go(221):(*queryResolver).Verticals"
"/charmoffensive/toodle-app/pkg/core/guides/guides.go(411):generateGuideFromDefinition"
"/charmoffensive/toodle-app/pkg/core/guides/guides.go(411):generateGuideFromDefinition"
"/charmoffensive/toodle-app/pkg/core/guides/guides.go(411):generateGuideFromDefinition"
Yep: most of these are actually not unique names. uniq to the rescue!
$ pquery 'async_task_total{app="toodle-app"}' \
| jq '.data.result[].metric.task_name' | sort | uniq
"/charmoffensive/toodle-app/pkg/core/collection/resolvers/query.go(221):(*queryResolver).Verticals"
"/charmoffensive/toodle-app/pkg/core/guides/guides.go(411):generateGuideFromDefinition"
"/charmoffensive/toodle-app/pkg/core/place/place.go(122):FetchPlaceDetailForCollection"
"/charmoffensive/toodle-app/pkg/core/place/place.go(132):FetchPlaceDetailForCollection"
"/charmoffensive/toodle-app/pkg/core/user/user.go(67):GetAccountDetails"
"/charmoffensive/toodle-app/pkg/core/user/user.go(73):GetAccountDetails"
"/charmoffensive/toodle-app/pkg/web/page/area.go(160):(*areaView).fetchData"
"/charmoffensive/toodle-app/pkg/web/page/area.go(166):(*areaView).fetchData"
"/charmoffensive/toodle-app/pkg/web/page/area.go(172):(*areaView).fetchData"
"/charmoffensive/toodle-app/pkg/web/page/area_category.go(140):(*areaCategoryView).fetchData"
{... more}
Now I've got a full list of all the distinct values for this label, which answers my first question.
Well that's pretty easy at this point...
$ pquery 'async_task_total{app="toodle-app"}' \
| jq '.data.result[].metric.task_name' | sort | uniq | wc -l
92
Ninety-two! Not so bad. Mystery solved, and I can say with reasonable confidence "the cardinality of these labels isn't terribly high, I'm leaving this alone 😅"
Techniques and features used in this task:
-r to output raw output rather than escaped/quoted:
$ kubectl get deployments toodle-app -o json \
| jq '.status.conditions[]|(.reason + ": " + .message)' -r
NewReplicaSetAvailable: ReplicaSet "toodle-app-545b65cfd4" has successfully progressed.
MinimumReplicasAvailable: Deployment has minimum availability.
with_entries and select to keep only the annotations whose keys match prometheus:
$ kubectl get service toodle-app -o json \
| jq '.metadata.annotations | with_entries(select(.key|match("prometheus")))'
{
"prometheus.io/path": "/varz",
"prometheus.io/port": "9393",
"prometheus.io/scrape": "true"
}
$ cat cronjob.yaml
apiVersion: batch/v1beta1
kind: CronJob
spec:
schedule: "*/1 * * * *" # once per minute
jobTemplate:
spec:
template:
spec:
containers:
- name: deployment-scanner
image: deployment-scanner:38
$ brew install yq
$ yq '.spec.jobTemplate.spec.template.spec.containers[0].image' cronjob.yaml
"deployment-scanner:38"
I used this to build a docker image with the new tag each time I incremented the image value in cronjob.yaml, before applying the configuration (while I was developing a Kubernetes CronJob locally):
docker build -t $(yq '.spec.jobTemplate.spec.template.spec.containers[0].image' cronjob.yaml -r) . && kubectl apply --filename=cronjob.yaml
➜ curl -sL https://postmates.com/feed | pup 'head title'
<title>
postmates: Food Delivery, Groceries, Alcohol - Anything from Anywhere
</title>
➜ curl -sL https://postmates.com/feed | pup 'head meta[charset]'
<meta charset="UTF-8">
➜ curl -sL https://postmates.com/feed | pup 'head meta[charset] json{}'
[
{
"charset": "UTF-8",
"tag": "meta"
}
]
What do you use jq or yq for? Will you be adding pup to your workflow? Sound off in the comments, which is to say "drop me a line!"
jq playground to try stuff out
I needed this tutorial 6 months ago (and 6 months before that, and 6 months before that). :D Highly recommend looking at and maybe including gron in this as a very nice complement to jq. It fills in some use cases in a very straightforward way that are pretty cumbersome in jq, such as finding a field deeply nested in an optional parent.
- heleninboodler,
Thanks helen, I didn't know about that tool & it does look quite useful! I'd probably add it into the "figuring out the structure of the data" step in the workflow described above, to complement head. Thanks for the tip!
👉 Some good discussion & lots of tips & links to similar articles on hackernews.
While the spam comments are bothersome overall, I feel the need to give credit where credit is due: some of the best comments I've received have been spam. There have been times when I'm having a bad day, then I check my email and see something like this:
Superb, what a web site it is! This weblog provides helpful data to us, keep it up.
- similar internet page, 2017
Wow! Thank you similar internet page!! I was feeling a bit discouraged the day I got this, and I can't lie–this comment made me feel a bit better. I know you're only a bot, but I choose to take the encouraging words at face value, and they make me feel appreciated.
So, as a farewell to my loyal spam bots, I will feature and respond to the best comments here.
Thanks a lot for giving everyone such a brilliant chance to discover important secrets from this site. It is often very useful plus packed with a great time for me and my office co-workers to visit your web site more than three times per week to see the latest issues you will have. Not to mention, I am also always amazed for the spectacular secrets served by you. Selected 3 facts on this page are completely the simplest I've had.
- buy cial𝚒s, 2017
Hi buy cial𝚒s! Thank you for the kinds words, and please give my warm regards to your coworkers. I must say I'm surprised to hear you visit three times weekly! I'll have to start posting a lot more frequently to keep the content fresh.
What i do not understood is in reality how you're no longer really much more well-appreciated than you may be now. You're so intelligent. You recognize thus significantly in terms of this topic, produced me in my opinion consider it from so many numerous angles. Its like women and men don't seem to be fascinated until it's something to accomplish with Lady gaga! Your individual stuffs great. Always care for it up!
- https://[removed]/sildenafil-generique-forum/, 2017
Oh my goodness... this comment almost makes me tear up. I do feel underappreciated at times, and it really means a lot to me to have someone notice the care and work I put into the content here. As for why I'm not "much more well-appreciated" despite being "so intelligent" (flattery will get you nowhere!), the fact is it takes much more than intelligence or even good work to succeed. Marketing and branding are hugely important, and for better or worse, those are things I'm not terribly interested in.
It's not unlike the joke about the traveller trying to smuggle coffee into Haifa without paying the required duties. The customs official looks in his sacks and asks him what it is he's carrying. "Birdseed," the traveller replies. "Since when do birds eat coffee?" asks the incredulous customs officer.
Whereupon the traveller replies with a shrug, "they'll eat if they want; if they don't want to, they won't."
It's in a similar spirit that I publish my posts.
Howdy! This post could not be written any better! Looking through this post reminds me of my previous roommate! He constantly kept talking about this. I will send this article to him. Pretty sure he'll have a great read. Thanks for sharing!
- lirik lagu doa suci imam s arifin, 2017
Hi lirik lagu doa suci imam s arifin! I am happy to share, and be sure to say "hi" to your roommate for me!!
you are really a good webmaster. The web site loading speed is amazing. It seems that you are doing any unique trick. Furthermore, The contents are masterwork. you've performed a wonderful task on this topic!
- finance, 2018
Thanks finance! It's a static site served from Github pages so I am not surprised to learn it's fast! Furthermore, I wrote my own CSS (with a couple bits copy/pasted from a framework) and there's almost no JavaScript on the site. It's quite simple really, just keep it light & static. I'm glad the performance does not go unnoticed! As for the contents being a "masterwork," well, I'll have to leave that for the Nobel committee to judge.
At this moment I am going away to do my breakfast, later than having my breakfast coming over again to read other news.
- student loans, 2018
Thank you for the update student loans! I was wondering why you hadn't finished reading, but it makes sense now. Nothing is more important than your health, so (this goes for all my readers) if you find yourself excessively hungry, thirsty, or tired while reading one of my posts, take a break! It will be here when you get back, I promise.
This site was... how do you say it? Relevant!! Finally I've found something which helped me.
- buycialis, 2018
Ha! Thanks buycialis!!
You're so cool! I do not believe I have read through a single thing like this before. So great to discover someone with genuine thoughts on this topic. Seriously. many thanks for starting this up. This site is one thing that's needed on the internet, someone with a bit of originality!
- importance of education, 2019
This type of feedback is what keeps me doing this when it seems "pointless," which it does at times. In particular I appreciate what you said about "someone with a bit of originality." Seriously. Thank you importance of education!
This is really fascinating, You're an excessively professional blogger. I've joined your feed and stay up for looking for more of your excellent post. Also, I have shared your website in my social networks
- student loans, 2019
student loans! You're back!! How was breakfast? You'll be happy to know, incidentally, that I have added an RSS feed (at the request of a reader!), which you can find here.
My brother recommended I might like this blog. He was totally right. This post actually made my day. You can not imagine just how much time I had spent for this info! Thanks!
- Recommended website, 2020
You know what, Recommended website? Your comment made my day. Thank you.
Do you mind if I quote a couple of your posts as long as I provide credit and sources back to your webpage? My blog is in the very same niche as yours and my visitors would truly benefit from a lot of the information you provide here. Please let me know if this okay with you. Thanks!
- best sandwich in North Carolina
Not at all, best sandwich in North Carolina! Provided you include attribution, I've no issue with quoting my work. If you can avoid copying posts from start to finish I would appreciate it, but short of that, go ham!
Speaking of pork, I am thrilled to get comments from the Old North State, my home sweet home! Go Heels!!
With that, I'll wrap up this little tribute to my loyal and enthusiastic spam commenters. Thanks again to everyone, and if you wish to contact me it's still possible! My email address can be found on the contact page.
Happy spamming!
Symbol.prototype.description. "Wow," I thought, "this feature will be really easy to misuse!" In this post, we'll look at a couple of ways you can start misusing this cutting edge JavaScript feature today!
Symbols were introduced in ECMAScript 6 (ES2015) as a way to create truly unique values in JavaScript applications. They have several cool features, but the main point of Symbols is that they are unique. Although multiple Symbols can be created with identical descriptions (e.g. x = Symbol('a'); y = Symbol('a')), the Symbols themselves are different. The description is just a helpful label, almost like a comment: it cannot be directly accessed from the Symbol once it's created.
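A minimal sketch of that uniqueness (the variable names are my own):

```javascript
// Two Symbols created with the same description are still distinct values:
const x = Symbol('a');
const y = Symbol('a');

console.log(x === y);   // false — every call to Symbol() yields a unique value
console.log(String(x)); // "Symbol(a)" — the description appears in toString(),
                        // but (pre-ES2019) there was no dedicated accessor for it
```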
Until ES2019! Now the Symbol's description property can be directly accessed via mySymbol.description. Why is this useful? Who cares!1 This blog post is not about what's useful, it's about misusing JavaScript for pain and heartache! So without further ado,
As mentioned, Symbols are unique.2 This means if one is created by a vendor:
// vendor/x.js
const catalog_id = Symbol('cat_id');
module.exports = catalog_id;
...and then another by me...
// lib/y.js
const cat_id = Symbol('cat_id');
module.exports = cat_id;
they will be unique values:
const catalog_id = require('./vendor/x.js');
const cat_id = require('./lib/y.js');
const item = {};
item[catalog_id] = 123;
// Check if catalog id is set:
// 1. get the object keys that are symbols:
const symbolProps = Object.getOwnPropertySymbols(item);
// 2. see if that array contains catalog id
hasCatalogId = symbolProps.includes(cat_id);
hasCatalogId is false! What gives?? The Symbol I defined in lib/y.js is supposed to reference the same property as that referenced by the Symbol created in vendor/x.js! I created mine to match theirs (they have the same description). There must be a way to see that they are actually "the same"... Symbol.prototype.description to the rescue:
//... require(), const item etc.
// 1. get the DESCRIPTION of object keys that are symbols:
const symbolPropDescriptions = Object.getOwnPropertySymbols(item)
.map(symb => symb.description);
// 2. see if that array contains catalog id
hasCatalogId = symbolPropDescriptions.includes(cat_id.description);
Problem solved: hasCatalogId is now (correctly) true!
In this case, I have Symbols representing the unique roles my users might have (author, admin, etc).
const admin = Symbol('admin');
const author = Symbol('author');
I also have a collection of users with their roles defined:
const users = [
{name: 'vimukt', role: admin},
{name: 'danilo', role: admin}
];
log(users[0].role === admin); // true
log(users[0].role.description); // "admin"
I want to serialize these for some reason:
usersJSON = JSON.stringify(users);
But when I deserialize, my roles are gone:
deserialized = JSON.parse(usersJSON);
log(deserialized[0].role); // undefined
JSON.stringify is refusing to convert my Symbol values to strings! Don't worry, with a little trickery, we can get around this limitation:
function serializeWithRoles(users){
return JSON.stringify(
users.map(user => {
// convert the role Symbols to strings so they serialize
user.role = user.role.description;
return user;
})
)
}
function deserializeWithRoles(userJSON){
return JSON.parse(userJSON)
.map(user => {
// convert role strings back to symbols
user.role = Symbol(user.role);
return user;
});
}
Let's try it:
const usersJSON = serializeWithRoles(users);
const deserialized = deserializeWithRoles(usersJSON);
log(deserialized[0].role); // Symbol(admin)
log(deserialized[0].role.description); // "admin"
Et voilà! Serializing & deserializing with our roles "works", and we have Symbols at the finish, just as we did at the start.
This is bad because it breaks a major feature of Symbols: the fact that they're unique. The proper way to use a Symbol defined elsewhere is to import that Symbol and use it directly. If it's not exported, it probably is not meant to be used externally. If it is meant to be used externally but was not exported, that's a bug.
If you don't care about using the exact same copy of a Symbol object property or Symbol value, or you want to define such values in multiple places and compare them, a string is probably more appropriate. If you want to use the same Symbol but access it from multiple places using the description, use Symbol.for (note the caveats about namespacing this type of Symbol!).
The fact that the built-in JSON.stringify method refuses to convert Symbols to a string (JSON) representation gives us a hint that doing this is probably not a good idea. In fact, it's impossible to convert a Symbol into a string and then back into the same Symbol because a) the Symbol exists uniquely only within the context of a running application and b) while the Symbol description may be a string which can be serialized (as we did above), the description is not the symbol.

Attempting to serialize and deserialize Symbols, which exist only in the context of a running application, cannot work. In our example above, while the admin Symbol is "serialized" by description string then deserialized by passing the string to Symbol(), each of the Symbols created in the deserialization is unique. This means that while users[0].role === users[1].role was true before serializing & deserializing, it is false after. You could use Symbol.for to get around this, but at that point the Symbol is no more reliable or unique than its description, in which case why not just use the description?
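To make that Symbol.for trade-off concrete, here's a small sketch (the namespaced key is my own invention): the global registry gives you equality across lookups, but the string key now carries all the identifying information.

```javascript
// Symbol.for() consults a runtime-wide registry keyed by string,
// so the same key always yields the very same Symbol:
const a = Symbol.for('myapp.role.admin'); // namespaced key, per the usual advice
const b = Symbol.for('myapp.role.admin');
console.log(a === b); // true — both refer to the one registry entry

// ...whereas Symbol() always creates a fresh, unequal value:
console.log(Symbol('myapp.role.admin') === a); // false
```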
When I read of the introduction of Symbol.prototype.description, the antipatterns it would make easier were the first thing that came to mind. I am sure both of the methods I describe above will exist in the wild soon, so when you come across one of them remember: you heard it here first!
1 If you do want to learn more about the uses of Symbols, see this informative article. ⤴
2 With the exception of global symbols find-or-create'ed using Symbol.for, but these will never have the same value as a Symbol created using Symbol(). ⤴
In the course of developing a feature, you might notice a library that needs upgrading, which requires some minor refactors, some repeated code to extract into functions, a small change that could result in a performance improvement, and a dozen other issues. If you attempt to address them all in the moment, two things will happen: your task will take far longer than planned, and your pull request will grow beyond what anyone can comfortably review.
The following strategies can help you stay on track with the task at hand and keep your pull requests manageable while ensuring you don't lose track of the issues you uncovered in the course of development.
Upon starting a task, make a list for “To Do Later” actions. When, in the course of working on Issue-A, you come across something that should be improved but a) will take time and b) is not necessary to complete Issue-A, put this item on the “To Do Later” list.
After completing Issue-A, go over your “To Do Later” items and do them (small items) or create tickets for them (larger items), as appropriate. Noting items for later allows you to stay focused on your current task without losing track of the improvement you’d like to make.
This idea was adapted from the “Parking Lot” concept for meetings.
Upon starting a coding task (“Issue-A”), make a numbered “Improvements” list (1,2,3) for code improvements. When you find a small issue you’d like to address (outside the ticket scope), fix it, and add it to the “Improvements” list as item one. Do this again for the second and third small issues you fix, then stop.
Once you’ve made three small improvements, your “Improvement Budget” has been spent, and no more out-of-scope improvements should be worked on as part of Issue-A. Any additional out-of-scope issues must be put on the “To Do Later” list.
This strategy is a compromise between “focus only on the issue at hand and don’t improve anything” and “fix every issue you find, as you find it, even if this means Issue-A takes weeks to complete rather than days.” The “budget” can of course be adjusted to a number of items besides three.
This idea was inspired by the Most Important Tasks strategy, which also has the concept of a budget of three items.
Do you use these strategies, and if so are they useful to you? Do you have other strategies to balance code improvement and feature delivery? If so please let me know in the comment box below. Happy coding!
And what cases are those, pray tell?
Read on and find out!
In short, it makes sense to use a framework for:
If you're making a website for a business idea, using Rails & Bootstrap can get you up and running with layouts, login/auth, routing, admin screens, callout boxes, modals, etc. in less than a day, provided you're familiar with the tools. This is pretty amazing!
A friend was building a site with Drupal and he wanted to implement search with Solr. There was a Drupal plugin to do it, so he used that plugin. He was a skilled developer and could have written a CMS from scratch and implemented Solr search to boot, but this would have been a waste of time because someone had already done it. He wasn't using a framework out of ignorance or inability to complete the task without one, but out of expediency.
Likewise, writing basic HTTP routing logic and query string parsing is not a herculean task, but if someone has done it already, why reinvent the wheel? A key component to this rationale, however, is that you could do the task by hand if you wanted to. We'll discuss why this is important later...
Sometimes...
This is a great time to use a framework!
If you're a python developer and you want a nicely laid out website that works well on mobile phones and looks professional, you can get there with Bootstrap without spending time learning a lot of HTML, CSS, and other browser technologies. A framework is a very powerful tool in this case, extending your ability to create things far beyond your realm of expertise.
But you should learn the underlying technologies!!
- Strawman Who Hates Frameworks
If writing HTML based user interfaces is a core skill of yours (or you wish it to be), yes, you should learn the underlying technologies. But if it's not a core skill and you don't wish it to be, then learning how flexbox works is a waste of your time. It is best to focus your time on those things you _do_ wish to be an expert in, and use out-of-the-box solutions for the rest.
There are at least four situations where it's not appropriate to use a framework:
This is the counterpoint to the "use a framework if you don't care to learn the underlying technologies" argument. If you _do_ want to learn and develop expertise in the underlying technologies, a framework is not a good way to start, in my opinion.
For example, if you are just getting started with web programming, and want to become a professional JavaScript programmer, do not start out by using Angular or React/Redux/Webpack!! These tools assume a high degree of familiarity with JavaScript. They are built for professionals to speed development and scaling of complex applications. They are not built to help beginners learn JavaScript & HTML.
Starting your learning journey with a big framework has many disadvantages:
Instead of starting with a framework, just start with HTML, JavaScript, and a tab pinned to https://devdocs.io/. Try stuff out! Read the docs! Don't be afraid to write "bad" code–doing so is essential to learning.
When certain tasks become tedious, you'll know it's time to pull in a library. Eventually you'll get to a point where you say "gee, I wish there were an easier way to do X", for example, create HTTP requests. At that point you pull in a library to do that task. Whereas a framework gives you a full toolbox and a set of instructions, writing by hand and pulling in libraries as needed will help you understand why each tool is useful, which is crucial to programming effectively, with or without a framework!
If all you know is React/Webpack, you will struggle to solve problems that React was not designed to solve. Ideally, you should analyze a business problem first, then decide what the best tool is to solve that problem. If all you have is one tool, you are not capable of doing this.
Having only one tool that you know frequently leads to the next two framework-use-antipatterns...
Imagine you have a bunch of IoT thermometers, and they need a server to periodically send data to, which will write that data to a CSV file. This server needs exactly 1 endpoint: record_temperature.
If all you know is Ruby on Rails, you will probably create a new Rails app with a database, a complex & powerful ORM, models, controllers, flexible authentication options, admin routes, json marshalling, HTML templating, and dozens of other features. This is overkill! Furthermore, the tool isn't even built to do what you need it to do (Rails is designed to work on a database, not a single CSV file). If you learned "Ruby" to start, rather than "Ruby on Rails", you would be able to easily build a tiny server, probably with one single file and zero dependencies, and this is guaranteed to be cheaper to run and easier to maintain.
Once, for a coding test, my employer asked engineering job candidates to build a sample application that took text as input (from the command line), did some processing, and output some other text. The candidate could choose whatever language they were most comfortable with. A typical solution might contain two or three source files of Java, Python, or JavaScript (Node.js).
I was reviewing one candidate's submission, and found a half dozen directories, config files for eslint, vscode and webpack, several web-font files, an image optimization pipeline, all of React.js and far more.

The candidate had clearly learned to use create-react-app to start projects, and had learned no other way. That led them to submit a solution one hundredfold more complex than was needed, and that didn't meet the requirements–we didn't ask for a web app! This is an extreme case, but it illustrates the fact that if you only know one tool, you will invariably attempt to use it to solve problems it's not well suited for.
Programming frameworks can be useful tools, but they can only be deployed appropriately if you've learned enough to be able to pick the right tool for the job. To learn this skill, you must first learn to work without frameworks.
Put another way, the best way to ensure you use frameworks properly (as a beginner) is to not use them at all. Does that make sense?
I remember that my first contact with the world of web apps was using Ruby on Rails. It surely felt like magic and it was amazing working with the framework, but years later I started struggling to understand simple HTTP requests and MVC concepts. Therefore, I couldn't have put it better: the better way to start learning how to build professional web apps is to start with just a piece of wood, a hammer and a nail - but not with an IKEA box with a book containing complicated instructions. Thanks Sequoia!
Thank you for the generous feedback, and I'm glad to hear this post reflects your experience well. I like your "Ikea" analogy–I kept struggling for analogy around a gas-powered ditch digger vs. a shovel, but Ikea furniture is much better. Cheers!
Great blog post! If I could contribute one thing to this article it would be that with new frameworks hooking up a debugger and watching the entire stack from the beginning of a request to the end is a tremendous learning experience. This is a great way to get exposed to more complex topics when you are at a more junior level.
"Rabbi, can you believe how stuck up and unfriendly these Programming Élites are?" The User asks. "I am struggling to keep my head above water with all the new frameworks and tools coming out daily, and the documentation for these libraries is terrible to non-existent, but when I post a polite question on Github about a problem I'm having, someone closes my issue, tells me to learn programming basics & says I should read the source and improve the docs myself!
"I'm trying to understand the thing, how am I supposed to write the docs? And what an insulting thing to say, 'go read a book on programming.' Isn't this a terrible way to treat a beginner trying to ask a question & learn?"
The rabbi considers the matter for a few moments, then responds: "You're right."
The User leaves satisfied, but the rabbi is approached by a second person. "Rabbi," says The Maintainer, "I overheard your conversation, and I want to tell you a different story.
"I maintain a very popular library, for free, and every day I get feature demands, people getting angry at me, people expecting to be spoon-fed answers without reading the documentation or the source code, and thinking someone else is going to do the work of fixing bugs, writing documentation, creating new features, and all the other work of maintaining an open source library, when in fact it's their job just as much as mine. Don't you agree these users are horribly entitled?"
The rabbi thinks for a moment then looks at The Maintainer and says "You're right." The Maintainer, satisfied he's won the rabbi to his way of thinking, walks off towards the speaker lounge.
A vendor, having overheard both exchanges, calls the rabbi over to his booth. "Rabbi, you just told The User he was right, but then you turned around and told The Maintainer he was right—they can't both be right!"
The rabbi thinks a minute. "You're right!"
I drafted this post in January of 2016 and didn't get around to posting it for a few years. If you'd like it to be topical for 2018, pretend one party is the developer of a popular open source project who's frustrated about people making money off the software and not contributing, and the other is a SaaS vendor who feels that if the software is free it's free, and the developer needs to live with that choice, or any other two parties who can't both be right.
Conferences are usually catered, but many catering services, well, leave something to be desired. In particular, breakfasts can be rough: pastries, fruit, more pastries... you get the idea. Eating a healthy breakfast that makes you feel good is more than just a luxury when travelling: it can be the difference between having a good morning and a crummy one. Take care of your body at meal-time and it will take care of you later!
When I arrive in a new town for a conference where I will be spending more than one night, immediately after checking into the hotel I head to the grocery store for fruit, bread, and cheese.
Plane travel, being in a new place, eating out three times a day—this is all hard on your body! Not to mention your wallet and brain. Sometimes sitting quietly and eating a simple meal beats another hour of socializing over beers & burgers. That's where the fruit, bread, and cheese come in: when you need food and a quiet break, you have a meal ready in your hotel room.
To summarize:
The first couple conferences I went to, I had my whole day planned out: this talk at 9:00am, that one at 10:00, then 11:00, 12:00, 1:00, 2:00... this is crazy! Remember: there's a limit to how much you can take in. If you do attend eight talks in a day, it's unlikely you'll be able to really focus on what you're hearing at all of them. Rather than power through eight talks in a day, pick four and give each of them your full attention—you'll get more out of it overall.
The FOMO is strong, but no matter how hard you try I guarantee you'll be missing something. So take my advice and stop worrying about it! Rather than stressing about everything you "missed," you'll have a much better time if you set realistic expectations and give yourself breaks. So:
What to do when you're not in talks?
It's tempting to google the topic being presented, check twitter, email, try to run the code examples etc.. You do not need to travel to a conference to do this. As such, it is (in my opinion) a poor use of conference time. If you're watching a talk, tune in and pay attention! If the talk is really so uninteresting that you don't feel the need to pay attention, why stay? There are usually better places to sit and work than an auditorium seat.
Taking notes is a great way to remember what you heard, note things you want to look up later, and to capture follow-up questions. So take out your laptop and fire up Evernote, right? Wrong! When you get out your computer, it's really hard to stick to just taking notes. When you're taking notes on your computer, and you have a question, and the talk is a bit slow, and it will only take just a second to find the answer on google...
Carrying a pen or pencil and a notebook is a great strategy for those of us who are easily distracted by the wide internet. If you've never tried this remove-the-temptation strategy, you may be surprised how much more you get out of a talk when you give it 100% of your attention. Paper notes may help you retain the information better, and they're useful whilst chatting at the after-party: peek at your notebook to easily recall insights and questions. (Bonus: it makes you look super organized! 😄)
I used to feel like I had to "commit to" a session I was watching, or that if I missed the beginning of a session it was pointless to join late. Not true! If you get five minutes into watching a talk and realize it's not what you expected or you just change your mind, quietly slip out & try another talk.
I'll admit that as a speaker, it's not my favorite thing to see people walking out of a talk I'm giving. However, I know that you're at the conference to learn new things, not to flatter conference speakers. You don't owe it to anyone to sit through a talk. Furthermore, if my talk is bad and everyone sits through it just to be polite, I'll never know it's bad. Do speakers the courtesy of giving them honest feedback: if a talk is bad, don't sit through it, be honest and walk out! You'll be doing yourself and the speaker a favor.
This one really comes down to personal preference, but I've found that sticking around for the conference closing party or staying an extra day to visit the host city is rarely worth the extra time in the hotel. I used to extend trips a night or two for this reason, but I am usually so exhausted after a conference it's hard to enjoy being a tourist, and I've found that the value of sleeping in my own bed sooner almost always outweighs the value of another evening of schmoozing.
There is definitely a point of diminishing returns at the end of a conference: the crowd starts to thin out, the vendors pack up, and eventually there's naught left but a lone nerd, tinkering with a new framework at an empty buffet table, or wandering the vendor floor aimlessly with her sponsor shirt and enormous backpack. It's a bit depressing, frankly. 😛 Don't feel the need to stay 'til the bitter end!
These days I try to get a flight out as close to after-closing-ceremonies as possible, or a bit before if the other option is staying an extra night.
You've probably heard this old chestnut, but it bears repeating: there's a lot more value in conferences than what you get from attending talks. Networking, trading tips, finding job leads, and making new friends: these are all things you'll find on "the hallway track," i.e. by hanging out in the hallway, chatting with your peers. Conferences are the best venue for networking (read: "finding work") I've found, but you won't access this value if you're in talks all day. To get the most out of your experience, be sure to make time for the hallway track!
I hope at least one of these tips has been useful to you! If you have feedback or can think of one of the many points I missed here, please do send a comment and I'll add it below. Happy conferencing!
When we split authentication off from a "monolith" application, we have two challenges to contend with: sharing the session data itself between servers, and sharing the session cookie across hosts.
For the purposes of demonstrating session sharing, we'll be creating two simple servers: writer, our "auth" server that sets and modifies sessions, and reader, our "application" server that checks login and reads sessions. Code for this demo can be found here: https://github.com/Sequoia/sharing-cookies.
NB: You may be thinking "let's use JWTs! They are stateless and circumvent the cookie-sharing issue completely." Using JWTs to reimplement sessions is a bad idea for various reasons, so we won't be doing it here.
In order to share sessions across servers, we'll use an external redis server to store session info. I'm using a free redis instance from https://redislabs.com/ for this demo.
Here we set up an express server with redis-based session tracking and run our server on port 8090.
// writer/index.js
const express = require('express');
const session = require('express-session');
const RedisStore = require('connect-redis')(session);
const app = express();
const redisOptions = {
url : process.env.REDIS_SESSION_URL
}
const sessionOptions = {
store: new RedisStore(redisOptions),
secret: process.env.SESSION_SECRET,
logErrors: true,
unset: 'destroy'
}
app.use(session(sessionOptions));
app.listen(8090, function(){
console.log('WRITE server listening');
});
Our application relies on REDIS_SESSION_URL and SESSION_SECRET being available as environment variables. These are externalized both for security and to allow us to share these values across different application instances.
For our demo, our express-based auth server will have three paths:
/login: set a user session.
app.get('/login', function(req, res){
// .. insert auth logic here .. //
if(!req.session.user){
req.session.user = {
id : Math.random()
};
}
res.json({
message : 'you are now logged in',
user : req.session.user
});
});
/increment: increment a counter on the session (update session data).
app.get('/increment', function incrementCounter(req, res){
if(req.session.count){
req.session.count++;
}else{
req.session.count = 1;
}
res.json({
message : 'Incremented Count',
count: req.session.count
});
});
/logout: destroy a session.
app.get('/logout', function destroySession(req, res){
if(req.session){
req.session.destroy(function done(){
res.json({
message: 'logged out : count reset'
});
});
}
});
Our server is set up to run via npm start in our package.json file:
...
"scripts" : {
"start" : "node index.js"
}
...
We start the server by running npm start with the appropriate environment variables set. There are many ways to set environment variables; here we will simply pass them at startup time:
$ REDIS_SESSION_URL=redis://hostname:port?password=s3cr3t SESSION_SECRET='abc123' npm start
Now, assuming redis connected properly, we can start testing our URLs.
GET localhost:8090/login:
{
"message": "you are now logged in",
"user": {
"id": 0.36535326065695717
}
}
GET localhost:8090/increment
{
"message": "Incremented Count",
"count": 1
}
It works! To verify that the session is independent of the server instance, you can try shutting down the server, restarting it, and checking that your user.id and count remain intact.
We can see our sessions in redis by connecting with the redis-cli:
$ redis-cli -h <host> -p <port> -a <password>
host:43798> keys *
1) "sess:q5t7q67lzOsCJDca-kvT63Yk6n6kVvpL"
host:43798> get "sess:q5t7q67lzOsCJDca-kvT63Yk6n6kVvpL"
"{\"cookie\":{\"originalMaxAge\":null,\"expires\":null,\"httpOnly\":true,\"path\":\"/\"},\"user\":{\"id\":0.36535326065695717},\"count\":1}"
The application (reader) server has one single path:
/: read current count.
The server setup code is the same as above, with the exception that our server is run on 8080 rather than 8090 so we can run both locally at the same time.
In order to ensure users who hit our "application" server have logged in, we'll add a middleware that checks that the session is set and it has a user key:
// reader/index.js
app.use(function checkSession(req, res, next){
if(!req.session.user){
//alternately: res.redirect('/login')
return res.status(403).json({
'message' : 'Please go "log in!" (set up your session)',
'login': '/login'
});
}else{
next();
}
});
Then we'll add our single route:
// reader/index.js
app.get('/', function displayCount(req, res){
res.json({
user : req.session.user,
count: req.session.count
})
});
Start this server as we started the other:
$ REDIS_SESSION_URL=redis://hostname:port?password=s3cr3t SESSION_SECRET='abc123' npm start
Now we can check that it works:
GET localhost:8080
{
"user": {
"id": 0.36535326065695717
},
"count": 1
}
Try it from a private tab or different browser, where we haven't yet logged in:
GET localhost:8080
{
"message": "Please go \"log in!\" (set up your session)",
"login":"/login"
}
It works!
In fact, browsers do not take port number into consideration when determining what the host is and what cookies belong to that host! This means that we can run our auth server locally on :8090 and the app server on :8080 and they can share cookies, as long as we use the hostname localhost for both!
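A quick way to convince yourself of this rule (a plain-Node sketch, not part of the demo code): cookie scope is keyed on hostname, so two origins that differ only by port count as the same host for cookie purposes:

```javascript
// Cookies are scoped by host (and path), never by port, so two local
// servers on different ports share the same cookie jar.
function sameCookieHost(a, b) {
  return new URL(a).hostname === new URL(b).hostname;
}

sameCookieHost('http://localhost:8090', 'http://localhost:8080'); // true
sameCookieHost('http://localhost:8090', 'http://127.0.0.1:8090'); // false
```

Note that `localhost` and `127.0.0.1` count as different hosts even though they resolve to the same machine, so be consistent about which one you use while testing.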
This works fine locally; now let's see it in The Cloud. We'll be using https://zeit.co/now for hosting. now is a microservice-oriented hosting platform that allows us to easily deploy Node.js applications and compose application instances to work together, so it's a great choice for this demo!
now expects Node.js applications to start with npm start. Luckily, we've already configured our application to do that, so all that's left to do is deploy it!
$ cd writer
$ now # missing environment variables...
> Deploying ~/projects/demos/sharing-cookies/writer under sequoia
> Using Node.js 7.10.0 (default)
> Ready! https://writer-xyz.now.sh (copied to clipboard) [1s]
> Synced 2 files (1.19kB) [2s]
> Initializing…
> Building
...
This will deploy our application to now, but it won't actually work, because the application will not have the environment variables it needs. We can fix this by putting the environment variables in a file called .env (that we do not check in to git!!!) and passing that file as a parameter to now. It will read the file and load those variables into the environment of our deployment.
# .env
REDIS_SESSION_URL="your redis url here"
SESSION_SECRET="abc123"
$ echo '.env' >> ../.gitignore # important!!
$ now --dotenv=../.env
> Deploying ~/projects/demos/sharing-cookies/writer under sequoia
> Using Node.js 7.10.0 (default)
> Ready! https://writer-gkdldldejq.now.sh (copied to clipboard) [1s]
> Synced 2 files (1.19kB) [2s]
> Initializing…
> Building
Once the command finishes, we can load that URL in our browser:
GET https://writer-gkdldldejq.now.sh/login
{
"message": "you are now logged in",
"user": {
"id": 0.31483764592524177
}
}
We repeat the above steps in our /reader directory, passing the same .env file to now --dotenv...
$ cd ../reader
$ now --dotenv=../.env
> Deploying ~/projects/demos/sharing-cookies/reader under sequoia
> Using Node.js 7.10.0 (default)
> Ready! reader-irdrsmayqv.now.sh (copied to clipboard) [1s]
> Synced 2 files (1.19kB) [2s]
> Initializing…
> Building
...
Once it's done we check via our browser...
GET https://reader-irdrsmayqv.now.sh
{
"message": "Please go \"log in!\" (set up your session)",
"login": "/login"
}
We're not logged in! What happened?
We noted above that in order to share sessions, we needed to share two things: the session store and the session cookie.
Because our servers run on different domains now, we're not sharing cookies. We'll fix that with a simple reverse-proxy setup that now calls "aliases."
We want both of our applications running on the same domain so they can share cookies (among other benefits, including avoiding extra DNS lookups and obviating the need for CORS headers). now allows aliasing to any arbitrary subdomain under now.sh, and I've chosen counter-demo.now.sh for this post.
We want routing to work as follows:
/: application server (https://reader-irdrsmayqv.now.sh/)
/login, /increment, /logout: "auth" server (https://writer-gkdldldejq.now.sh/)
To configure multiple forwarding rules for one "alias" (domain), we'll first define them in a json file:
{
"rules" : [
{ "pathname" : "/login", "dest" : "writer-gkdldldejq.now.sh" },
{ "pathname" : "/increment", "dest" : "writer-gkdldldejq.now.sh" },
{ "pathname" : "/logout", "dest" : "writer-gkdldldejq.now.sh" },
{ "dest" : "reader-irdrsmayqv.now.sh" }
]
}
We pass these to now alias using the --rules switch, along with our desired subdomain:
$ now alias counter-demo.now.sh --rules=./now-aliases.json
> Success! 3 rules configured for counter-demo.now.sh [1s]
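Conceptually (a sketch of my understanding, not now's actual implementation), the alias applies these rules first-match-wins, with the pathname-less rule acting as the catch-all:

```javascript
// First rule whose pathname matches wins; a rule with no pathname matches anything.
function route(rules, pathname) {
  const match = rules.find(r => !r.pathname || r.pathname === pathname);
  return match ? match.dest : null;
}

const rules = [
  { pathname: '/login', dest: 'writer-gkdldldejq.now.sh' },
  { pathname: '/increment', dest: 'writer-gkdldldejq.now.sh' },
  { pathname: '/logout', dest: 'writer-gkdldldejq.now.sh' },
  { dest: 'reader-irdrsmayqv.now.sh' }
];

route(rules, '/login'); // 'writer-gkdldldejq.now.sh'
route(rules, '/');      // 'reader-irdrsmayqv.now.sh'
```

Because the catch-all comes last, any path the writer doesn't own falls through to the reader.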
Now to try it out:

It works! Two servers running two separate applications, each sharing sessions and cookies.
This is a rudimentary reverse-proxy setup, but with this in place we can scale the application server (e.g. now scale reader-irdrsmayqv.now.sh 2) without breaking our session management system.
Now go try it out! https://github.com/Sequoia/sharing-cookies
Really nice article, but I think the last part (aliases) should be longer and more in-depth. The current implementation is dependent on now.sh's particular feature and the actual mechanism isn't detailed. It would be great if you provided more implementations for the aliasing with different servers (like apache or nginx), so we could build a production environment without using now.sh. What do you think?
- Semmu
Thanks, Semmu! It's true, the approach described here is dependent on now.sh's aliasing feature, and yes, there are certainly other ways to do it! I featured now.sh here in part because it is very simple to use and explain. An explanation of how to tie this together with nginx (I'd pick it over Apache for this use-case) would be useful! I don't have such an explanation on hand but I'll try to write a blog post in the future describing reverse proxying with nginx. Thanks for the comment!
Hi Sequoia, Great article! A downside I see from sharing the same session storage is the coupling between the services. In your example, if someone decides to use a different web framework (like Rails) or even a different version of express js, the session format created by the services might not be compatible anymore. In other words, we would be giving up on the tech-agnostic benefit that microservices are supposed to provide. I see two possible solutions to this problem:
- Make the session format standard across all the microservices and implement the standard in libraries for each language (instead of using express-session)
- Have every microservice use a sidecar container that writes and reads sessions from the shared session storage. This sidecar container will return the session in a standard JSON format.
Please, let me know if you see other solutions or if I have any faulty assumption on my analysis. Thanks,
Arturo
- Arturo
Thank you Arturo for the thoughtful feedback! I would agree that any time that you share any data between systems, each system will need to be designed to accommodate that format of data, and sessions is no exception. For this example of sharing session data, I think creating a "standard" format for session data would be overkill, as the concept is the same regardless of the format of the session data or the specific tools used in each service.
Even if this were production, I would use out-of-the-box Express session to start and keep the system as simple as possible. I would consider making a cross-framework session format only at the point where that became an actual requirement, and not a minute sooner! At that point, I'd tweak the easiest-to-tweak system to fit the format of the other one. Only once there were three or four different systems that all needed to share sessions would I consider a system as complex as a sidecar container (which, incidentally, would force you off the Node deploys on Now.sh and onto Docker deploys).
Thank you again for your well-considered feedback, and don't forget to Keep It Simple!
In this post I'll outline my proposals for addressing these issues.
minimum-proposal-stage
This idea is lifted from Composer, which has a minimum-stability property for projects. My idea is as follows:
npm install warns if you are installing a library with features newer than desired.
For example, if you only want "finished" (Stage 4) or higher features in your project, you add the following to your package.json:
"minimum-proposal-stage" : 4
Aurelia would indicate that it incorporates a Stage 2 proposed/experimental feature (decorators) by adding the following to its package.json files:
"lowest-proposal-stage" : 2
Upon attempting to install Aurelia, npm would warn you that the library's lowest-proposal-stage is lower than your minimum-proposal-stage. Basically: "hey! You're about to install a library with language features more experimental than you might be comfortable with!"
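As a sketch of the check npm might perform (entirely hypothetical; the property names are the ones proposed in this post, and the comparison logic is my assumption):

```javascript
// Hypothetical install-time check: warn when a dependency uses proposals
// at an earlier (more experimental) stage than the consumer allows.
function shouldWarn(consumerPkg, libraryPkg) {
  const min = consumerPkg['minimum-proposal-stage'];
  const lowest = libraryPkg['lowest-proposal-stage'];
  if (min === undefined || lowest === undefined) return false; // nothing declared, nothing to check
  return lowest < min;
}

shouldWarn({ 'minimum-proposal-stage': 4 }, { 'lowest-proposal-stage': 2 }); // true: warn
shouldWarn({ 'minimum-proposal-stage': 4 }, { 'lowest-proposal-stage': 4 }); // false: ok
```

The check is deliberately opt-in on both sides: if either package omits the property, installation proceeds silently, just as today.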
Libraries would update their lowest-proposal-stage property as features are adopted into the language.
maximum-ecmascript-version
This is like the above, but pegged to ECMAScript versions.
Example: in my project, I don't want code newer than ES7 (the current released standard at the time of this writing), i.e. I don't want unreleased features:
"maximum-ecmascript-version" : 7
In the library's package file, they indicate that the library incorporates features that do not exist in any current version of ECMAScript:
"ecmascript-version" : "experimental"
npm would warn me before installing this package. This one, on the other hand, would install without complaint:
"ecmascript-version" : 5
because the ecmascript-version is lower than my maximum.
npm would not warn about a library's ecmascript-version if it's set to a released version.
The two systems could also be used in conjunction with one another; I won't go into that possibility here.
Add badges to README.md files to indicate whether experimental features are used in the library. Here are some sample badges that use the proposal stage names rather than numbers:
(Please excuse the largeness of these badges)
Alternately, the language version could be used:
Change is good, but stability is also good. Everyone should be able to easily choose to use or not use the latest and greatest JavaScript features and proposed features. Increasing visibility into experimental feature dependencies will...
Please let me know what you think with a comment (below) or on hackernews.
Quick quiz: which of the following are "just JavaScript"?
1. let n = { x, y, ...z };
2. Promise.resolve(123).then(::console.log);
3. Promise.resolve(2).finally(() => {})
4. @observable title = "";
5. '[1...10]'.forEach(num => console.log(num))
If you said "none of these are 'just JavaScript'," you were right! The first four are proposed features. Number five is a feature from another language, but you can use it with babel!
In order for new features to land in the ECMAScript specification, they must go through several proposal stages, as described here. The difference between JS and most other ecosystems is that in most ecosystems, language features must exist in the specification before they are incorporated into userland code. Not so JavaScript! With babel, you can start using Stage 3 ("Candidate"), Stage 1 ("Proposal"), or even Stage 0 ("Strawman") features in production right away, before they are finalized.
What does it mean for a feature proposal to be at Stage 2 ("Draft")? According to the TC39, it means the feature implementations are "experimental," and "incremental" changes to the feature can be expected. Basically, this means the behavior of the proposed feature may change before the feature is finalized.
This is great for those who want to live on the edge, but what about those of us who must prioritize stability over bleeding-edgeness? Can we just stick to finalized features and avoid experimental ones? It is possible, but it's not as simple as you might expect...
The fuzzy boundary between what "is JavaScript" and what are "JavaScript feature proposals" creates a lot of ambiguity and confusion. It's common to mistakenly refer to any "new JavaScript feature" as ES6, ES7, ES.Next or ES2016, more or less interchangeably. Unfortunately, authors of many popular JavaScript libraries do just this, exacerbating the misunderstanding. I'll pick on two lovely library authors here because they are very cool people & I'm sure they know I don't mean it as a personal criticism. ^_^
I recently found myself looking into new JavaScript libraries and I encountered some syntax I was not familiar with in JavaScript:
class Todo {
id = Math.random();
@observable title = "";
@observable finished = false;
}
@observable? Huh! That looks like an annotation from Java. I didn't know those existed in the current language specification. It took looking it up to find out that it is in fact not JavaScript as currently specified, but a proposed feature. (In fairness, Mobx does explain that this feature is "ES.Next", but that term is vaguely defined and often used to refer to ES6 or ES7 features as well.)
From the website (emphasis added):
What is it?
Well, it's actually simple. Aurelia is just JavaScript. However, it's not yesterday's JavaScript, but the JavaScript of tomorrow. By using modern tooling we've been able to write Aurelia from the ground up in ECMAScript 2016. This means we have native modules, classes, decorators and more at our disposal...and you have them too.
Well, now we know: decorators were added to JavaScript in the ES2016 language specification. Just one problem... no they weren't!!! Decorators are still a Stage 2 feature proposal. Aurelia is not "just JavaScript," it's "JavaScript [plus some collection of experimental language features]"
This matters because, as anyone involved in the JavaScript ecosystem these days knows, "it's hard to keep up with all the latest developments" is probably the #1 complaint about the ecosystem. This causes users to throw their hands up, exasperated, and it causes enterprise organizations to avoid JavaScript altogether. Why invest in a platform where it's difficult to even ascertain what the boundaries of the language are?
Also, as mentioned above, these features are not officially stable. This means that if you write code depending on the current (proposed) version of the feature, that code may stop working when the feature is finalized. While you may consider this an acceptable risk, I assure you there are many users and organizations that do not. Currently, making an informed decision to opt-in to using these experimental features is difficult and requires a high level of expertise—users must be able to identify each new feature & manually check where it is in the proposal or release phase. This is especially challenging for organizations for whom JavaScript is not a core-competency.
Finally, (this is my own opinion) it's just plain annoying to constantly encounter unfamiliar language syntax and be left wondering "Is this JavaScript? Is this Typescript? Is this JSX? Is this..." I don't want to have to google "javascript ::" to figure out what the heck that new syntax is and whether it's current JavaScript, a feature proposal, a super-lang, or Just Some Random Thing Someone Wrote a Babel Transform For.
This probably does not matter if the exposed interfaces do not use or require the use of experimental language features. A library could be written in JavaScript, CoffeeScript or TypeScript as long as the dist/ is plain JavaScript. Annotations are an example of an experimental feature that some libraries encourage or require the use of in user code. Further, some libraries do not distribute a build artifact, instead directing users to npm install the source and build locally. In these cases, there is the potential for breakage if draft specifications of experimental features change, and warning users of this is warranted (IMO).
No! By all means, use them! All I'm saying is that it would be useful to be able to make an informed choice to opt-in to using experimental features. That way, organizations that prefer stability can say "no thank you" and users who want to be on the bleeding edge can keep bleeding just as they're doing today.
Composer has a mechanism that lets users allow or disallow unstable versions of dependencies. It does not prevent people from using unstable releases; it merely gives them the choice to opt in or out.
An added benefit of increasing visibility into experimental feature use would be to help users understand the TC39 process. Currently there is not enough understanding of what it means for something to be ES6, or ES7, or Proposal Stage 2, as evidenced by the way these terms are thrown around willy-nilly.
In my next post I'll go over my proposals for addressing this issue.
Thank you for this post. You are doing the Lord's work
Thanks Uzo! I really like your post on the subject as well, especially this point: "On a language level, overlap is a problem. If there’s more than 3 ways of doing a thing, I must know all three and everyone I’m working with must know all three."
If our goal is to map a sequence of events over time (file creation or modification) to one or more operations (building Markdown to HTML and writing to disk), it's very likely Observables are a good fit! In this blog post, we'll look at how to use Observables and RX.js to create an SSG with built-in, incremental watch rebuilds and with multiple output streams (individual posts and blog index page).
This post loosely follows the demo project here, so if you prefer to look at all the code at once (or run it) you can do so.
Observables can be confusing so reading a more detailed intro is advisable. Here I'll give a simplified explanation of Observables that is inadequate to understand them fully, but will hopefully be enough for this blog post!
As alluded to, Observables are a tool for modeling and working with events over time. One way to conceptualize Observables if you are familiar with Promises is to think of an Observable as a Promise that can emit multiple values. A Promise has the following "things it can do:"
resolve with a single value
reject with an error
The things an Observable can do are:
emit a value (and do so repeatedly)
complete
emit an error (like reject)
The key difference is that emitting a value and "settling" (called "completing" in RX.js Observables) are split into separate actions in Observables, and because emitting a value does not "complete" an Observable, it can be done over and over.
To illustrate this comparison further, with code, let us imagine a new utility method for constructing Promises called Promise.create. It behaves the same as the Promise constructor, but the signature of its function argument is slightly different.
// Promise constructor
const p1 = new Promise(function settle(resolve, reject){
if(foo){ resolve(value); }
else{ reject('Error!'); }
});
p1.then(console.log);
// Promise.create (imaginary API)
const p2 = Promise.create(function settle(settler){
if(foo){ settler.resolve(value); }
else{ settler.reject('Error!'); }
});
p2.then(console.log);
As you can see, Promise.create takes a settle function which receives an object with resolve and reject methods, rather than separate resolve and reject functions. From here it is a short step to Rx.Observable.create:
const o = Rx.Observable.create(function subscribe(subscriber) {
  try{
    let foo; // declared outside the loop: `let` isn't allowed in a while condition
    while((foo = getNextFoo())){
      subscriber.next(foo); // emit next value
    }
    subscriber.complete(); // emit "complete"
  }
  catch(e){
    subscriber.error(e); // emit "error"
  }
});
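For the curious, the imaginary Promise.create from earlier can be shimmed on top of the real Promise constructor in a couple of lines (a sketch for illustration only; this is not a real or proposed API):

```javascript
// Shim for the imaginary Promise.create: wrap the real constructor and
// hand the settle function a single object with resolve/reject methods.
Promise.create = function create(settle) {
  return new Promise((resolve, reject) => settle({ resolve, reject }));
};

Promise.create(settler => settler.resolve(42))
  .then(value => console.log(value)); // logs 42
```

This makes the structural similarity to Rx.Observable.create concrete: both take one function that receives a single "controller" object.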
When you want to use the results of a Promise, you attach a function via .then. With an Observable, when you wish to use the results to produce side effects, you attach an "Observer" via .subscribe:
myPromise.then(foo => console.log('foo is %s', foo))
.catch(e => console.error(e));
myObservable.subscribe({
next: foo => console.log('next foo is %s', foo),
error: e => console.error(e),
complete: () => console.log('All done! No more foo.')
})
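To demystify the observer-object shape used above, here is a from-scratch, minimal Observable (my own illustrative sketch, synchronous only, with none of Rx.js's real features like unsubscription or operators):

```javascript
// A toy Observable: createObservable() stores the subscribe logic;
// subscribe() runs it against an observer with next/error/complete callbacks.
function createObservable(subscribeFn) {
  return {
    subscribe(observer) {
      subscribeFn({
        next: v => observer.next && observer.next(v),
        error: e => observer.error && observer.error(e),
        complete: () => observer.complete && observer.complete()
      });
    }
  };
}

const numbers$ = createObservable(subscriber => {
  [1, 2, 3].forEach(n => subscriber.next(n)); // emit multiple values...
  subscriber.complete();                      // ...then complete
});

numbers$.subscribe({
  next: n => console.log('next:', n),
  complete: () => console.log('done')
});
```

Nothing happens until subscribe is called, which is also true of real Rx.js Observables: they are lazy descriptions of work.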
In the style of programming with Observables that RX.js allows, tasks are often conceptualized as being composed of two parts: Observables (inputs), and Subscriptions or side effects (outputs). Side effects describe what you want to ultimately do with the values from an observable.
If you describe your goals as side effects, you can work backward to figure out what sort of Observables you need to create to provide the values those side effects need. For example, if you wish a counter to be incremented each time a button is clicked, "increment the counter" is the side effect, and you need an Observable of button clicks for that increment function to "subscribe" to.
For our static site generator, we have the following high-level goals:
The Observables we can map to each of these goals are:
parsedAndRenderedPosts$: emits post output each time a post is created or changed. Subscribe to this and write the new post contents to disk on each emit.
latestPostMetadata$: emits a collection of the latest metadata when the script starts or the metadata for a post changes. Subscribe to this and write the rendered index page to disk on each emit.
Each of these two Observables will be composed of or created as a result of other Observables. As we build these up, we'll learn about different methods Rx.js has for creating and transforming Observables. Let's begin!
We know each of our Observables should emit based on file changes and additions, so we'll start by creating an Observable of file changes and additions called changesAndAdditions$. The chokidar module can be used to create an event emitter that emits change and add events on filesystem changes, so let's start there:
const chokidar = require('chokidar');
const dirWatcher = chokidar.watch('./_posts/*.md');
We want to create Observables of file changes & additions so we can manipulate & combine them with Rx.js. Rx.js provides a utility method to create an Observable from EventEmitter by event name. We are interested in the add event and the change events, so let's use fromEvent to create Observables of them:
const Rx = require('rxjs');
// note: `add` is emitted for each file on startup, when chokidar first scans the directory
const newFiles$ = Rx.Observable.fromEvent(dirWatcher, 'add');
const changedFiles$ = Rx.Observable.fromEvent(dirWatcher, 'change');
Now newFiles$ will emit a new value (a filename) when dirWatcher emits an add event, and changedFiles$ behaves similarly with change events. We can create an observable of both of these events by using .merge.
const changesAndAdditions$ = newFiles$.merge(changedFiles$);
To get the file contents, we can .map the name of the file to the contents of that file by using a function that reads files. Reading a file is (typically) an asynchronous operation. If we were using Promises, we might write a function that takes a filename and returns a Promise that will emit the file contents. Similarly, using Observables, we use a function that takes a filename and returns an Observable that will emit the file's contents.
Just as Promise.promisify will convert a callback based function to one that returns a Promise, Rx.Observable.bindNodeCallback converts a callback based function to one that returns an Observable:
const fs = require('fs');
const readFileAsObservable = Rx.Observable.bindNodeCallback(fs.readFile);
const fileContents$ = changesAndAdditions$
.map(readFileAsObservable) // map filename to an Observable of file contents
.mergeAll(); // Unwrap Observable<"file contents"> to get "file contents"
fileContents$
.subscribe(content => console.log(content)); // log contents of each file
Now we'll log the contents of each file as it is created or changed. Let's take a closer look at our use of .mergeAll: readFileAsObservable is a function that takes a String (filename) as input and returns an Observable<String> (an observable of the "file contents" string).
This means that by mapping changesAndAdditions$ over readFileAsObservable, we took an Observable<String> (an observable of strings, namely, file names) and converted each String value to a new Observable<String>. This means we have Observable<Observable<String>>: an Observable of Observables of Strings.
We don't actually want an Observable of file contents, we want filename in, file contents out. For this reason we use .mergeAll to "unwrap" the file contents strings from the inner Observables as they are emitted. If you are confused by this, don't worry: it is in fact confusing! For now it's only important to understand that .mergeAll converts Observable<Observable<String>> to Observable<String>, so we can process the string (in this case file contents).
NB: Mapping a value to an observable then unwrapping that inner observable as we’ve done here is an extremely common operation in Rx.js, and can be achieved using the .mergeMap(fn) shorthand, which is the equivalent of .map(fn).mergeAll().
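An array analogy may help (an analogy only; arrays are not Observables): mapping each value to a container and then flattening one level is the same shape of operation as .map(fn).mergeAll():

```javascript
// Mapping each filename to an array of contents gives nested arrays
// (like Observable<Observable<String>>); .flat() plays the role of mergeAll.
const filenames = ['a.md', 'b.md'];
const readAsList = name => [`contents of ${name}`]; // stands in for an inner Observable
const contents = filenames.map(readAsList).flat();
console.log(contents); // ['contents of a.md', 'contents of b.md']
```

Arrays even have the same shorthand: Array.prototype.flatMap is to .map(fn).flat() what .mergeMap is to .map(fn).mergeAll().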
When our script starts, newFiles$ will emit each filename once when chokidar first scans our _posts directory, and this will be merged into changesAndAdditions$. While editing a post in your text editor, each time you "save" the Markdown file, changedFiles$ will emit the filename, regardless of whether the contents of the file actually changed. If you hit ^S ten times in a row, changesAndAdditions$ will emit that filename 10 times and we'll read the file 10 times.
If the file contents hasn't changed, we don't want to send it down the pipe to be parsed, templated, and written as an updated HTML file-- we only want to do this latter processing (right now just console.log(contents)) if the contents are actually different from the last contents that were emitted. Luckily, Rx.js has a method for this built in: .distinctUntilChanged will emit a value one time, but will not emit again until the value changes. That means if a file is saved 10 times with the same contents, it will emit the file contents the first time and drop the rest.
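The behavior of .distinctUntilChanged can be sketched for a plain array of values (an illustration of the idea, not Rx.js code): consecutive duplicates are dropped, but a value may reappear later:

```javascript
// Drop consecutive duplicates; a value that reappears after a change is kept.
function distinctUntilChanged(values) {
  const out = [];
  let last;
  for (const value of values) {
    if (value !== last) {
      out.push(value);
      last = value;
    }
  }
  return out;
}

console.log(distinctUntilChanged(['a', 'a', 'a', 'b', 'a'])); // ['a', 'b', 'a']
```

Note that the final 'a' is emitted again because it differs from the immediately preceding value; keep this in mind for the multi-file scenario below.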
const latestFileContents$ = fileContents$.distinctUntilChanged();
latestFileContents$.subscribe(content => console.log(content));
Now we'll only see file contents logged if it's different from the last contents that were emitted. There's a logic problem here, however. Consider the following scenario:
1. You save foo.md
   - changesAndAdditions$ emits "foo.md"
   - fileContents$ emits "contents of foo.md"
   - the initial value (null) is distinct from "contents of foo.md", so latestFileContents$ emits "contents of foo.md"
2. You save bar.md
   - changesAndAdditions$ emits "bar.md"
   - fileContents$ emits "contents of BAR.md"
   - latestFileContents$ emits "contents of BAR.md"
3. You save foo.md again, with no changes
   - changesAndAdditions$ emits "foo.md"
   - fileContents$ emits "contents of foo.md"
   - latestFileContents$ emits "contents of foo.md"
As you can see, the contents of the two files never change, but latestFileContents$ considers it "changed" because it's different from the last value, which was from the other file. The solution is to create an observable of file contents that is distinct until changed for each file, so the new contents of foo.md are compared to the last contents of foo.md, regardless of whether bar.md was changed since then. This is a bit more complicated than merging the newFiles$ and changedFiles$ Observables, but it's doable!
Because we want one observable of file changes per file, we must perform the "read file & see if it changed" per file, not on a merged stream of all files. The plan of attack is as follows. For each add event (new file created or read on startup):
1. create an Observable of change events for this file only
2. emit the filename once to start (on the add event)
3. map the filename to an Observable of the file contents
4. use .mergeAll to unwrap the Observable from step 3
5. drop emissions where the file contents haven't actually changed
6. return an Observable of changes per added filename
// for each added file...
const latestFileContents$ = newFiles$.map(addedName => {
// 1. create Observable of file changes...
const singleFileChangesAndAdditions$ = changedFiles$
// ...only taking those for THIS file
.filter(changedName => changedName === addedName)
// 2. emit filename once to start (on "add")
.startWith(addedName);
const singleFileLatestContents$ = singleFileChangesAndAdditions$
// 3. map the filename to an observable of the file contents
.map(filename => readFileAsObservable(filename, 'utf-8'))
// 4. Merge the Observable<Observable<file contents>> to Observable<file contents>
.mergeAll()
// 5. don't emit unless the file contents actually changed
.distinctUntilChanged();
// 6. return an observable of changes per added filename
return singleFileLatestContents$;
})
.mergeAll(); //unwrap per-file Observable of changes
We're using .mergeAll twice because we're mapping strings to Observables twice:
- changedFiles$ is mapped to an observable of file contents in step 3 (unwrapped in step 4)
- newFiles$ is mapped to the observable returned in step 6 (unwrapped by the final .mergeAll)

Because we go from Observable&lt;String&gt; to Observable&lt;Observable&lt;String&gt;&gt; twice, we have to reverse the process with .mergeAll twice.
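If the double unwrap is hard to picture, a loose Array analogy may help (Arrays standing in for Observables here; the filenames are just examples): mapping each value to a container twice produces two levels of nesting, so we must flatten twice.

```javascript
// Arrays standing in for Observables: map-to-container twice, flatten twice.
const names = ['foo.md', 'bar.md'];

// map #1: each filename becomes a "stream" (array) of change events
const changeStreams = names.map(name => [`${name} saved`, `${name} saved again`]);

// map #2: each change event becomes a "stream" (array) of file contents
const contentStreams = changeStreams.map(events =>
  events.map(event => [`contents after "${event}"`])
);

// two levels of nesting, so two flattens -- analogous to two .mergeAll calls
const contents = contentStreams.flat().flat();
console.log(contents.length); // 4 flat strings, one per change event
```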
Since we have one singleFileChangesAndAdditions$ observable per file added, we are able to perform the "map filename to contents and compare with last value" check per file. latestFileContents$ can still be consumed as it was before.
That was a lot, but it's the bulk of the Rx.js logic for our "write blog posts" goal. Now that we have an Observable that emits the contents of our Markdown blog posts each time they change, we can map that over our frontmatter, markdown parsing, template, and write-to-disk functions much as we did before with Promises. We'll start by creating a few utility functions as before:
const md = require('markdown-it')();
const frontmatter = require('frontmatter');
const pug = require('pug');
const writeFileAsObservable = Rx.Observable.bindNodeCallback(fs.writeFile);
const renderPost = pug.compileFile(`${__dirname}/templates/post.pug`);
// IN: { content, data : { title, description, ...} }
// OUT: { content, title, description, ... }
function flattenPost(post){
return Object.assign({}, post.data, { content : post.content });
}
// parse markdown to HTML then send the whole post object to the template function
function markdownAndTemplate(post){
post.body = md.render(post.content);
post.rendered = renderPost(post); //send `post` to pug render function for post template
return post;
}
// take post object with:
// 1. `slug` (e.g. "four-es6-tips") to build file name and
// 2. `rendered` contents: the finished HTML for the post
// write this to disk & output error or success message
function writePost(post){
var outfile = path.join(__dirname, 'out', `${post.slug}.html`);
writeFileAsObservable(outfile, post.rendered)
.subscribe({
next: () => console.log('wrote ' + outfile),
error: console.error
});
}
NB: see the previous post for details on frontmatter, md.render, etc.
We use our Observable utility functions to string them together:
latestFileContents$
.map(frontmatter) // trim & parse frontmatter
.map(flattenPost) // format the post for Pug templating
.map(markdownAndTemplate)// render markdown & render template
.subscribe(writePost);
Now we have a working, Rx.js version of our Static Site Generator that does the same as it did with Promises, but with built-in file watch and rebuild!

On to our next goal, the index page...
Our index page template, index.pug:
html
head
title Welcome to my blog!
body
h1 Blog Posts:
//- Output h2 with link & paragraph tag with description
for post in posts
h2: a.title(href='/' + post.slug + '.html')= post.title
p= post.description
The data our template expects must be structured thus:
{
posts : [
{ title:"Intro To Rx.js", slug: "intro-to-rx-js", description: "..."},
{ title:"Post Two", slug: "post-2", description: "..."},
//...
]
}
Earlier, we mapped the latestFileContents$ over the frontmatter function. We need to use that Observable for our index page as well, so let's modify our code from above to capture that Observable and set it aside:
const postsAndMetadata$ = latestFileContents$
.map(frontmatter);
postsAndMetadata$ //same as before:
.map(flattenPost)
.map(markdownAndTemplate)
.subscribe(writePost);
The frontmatter function returns an object with data and content keys, but we only need the value of data for our index page template, so we'll pluck that property from the object:
const metadata$ = postsAndMetadata$
.pluck('data');
At this point we have an Observable that emits the latest metadata for a file whenever that file is created or saved. We need to transform our Observable ("collection over time") into an array ("collection over space"). Rx.js has a reduce method that can do this, but it waits for an Observable to "complete" before emitting one final "reduced" value, and our file-watching metadata$ Observable never "completes."
We need a way to aggregate values into an accumulator like reduce does, but that emits the new accumulator value on each iteration so we don't have to wait for the "complete" that will never come. Rx.js has a method called .scan that does just this:
const metadataMap$ = metadata$
.scan(function(acc, post){
acc[post.slug] = post;
return acc;
}, {});
By making the slugs the keys in the acc object, there will be only one property per post. When we first start our script, we'll get a post object from metadata$ with the slug post-2 and add it to the accumulator as acc['post-2'].
When post two is updated and saved, its metadata will be sent to .scan again, but it won't add a new key to acc: it will overwrite the existing acc['post-2']. In this way, metadataMap$ will emit an object containing the latest metadata for all posts, with one key per post. The output will look thus:
{
'intro-to-rx-js' : { title, description, slug },
'post-2' : { title, description, slug }
}
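The reduce-versus-scan distinction can also be demonstrated outside of Rx. In this plain-JavaScript sketch (the scan helper is my own illustration, not an Rx API), every intermediate accumulator is collected rather than only the final one:

```javascript
// scan is a reduce that "emits" every intermediate accumulator value.
function scan(values, reducer, seed) {
  const emissions = [];
  values.reduce((acc, value) => {
    const next = reducer(acc, value);
    emissions.push(next);
    return next;
  }, seed);
  return emissions;
}

const saves = [
  { slug: 'intro-to-rx-js', title: 'Intro To Rx.js' },
  { slug: 'post-2', title: 'Post Two' },
  { slug: 'post-2', title: 'Post Two (edited)' }, // re-save overwrites its key
];

const states = scan(saves, (acc, post) => ({ ...acc, [post.slug]: post }), {});
console.log(states.length); // one emission per save
console.log(states[2]);     // latest metadata for ALL posts, one key per slug
```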
We now have an object with an entry for each post, but this does not match the format we outlined above ({ posts : [ post, post, post ] }). In the next two steps we can transform the object into an array and then insert it into a wrapper object:
const indexTemplateData$ = metadataMap$
.map(function getValuesAsArray(postsObject){ // (or Object.values in ES2017)
// IN: { 'slug' : postObj, 'slug2' : postObj2, ... }
return Object.keys(postsObject)
.reduce(function(acc, key){
acc.push(postsObject[key]);
return acc;
}, []);
// OUT: [postObj, postObj2, ...]
})
.map(function formatForTemplate(postsArray){
return {
posts : postsArray
};
})
Now we have an Observable that emits the latest listing of post metadata, formatted for the index.pug template, on each add or change event. This isn't quite what we want, however, for two reasons.
First: in the course of editing a post, most of your changes will be to content, not to the metadata. Content changes don't affect the index page, so we want to drop any emissions from indexTemplateData$ where the data is the same as the previous emission. This is another case where .distinctUntilChanged comes in handy:
const distinctITD$ = indexTemplateData$
.distinctUntilChanged(function compareStringified(last, current){
//true = NOT distinct; false = DISTINCT
return JSON.stringify(last) === JSON.stringify(current);
});
We pass a comparator function to distinctUntilChanged this time because formatForTemplate (above) returns a newly created object each time-- the new object it emits will always be "distinct" from the last one, even if their contents are identical. We stringify the last and current objects in order to compare their contents and emit only when they differ.
Second: when we first start our script, it reads each file and emits its contents once. This means that while files are initially being read, indexTemplateData$ will emit a bunch of incomplete objects consisting of whatever posts have been read so far. If we have 4 posts, emissions will look like this:
{ post1 }
{ post1, post2 }
{ post1, post2, post3 }
{ post1, post2, post3, post4 }

Only the last version represents a collection of metadata from all pages; the others can be ignored. In order to get around the flood of events on indexTemplateData$ on startup, we'll use .debounceTime, which waits until an Observable stops emitting for a fixed amount of time before emitting the latest result:
const distinctDebouncedITD$ = distinctITD$
.debounceTime(100); // wait 'til the observable STOPS emitting for 100ms, then emit latest
This probably isn't the most graceful solution, but when indexTemplateData$ gets its initial flood of emissions, distinctDebouncedITD$ will emit only once, after the flood has finished.
The only thing left is to pass the values from distinctDebouncedITD$ to the template rendering function then write the results to disk:
const renderIndex = pug.compileFile(`${__dirname}/templates/index.pug`);
function writeIndexPage(indexPage){
var outfile = path.join(__dirname, 'out', 'index.html');
writeFileAsObservable(outfile, indexPage)
.subscribe({
next: () => console.log('wrote ' + outfile),
error: console.error
});
}
distinctDebouncedITD$
.map(renderIndex)
.subscribe(writeIndexPage);
Now index.html will be rewritten when we edit a post, but only if the metadata changed:

That's it!
If you made it this far, congratulations! My goal was not to explain each Rx.js concept introduced herein in detail, but to walk through the process of using Rx.js to complete a real-world programming task. I hope this was useful! If this post has piqued your interest, I highly recommend running the full version of the code this post was based on, which you can find in this repository. As always, if you have any questions or Rx.js corrections please feel free to contact me. Happy coding!
It occurred to some people that it didn't make sense to run step three every single time someone hit a page on their site. If step three (combining template with page content) were done in batch beforehand, all of the site's pages could be stored on disk and served from a static server! An application that takes this approach, generating "static" webpages and storing them as flat HTML files, is referred to as a Static Site Generator (or SSG). An SSG has the following benefits over a CMS:

1. No database is required
2. No code runs on the server when a page is requested
3. Serving flat files is fast and simple
4. Static files can be hosted cheaply (or for free)
Points one and two dramatically reduce the attack surface of a web server, which is great for security. Point three (in conjunction with one and two) allows for greater site reliability and allows a server to handle much more traffic without crashing. Point four is very attractive from a cost perspective (as are one, two, and three if you're paying for hosting). The benefits of static site generators are clear, which is why many organizations and individuals are using them, including the publisher of this blog and the author of this post!
There are many available SSG tools-- one hundred and sixty-two listed on a site that tracks such tools at the time of writing. One of the reasons there are so many options is that building an SSG isn't terribly complicated. The core functionality is:

1. Read in each content (markdown) file
2. Parse each file's metadata ("frontmatter")
3. Convert each file's markdown content to HTML
4. Load a page template
5. Render each post with the template
6. Write the finished HTML pages to an output directory
I've simplified the process a bit here, but overall, this is a pretty straightforward programming task. Given libraries to do the heavy lifting of parsing markdown, highlighting code, etc., all that's left is the "read input files, process, write output files."
So can we write our own static site generator in Node.js? In this blog post we'll step through each of the steps outlined above to create the skeleton of an SSG. We'll skip over some non-page-generation tasks such as organizing images & CSS, but there's enough here to give you a good overview of what an SSG does. Let's get started!
No Wordpress means no WYSIWYG editor, so we'll be authoring our posts in a text editor. Like most static site generators, we will store our page content as Markdown files. Markdown is a lightweight markup alternative to HTML that's designed to be easy to type, human readable, and typically used to author content that will ultimately be converted to and published as HTML, so it's ideal for our purpose here. A post written in markdown might look like this:
# The Hotdog Dilemma
*Are hotdogs sandwiches*? There are [many people](https://en.wikipedia.org/wiki/Weasel_word) who say they are, including:
* Cecelia
* Donald
* James
## Further Evidence
... etc. ...
We'll put our posts in a directory called _posts. This will be like the "Posts" table in a traditional CMS, in the sense that it's where we'll look up our content when it's time to generate the site.
To read each file in the _posts directory, we need to list all the files, then read each one in turn. The node-dir package does that for us, but the API isn't quite what we need, as it's callback-based and oriented towards getting file names rather than compiling an array of all file contents. Creating a wrapper function that returns a Bluebird promise containing an array of all file contents is tangential to the topic of this post, so let's imagine we've done so and we have an API that looks like this:
getFiles('_posts', {match: /.*\.md/})
.then(function(posts){
posts.forEach(function(contents){
console.log('post contents:');
console.log(contents);
})
});
Because we're using Bluebird promises and our Promise result is an array, we can map over it directly:
getFiles('_posts', {match: /.*\.md/})
.map(function processPost(content){
// ... process the post
// ... return processed version
})
.map(nextProcessingFunction)
//...
This setup will make it easy to write functions that transform our input to our output step by step, and to apply those functions, in order, to each post.
In a traditional CMS, the Posts table holds not just the contents of the post, but also metadata such as its title, author, publish date, and perhaps a permanent URL or canonical link. This metadata is used both on the post page or in a page <title> and on index pages. In our flat-file system, all the information for a post must be contained in the markdown file for that post. We'll use the same solution for this challenge that is used by Jekyll and others: YAML frontmatter.
YAML is a data serialization format that's basically like JSON but lighter weight. It looks like this:
key: value
author: Sequoia McDowell
Object:
key: http://example.com
wikipedia: https://wikipedia.com
List:
- First
- Second
- Third
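Parsed into a JavaScript value (by js-yaml or any other YAML parser), the document above corresponds to the following object, written out here as a literal for illustration:

```javascript
// The YAML example above, as the value a YAML parser would produce:
const parsed = {
  key: 'value',
  author: 'Sequoia McDowell',
  Object: {
    key: 'http://example.com',
    wikipedia: 'https://wikipedia.com',
  },
  List: ['First', 'Second', 'Third'],
};
console.log(parsed.List.length); // 3
```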
"Frontmatter" on Markdown files is an idea borrowed from Jekyll. Very simply, it means putting a block of YAML at the top of your markdown file containing metadata for that file. The SSG separates this YAML data from the rest of the file (the contents) and parses it for use in generating the page for that post. With YAML frontmatter, our post looks like this:
---
title: The Hotdog Dilemma
author: Sequester McDaniels
description: Are hotdogs sandwiches? You won't believe the answer!
path: the-hotdog-dilemma.html
---
*Are hotdogs sandwiches*? There are [many people](https://en.wikipedia.org/wiki/Weasel_word) who say they are, including:
...
Trimming this bit of YAML from the top of our post and parsing it is easy with front-matter, the node package that does exactly this! That means this step is as simple as npm installing the library and adding it to our pipeline:
const getFiles = require('./lib/getFiles');
const frontmatter = require('front-matter');
getFiles('_posts', {match: /.*\.md/})
.map(frontmatter) // => { data, content }
.map(function(post){
console.log(post.data.title); // "The Hotdog Dilemma"
console.log(post.data.author); // "Sequester McDaniels"
console.log(post.content); // "*Are hotdogs sandwiches*? There are [many people](https: ..."
});
Now that our metadata is parsed and removed from the rest of the markdown content, we can work on converting the markdown to HTML.
As mentioned, Markdown is a markup language that provides an easy, flexible way to mark up text in a human-readable way. It was created by John Gruber in 2004 and introduced in a blog post that serves as the de-facto standard for the markdown format. This blog post would go on to be referenced by others who wished to build markdown parsers in Ruby, JavaScript, PHP, and other languages.
The problem with having only a "de-facto" standard for a format like markdown is that there is no actual, detailed standard. The result is that over the years different markdown parsers introduced their own quirks and differences in parsing behavior, as well as extensions for things like checklists or fenced code blocks. The upshot is this: there is no single "markdown" format-- the markdown you write for one parser may not be rendered the same by another parser.
In response to this ambiguity, the CommonMark standard was created to provide "a strongly defined, highly compatible specification of Markdown." This means that if you use a CommonMark compatible parser in JavaScript and later switch to a CommonMark compatible parser in Ruby, you should get the exact same output.
The main JavaScript implementation of CommonMark is markdown-it, which is what we'll use:
const getFiles = require('./lib/getFiles');
const frontmatter = require('front-matter');
const md = require('markdown-it')('commonmark');
function convertMarkdown(post){
post.content = md.render(post.content);
return post;
}
getFiles('_posts', {match: /.*\.md/})
.map(frontmatter) // => { data, content:md }
.map(convertMarkdown) // => { data, content:html }
.map(function(post){
console.log(post.content);
// "<p><em>Are hotdogs sandwiches</em>? There are <a href="proxy.php?url=https://en.wikipedia.org/wiki/Weasel_word">many people</a> who..."
});
Now our markdown is HTML!
We're writing a technical blog, so we want to display code with syntax highlighting. If I write:
Here's a *pretty good* function:
```js
function greet(name){
return "Hello " + name;
}
```
It should be output thus:
<p>Here's a <em>pretty good</em> function:</p>
<pre><code class="language-js"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">greet</span>(<span class="hljs-params">name</span>)</span>{
<span class="hljs-keyword">return</span> <span class="hljs-string">"Hello "</span> + name;
}
</code></pre>
These classes allow us to target each piece of the code (keywords, strings, function parameters, etc.) separately with CSS, as is being done throughout this blog post. The markdown-it docs suggest using highlight.js so that's what we'll do:
const getFiles = require('./lib/getFiles');
const frontmatter = require('front-matter');
const hljs = require('highlight.js');
const md = require('markdown-it')('commonmark', {
highlight: function (str, lang) {
// "language" is specified after the backticks:
// ```js, ```html, ```css etc.
// "str" is the contents of each fenced code block
return hljs.highlight(lang, str).value;
}
});
// ... unchanged ...
Now we can use fenced code blocks as above. We're almost there!
There are plenty of templating libraries in JavaScript; we'll use Pug (formerly "Jade") here. First we'll create a template for posts:
//templates/post.pug
- var thisYear = (new Date()).getFullYear();
doctype html
html(lang='en')
head
title= title
meta(name='description', content=description)
body
h1= title
| !{content}
footer © #{author} #{thisYear}
We won't dwell on the Pug syntax, but the important bits here are where our data is injected into the template. Note in particular:
- title= title for the <title> tag
- h1= title for the page header
- | !{content} to output page contents, directly in the body, without escaping HTML

Next we must create a function that uses this template file to render a "post object" to HTML.
//...
const pug = require('pug');
const postRenderer = pug.compileFile('./templates/post.pug');
//function for our posts promise pipeline:
function renderPost(post){
post.content = postRenderer(post);
return post;
}
We'll also need a function to flatten the post object for Pug's consumption
// IN: { content, data : { title, description, ...} }
// OUT: { content, title, description, ... }
function flattenPost(post){
return Object.assign({}, post.data, { content : post.content });
}
Now we can plug these two new functions into our pipeline
//...
getFiles('_posts', {match: /.*\.md/})
.map(frontmatter) // => { data, content:md }
.map(convertMarkdown) // => { data, content:html }
.map(flattenPost)
.map(renderPost)
.map(post => {
console.log(post.content); // '<!DOCTYPE html><html lang="en"><head><title> ...'
console.log(post.path); // 'the-hotdog-dilemma.html'
})
Finally we're at the last step: writing posts to an output directory.
We're going to write our HTML files to a directory named out. This will contain the final output, ready to publish to a web server. Our function should, for each post, write the post.content to a path specified by post.path. Since we're using Bluebird already, we'll use the promisified version of the file system API.
//...
const Promise = require('bluebird');
const fs = Promise.promisifyAll(require('fs'));
const path = require('path');
const outdir = './out'
function writeHTML(post){
return fs.writeFileAsync(path.join(outdir, post.path), post.content);
}
Now we have a script that fulfills all of our original goals.
// requires...
// utility functions...
//Read posts & generate HTML:
getFiles('_posts', {match: /.*\.md/}) // 1
.map(frontmatter) // 2
.map(convertMarkdown) // 3
.map(flattenPost)
.map(renderPost) // 4, 5
.map(writeHTML) // 6
.then(function(){
console.log('done!');
})
.catch(function(e){
console.error('there was an error!')
console.error(e)
});
That's it!
There is a lot we did not go over in this post, such as generating an index page, file watching, automatic re-running, and publishing*, but this post shows the basics of static site generation, and how the main logic can be captured in just a few dozen lines. (Admittedly, my production version is a bit more complex.)
By writing your own tool you miss out on the reusability of existing tools, but you gain full control over your blog build and reduce your reliance on a third-party tool you don't control. For me, the tradeoff of effort for control was worth it. Perhaps it is for you too!
* My next post will go over those features and more, so stay tuned!
Interactive Debuggers* are familiar to every Java developer (among others), but they are much less well known in the JavaScript world. This is unfortunate, both because debuggers can be so helpful in diagnosing logic issues, and because the debugging tools in JavaScript today are the best & easiest to use they've ever been! This post will introduce the Node.js debugging tools in VS Code in a way that's accessible to programmers who have never used a debugger before.
Debuggers are useful when writing your own, original code, but they really show their value when you're working with an unfamiliar codebase. Being able to step through the code execution line by line and function by function can save hours of poring over source code, trying to step through it in your head.
The ability to change the value of a variable at runtime allows you to play through different scenarios without hardcoding values or restarting your application. Conditional breakpoints let you halt execution upon encountering an error to figure out how you got there. Even if using a debugger isn't part of your everyday process, knowing how they work adds a powerful tool to your toolbox!
We'll be using VS Code, which has built-in Node.js debugging capabilities. In order to run code in the context of the VS Code debugger, we must first make VS Code aware of our Node.js project. For this demo I'll create a simple express app with express-generator; you can follow along by running the commands below:
$ npm install express-generator -g
$ express test-app # create an application
$ cd test-app
$ npm install
$ code . # start VS Code
With VS Code open, we need to open the Debug pane by clicking the bug icon in the left sidebar menu. With the debug pane open, you may note that the gear icon at the top of the pane has a red dot over it. This is because there are currently no "launch configurations": configuration objects that tell VS Code how to run your application. Click the gear icon, select "Node.js," and VS Code will generate a boilerplate launch configuration for you.
There are two ways to attach the VS Code debugger to your application:
1. Use a launch configuration to have VS Code start your application for you
2. Start your app from the command line with node --debug-brk your-app.js and run the "Attach" launch configuration

The first approach normally requires some setup (which is beyond the scope of this post but you can read about here), but because our package.json has a run script, VS Code automagically created a launch configuration based on that script. That means we should be able to simply click the green "run" button to start our application in the debugger.
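For reference, the generated file lives at .vscode/launch.json and looks roughly like the following (this is a sketch based on the express-generator layout used above; the exact fields VS Code generates for you may differ):

```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "node",
      "request": "launch",
      "name": "Launch Program",
      "program": "${workspaceRoot}/bin/www"
    }
  ]
}
```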

If all went well, you should see a new toolbar at the top of your screen with pause and play buttons (among others). The Debug Console at the bottom of the screen should tell you what command it ran & what output that command yielded, something like this:
node --debug-brk=18764 --nolazy bin/www
Debugger listening on port 18764
When we load http://localhost:3000 in a browser, we can see the Express log messages in this same Debug Console:
GET / 304 327.916 ms - -
GET /stylesheets/style.css 304 1.313 ms - -
Now that we have the code running with the VS Code debugger attached, let's set some breakpoints and start stepping through our code!
A breakpoint is a marker on a line of code that tells the debugger "pause execution here." To set a breakpoint in VS Code, click the gutter just to the left of the line number. I've opened routes/index.js in order to set a breakpoint in the root request listener:
Note that the breakpoints pane at the bottom left has a listing for this new breakpoint (along with entries for exceptions, which we'll talk about momentarily). Now, when I hit http://localhost:3000 in a browser again, VS Code will pause at this point & allow me to examine what's going on at that point:

With the code paused here, we can examine variables and their values in the variables pane and see how we got to this point in the code in the call stack pane. You may also have noticed that the browser has not loaded the page-- that's because it's still waiting for our server to respond! We'll take a look at each of the sidebar panes in turn, but for now, I'll press the play button to allow code execution to continue.

Now the server should send the finished page to the browser. Note that with code execution resumed, the "play" button is no longer enabled.
In addition to breaking on a certain line each time it's executed, you can add dynamic breakpoints that pause execution only in certain circumstances. A particularly useful one is the conditional breakpoint, which pauses only when a given expression evaluates truthy-- for example, to pause only for admin users, add user.role === "admin" to your conditional breakpoint.

In the Variables Pane, you can examine and change variables in the running application. Let's edit our homepage route in routes/index.js to make the title a variable:
/* GET home page. */
var ourTitle = 'Express';
router.get('/', function(req, res, next) {
res.render('index', { title: ourTitle });
});
After editing our code, we'll need to restart the debugger so it picks up the new code. We can do this by clicking the green circle/arrow button in the top toolbar. After editing a file with a breakpoint already set and restarting the debugger (as we just did), you'll also want to check that your breakpoints are still in the right spot. VS Code does a pretty good job of keeping the breakpoint on the line you expect but it's not perfect.
With our breakpoint on what's now line 7 and with the debugger restarted, let's refresh our browser. The debugger should stop on line seven. We don't see ourTitle in the variable pane right away, because it's not "Local" to that function, but expand the "Closure" section just below the "Local" section and there it is!

Double-clicking ourTitle in the Variables Pane allows us to edit it. This is a great way to tinker with your application and see what happens if you switch a flag from true to false, change a user's role, or do something else-- all without having to alter the actual application code or restart your application!
The variable pane is also a great way to poke around and see what's available in objects created by libraries or other code. For example, under "Local" we can see the req object, see that its type is IncomingMessage, and by expanding it we can see the originalUrl, headers, and various other properties and methods.
Sometimes, rather than just pausing the application, examining or altering a value, and setting it running again, you want to see what's happening in your code line by line: what function is calling which, and how that's changing the application state. This is where the "Debug Actions" menu comes in: it's the bar at the top of the screen with the playback buttons. We've used the continue (green arrow) and restart (green circle arrow) buttons so far, and you can hover over the others to see the names and associated keyboard shortcuts for each. The buttons are, from left to right:
- Continue/Pause: resume execution (or pause, if the code is currently running)
- Step Over: execute the current line and move to the next, without stepping into any function it calls
- Step Into: step into the function invoked on the current line
- Step Out: run the current function until its return statement & step out to the line of code that invoked that function
- Restart: restart the application in the debugger
- Stop: end the debugging session

While stepping through your code, there may be certain values you always want to keep an eye on. A "watch expression" will run (in the current scope!) at each paused/stopped position in your code & display the return value of that expression. Hover over the Watch Expression pane and click the plus to add an expression. I want to see the user agent header of each request as well as ourTitle, whether the response object has had headers sent, and the value of 1 + 1, just for good measure, so I'll add the following watch expressions:
req.headers['user-agent']
ourTitle
res._headerSent
1 + 1
When I refresh the browser the debugger pauses once again at the breakpoint on line 7 and we can see the result of each expression:

The Call Stack Pane shows us the function calls that got us to the current position in the code when execution is paused, and allows us to step back up that stack and examine the application state in earlier "frames." By clicking the frame below the current frame you can jump to the code that called the current function. In our case, the current frame is labeled (anonymous function) in index.js [7], and the one before that is the handle function in layer.js, which is a component of the Express framework:

Note that the request handling function is unnamed, hence "(anonymous function)." "Anonymous function?!" What's that? Who knows! Moral: always name your functions!
Stepping down into the Express framework is not something I do every day, but when you absolutely need to understand how you got to where you are, the Call Stack Pane is very useful!
One especially interesting use of the Call Stack Pane is to examine variables at earlier points in your code's execution. By clicking up through the stack, you can see what variables those earlier functions had in their scope, as well as see the state of any global variables at that point in execution.
There are many more features of the interactive debugger than I went over here, but this is enough to get you started. If you want to learn more, take a look at the excellent documentation from Microsoft on the VS Code Debugger and using it with Node.js. Oh, and I should probably mention that all the debugging features outlined here (and more) are built-in to Firefox as well as Chrome, should you wish to use them on browser-based code. Happy Debugging!
* There's no specific term I've found for this common collection of application debugging tools so I'm using the term "interactive debugging" in this article.
I recently used the debug module to help me understand some complex interactions between events in Leaflet & Leaflet.Editable. Before we go over that, however, let's lay the groundwork with a couple of organizational tips that make debug easier to use. This post assumes you have either used debug or read the previous post, so please do one of those first!
The debug module has a great namespacing feature which allows you to enable or disable debug functions in groups. It is very simple-- namespaces are separated by colons:
debug('app:meta')('config loaded')
debug('app:database')('querying db...');
debug('app:database')('got results!', results);
Enable debug functions in Node by passing the names to the process via the DEBUG environment variable. The following would enable the database debug function but not meta:
$ DEBUG='app:database' node app.js
To enable both, list both names, comma separated:
$ DEBUG='app:database,app:meta' node app.js
Alternately, use a "splat" (*) to enable any debugger in that namespace. The following enables any debug function whose name starts with app::
$ DEBUG='app:*' node app.js
You can get as granular as you want with debug namespaces...
debug('myapp:thirdparty:identica:auth')('success!');
debug('myapp:thirdparty:twitter:auth')('success!');
...but don't overdo it. Personally, I try not to go deeper than two or sometimes three levels.
The "splat" character * can match a namespace at any level when enabling a debug function. Given the two debug functions above, you can enable both thus:
$ DEBUG='myapp:thirdparty:*:auth' node app.js
The * here will match identica, twitter, or any other string.
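Under the hood, debug compiles each name in DEBUG into a regular expression, with each * becoming a wildcard. Here's a simplified sketch of that matching logic (a hypothetical helper, not the module's actual code):

```javascript
// Simplified sketch of splat matching (not debug's real API)
function matchesPattern(pattern, name){
  // Namespaces are plain words & colons, so the only special
  // character we need to handle is *, which becomes the regex wildcard .*
  const re = new RegExp('^' + pattern.split('*').join('.*') + '$');
  return re.test(name);
}

matchesPattern('myapp:thirdparty:*:auth', 'myapp:thirdparty:twitter:auth'); // true
matchesPattern('myapp:thirdparty:*:auth', 'myapp:thirdparty:twitter:post'); // false
```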
It's frequently useful to enable all debug functions in a namespace with the exception of one or two. Let's assume we have separate debug functions for each HTTP status code that our app responds with (a weird use of debug, but why not!):
const OK = debug('HTTP:200');
const MOVED = debug('HTTP:301');
const FOUND = debug('HTTP:302');
const UNAUTHORIZED = debug('HTTP:403');
const NOTFOUND = debug('HTTP:404');
// etc.
We can turn them all on with HTTP:*, but it turns out that 200 comes up way too frequently so we want it turned off. The - prefix operator can be used to explicitly disable a single debugger. Here, we'll enable all debuggers in this namespace then disable just HTTP:200:
$ DEBUG='HTTP:*,-HTTP:200' node app.js
debug() is a factory function: when you call it, it returns another function, which can be called to actually write to the console (more specifically, to STDERR in Node.js):
debug('abc'); // creates function, doesn't write anything
debug('foo')('bar'); // writes `foo: bar` (assuming that debugger is enabled)
If we want to reuse this debugger, we can assign the function to a variable:
var fooLogger = debug('foo');
fooLogger('bar'); // writes `foo: bar`
fooLogger('opening pod bay door...') // writes `foo: opening pod bay door...`
While it's easy to create one-off debug functions as needed as in the first example, it's important to remember that the debug module does not write anything unless that particular debugger is enabled. If your fellow developer does not know you created a debugger with the name foo, she cannot know to turn it on! Furthermore, she may create a debugger with the name foo as well, not knowing you're already using that name. For this reason (read: discoverability), it's useful to group all such debug logging functions in one file, and export them from there:
// lib/debuggers.js
const debug = require('debug');
const init = debug('app:init');
const menu = debug('app:menu');
const db = debug('app:database');
const http = debug('app:http');
module.exports = {
init, menu, db, http
};
NB: using ES2015 object property shorthand above
This way we can discover all available debuggers and reuse debuggers across files. For example, if we access the database in customer.js & we wish to log the query, we can import that debugger & use it there:
// models/customer.js
const debugDB = require('../lib/debuggers').db;
// ...
debugDB(`looking up user by ID: ${userid}`);
db.Customer.findById(userid)
.tap(result => debugDB('customer lookup result', result))
.then(processCustomer)
//.then(...)
NB: using the Bluebird promises library's tap above.
We can later use the same debugger in another file, perhaps with other debuggers as well:
// config.js
const debugDB = require('../lib/debuggers').db;
const debugInit = require('../lib/debuggers').init;
// ...
debugInit('configuring application...');
if(process.env.NODE_ENV !== 'DEV'){
debugInit('env not DEV, loading configs from DB');
debugDB('reading site config from database');
db.Config.find()
.tap(debugDB)
.then(config => {
configureApp(config);
});
}else{
debugInit('local environment: reading config from file');
// ...
}
Then when we're confused about why the app fails on startup on our local machine, we can enable app:init (or app:*) and see the following in our console...
app:init env not DEV, loading configs from DB +1ms
...and quickly discover that a missing environment variable is what's causing our issue.
My goal was to run my newFeatureAdded function whenever a user created a new "feature" on the map. (This example is browser-based, but the approach works just as well with Node.js EventEmitters.)
When I started, I attached my newFeatureAdded function to editable:created:
map.on('editable:created', function(e){
newFeatureAdded(e.layer);
});
But it wasn't firing when I expected, so I added a debug function call to see what was going on:
map.on('editable:created', function(e){
eventDebug('editable:created', e.layer);
newFeatureAdded(e.layer);
});
This revealed that the event was fired when the user clicked "create new feature", not when they placed the feature on the map. I fixed the issue, but I found myself adding debug function calls all over the place, with almost every event handler function:
map.on('editable:drawing:commit', function(e){
eventDebug('FIRED: editable:drawing:commit');
handleDrawingCommit(e);
});
map.on('click', function(e){
eventDebug('FIRED: click');
disableAllEdits();
});
map.on('editable:vertex:clicked', function(e){
eventDebug('FIRED: editable:vertex:clicked');
handleVertexClick(e);
});
This is starting to look redundant, and doubly bad as it's forcing us to wrap our handler calls in extra anonymous functions rather than delegate to them directly, i.e. map.on('click', disableAllEdits). Furthermore, not knowing the event system well, I want to discover other events that fire at times that might be useful to me.
In order to build my UI, I needed to understand the interactions between Leaflet's 35 events and Leaflet.Editable's 18 events, which overlap, trigger one another, and have somewhat ambiguous names (layeradd, dragend, editable:drawing:dragend, editable:drawing:end, editable:drawing:commit, editable:created etc.).
We could pore over the docs and source code to find the exact event we need for each eventuality... or we could attach debug loggers to all events and see what we see!
The approach is as follows:
// 1. Create list of events
const leafletEditableEvents = [
'editable:created',
'editable:enable',
'editable:drawing:start',
'editable:drawing:end',
'editable:vertex:contextmenu',
// ...
];
const leafletEvents = [
'click',
'dblclick',
'mousedown',
'dragend',
'layeradd',
'layerremove',
// ...
];
Because we want to be able to use our event debugging tool on any event emitter, we'll make a function that takes the target object and events array as arguments:
function debugEvents(target, events){
events
// 2. Create debug function for each
// (but keep the function name as well! we'll need it below)
// return both as { name, debugger }
.map(eventName => { return { name: eventName, debugger: debug(eventName) }; })
// 3. Attach that function to the target
.map(event => target.on(event.name, event.debugger));
}
debugEvents(mapObject, leafletEditableEvents);
debugEvents(mapObject, leafletEvents);
Assuming we set localStorage.debug='*' in our browser console, we will now see a debug statement in the console when any of the Leaflet.Editable events fire on the map object!

Note that whatever data is passed to an event handler attached with .on() is passed to our debug functions. In this case it's the event object created by Leaflet, shown above in the console as ▶ Object.
mousemove etc. are not in any namespace above, and it's best to always namespace debug functions so they don't collide, to add context, and to allow enabling/disabling by namespace. Let's improve our debugEvents function to use a namespace:
function debugEvents(target, events, namespace){
events
.map(eventName => { return {
name: eventName,
debugger: debug(`${namespace}:${eventName}`)
} } )
.map(event => target.on(event.name, event.debugger));
}
// editable events are already prefixed with "editable", so they become "event:editable:..."
debugEvents(mapObject, leafletEditableEvents, 'event');
// map events are not prefixed, so we'll add "map", making them "event:map:..."
debugEvents(mapObject, leafletEvents, 'event:map');
We can enable all event debuggers in our console, or just editable events, or just core map events, thus:
> localStorage.debug = 'event:*'
> localStorage.debug = 'event:editable:*'
> localStorage.debug = 'event:map:*'
Conveniently, the Leaflet.Editable events are all already "namespaced" & colon separated, just like our debug namespaces!
> localStorage.debug = 'event:editable:*' //enable all editable
> localStorage.debug = 'event:editable:drawing:*' //just editable:drawing events
Let's enable all event debuggers and see what some interactions look like...

Looks nice, but the mousemove events are coming so fast they push everything else out of the console, i.e. they are noise. Some trial and error taught me that drag events are equally noisy, and that I don't need to know the core map events most of the time, just the editable events.
With this info we can tune our logging down to just what we need, enabling only editable: events & ignoring all drag & mousemove events:
> localStorage.debug = 'event:editable:*,-event:*:drag,-event:*:mousemove'

Looks good!
While debug is a very small & easy-to-get-started-with module, it can be tuned in very granular ways and is a powerful development tool. By attaching debug statements to all events from outside our application code, we can trace the path of an event system & better understand how events interact, without adding a single debug statement to the application code itself. If you've found another novel use of this library or have any questions about my post, let me know. Happy logging!
NB: I use the term "debugger function" and "debug logging" rather than "debugger" and "debugging" in this post advisedly. A "debugger" typically refers to a tool that can be used to pause execution & alter the code at runtime, for example the VSCode debugger. What we're doing here is "logging."
I've been using the debug module recently for a web map project. I needed to understand the somewhat complex interactions between events in Leaflet.js in order to figure out what events to attach to... but that's the next post. Before I get to that, I want to go over the debug module itself.
console.log: the JavaScript programmer's oldest friend*. console.log was probably one of the first things you learned to use to debug JavaScript, and while there are plenty of more powerful tools, console.log is still useful to say "event fired", "sending the following query to the database...", etc.
So we write statements like console.log(`click fired on ${event.target}`). But then we're not working on that part of the application anymore and those log statements just make noise, so we delete them. But then we are working on that bit again later, so we put them back-- and this time when we're finished, we just comment them out instead of deleting them. Before we know it our code looks like this:
fs.readFile(usersJson, 'utf-8', function (err, contents){
// console.log('reading', usersJson);
if(err){ throw err; }
var users = JSON.parse(contents);
// console.log('User ids & names :');
// console.log(users.map(user => [user.id, user.name]));
users.forEach(function(user){
db.accounts.findOne({id: user.id}, function(err, address){
if(err){ throw err; }
var filename = 'address' + address.id + '.json';
// console.log('writing address:', JSON.stringify(address));
// console.log(`writing address file: ${filename}`)
fs.writeFile(filename, JSON.stringify(address), 'utf-8', function(err){
if(err){ throw err; }
// console.log(filename + ' written successfully!');
});
});
});
});
What if, instead of commenting out or deleting our useful log statements when we're not using them, we could turn them on when we need them and off when we don't? This is a pretty simple fix:
function log(...items){ //console.log can take multiple arguments!
if(typeof DEBUG !== 'undefined' && DEBUG === true){
console.log(...items)
}
}
NB: Using ES6 features rest parameters and spread syntax in this function
Now we can replace our console.log() statements with log(), and by setting DEBUG=true or DEBUG=false in our code, we can turn logging on or off as needed! Hooray! Well, actually, there are still a couple problems...
In our current system, DEBUG must be hardcoded, which is bad because we have to edit the code (and remember to change it back) every time we want to turn logging on or off.
We can fix that by setting DEBUG to true or false somewhere outside our script, and reading it in. In node it would make sense to use an environment variable:
const DEBUG = process.env.DEBUG; // read from environment
function log(...items){
// ...
Now we can export DEBUG=true on our dev machine to turn it on all the time. Alternately, we can turn it on by setting an environment variable just for one process when we launch it (shell command below):
$ DEBUG=true node my-cool-script.js
If we want to use our debugger in the browser, we don't have process.env, but we do have localStorage:
var localEnv; //where do we read DEBUG from?
if(typeof process !== 'undefined' && process.env){ //node
localEnv = process.env;
}else if(typeof window !== 'undefined' && window.localStorage){ //browser
localEnv = window.localStorage;
}
const DEBUG = localEnv.DEBUG;
function log(...items){
// ...
Now we can set DEBUG in localStorage using our browser console...
> window.localStorage.DEBUG = true;
...reload the page, and debugging is enabled! Set window.localStorage.DEBUG to false & reload and it's disabled again.
With our current setup, we can only choose "all log statements on" or "all log statements off." This is OK, but if we have a big application with distinct parts and we're having a database problem, it would be nice to turn on just the database-related debug statements, but not others. If we only have one debugger and one debug on/off switch (DEBUG), this isn't possible, so we need multiple debug functions, and a way to enable or disable each one individually.
Let's tackle the second problem first. Instead of a boolean, let's make debug an array of keys, each representing a debugger we want turned on:
DEBUG = ['database']; // just enable database debugger
DEBUG = ['database', 'http'];// enable database & http debuggers
DEBUG = undefined; // don't enable any debuggers
We can't set arrays as environment variables, but we can set it to a string...
$ DEBUG=database,http node my-cool-script.js
...and it's easy to build an array from a string...
// process.env.DEBUG = 'database,http'
DEBUG = localEnv.DEBUG.split(',');
// DEBUG is now ['database', 'http']
Now we have an array of keys for debuggers we want enabled. The simplest way to allow us to enable just http or just database debugging would be to add an argument to the log function, specifying which "key" each debug statement should be associated with:
function log(key, ...items){
if(typeof DEBUG !== 'undefined' && DEBUG.includes(key)){
console.log(...items)
}
}
log('database', 'results received'); // using database key
log('http','route not found', request.url); // using http key
NB: Array.prototype.includes only exists in newer environments.
Now we can enable and disable http and database debug logging separately! Passing a key each time is a bit tedious however, so let's revisit the proposed solution above, "Multiple debug functions." To create a logHttp function, we basically need a pass-through that takes a message and adds the http "key" before sending it to log:
function logHttp(...items){
log('http', ...items);
}
logHttp('foo'); // --> log('http', 'foo');
Using higher-order functions (in this case a function that returns a function), we can make a "factory" to produce debugger functions bound to a certain key:
function makeLogger(fixedKey){
return function(...items){
log(fixedKey, ...items)
}
}
Now we can easily create new "namespaced" log functions and call them separately:
const http = makeLogger('http');
const dbDebug = makeLogger('database');
dbDebug('connection established'); // runs if "database" is enabled
dbDebug('Results received'); // runs if "database" is enabled
http(`Request took ${requestTime}ms`); // runs if "http" is enabled
That gets us just about all the way to the debug module! It has a couple more features than what we created here, but this covers the main bits. I use the debug module in basically all projects & typically start using it from day 1: if you never put console.log statements in your code you have nothing to "clean up," and those debug log statements you make during active development can be useful later on, so why not keep them?
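One of those extra features is the elapsed-time suffix (the "+1ms" you see in debug's output). Here's a rough sketch of how our makeLogger could be extended to do the same; this is a simplified approximation, not debug's actual implementation, and it omits the enabled-check for brevity:

```javascript
// makeLogger with a "+Nms" suffix, like the debug module prints
function makeLogger(fixedKey){
  let last = Date.now();
  return function(...items){
    const now = Date.now();
    // append the time elapsed since the previous call to this logger
    console.log(fixedKey, ...items, `+${now - last}ms`);
    last = now;
  };
}

const dbDebug = makeLogger('database');
dbDebug('connected'); // e.g. "database connected +0ms"
```

This per-logger timing is handy for spotting slow steps: the gap between two consecutive calls to the same logger tells you how long that stretch of work took.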
Next steps: go check out the debug module. In the next post I'll go over some advanced usage. Thanks for reading!
*second oldest ;)
On projects of any size, code hinting reduces typos, makes coding easier, and obviates the need to check a module's documentation every few minutes. Programmers who use strongly typed languages like Java and IDEs like Eclipse take this sort of automated code-assistance for granted. But what about programmers who use JavaScript?
JavaScript is weakly typed, so when you declare var animals;, there's no way to know whether animals will be an array, a string, a function, or something else. If your IDE or editor doesn't know that animals will eventually be an array, there's no way for it to helpfully tell you that animals has the property length and the method map, among others. There's no way for the IDE to know it's an array... unless you tell it!
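To make that ambiguity concrete, here's a small sketch of what the editor is up against:

```javascript
var animals; // at this point the editor has no idea what animals will hold

animals = ['cat', 'dog'];           // now it's an array...
console.log(animals.length);        // ...so .length and .map make sense: 2

animals = 'cat,dog';                // ...and now it's a string
console.log(animals.split(',')[0]); // ...so .split makes sense: "cat"
```

Both assignments are perfectly legal JavaScript, so without extra information the editor can't safely suggest array methods or string methods for animals.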
In this post we'll look at a couple ways to clue your IDE in to the types of the variables, function parameters, and return values in your program, so it can clue you in on how they should be used. We'll go over two ways to "tell" your IDE (and other developers) what types things are, and see how to load type information for third party libraries as well. Before we start writing type annotations, however, let's make sure we have a tool that can read them.
The first thing we'll need is a code editor that recognizes & supports the concept of "types" in JavaScript. You can either use a JavaScript oriented IDE such as Webstorm or VisualStudio Code, or if you already have a text-editor you like, you can search the web to find out if it has a type hinting plugin that supports JavaScript. There's one for Sublime and Atom, among others.
If the goal is getting type hinting in JavaScript (and it is here), I use & recommend Visual Studio Code: it's free, it runs on every major platform, and its JavaScript type hinting works out of the box.
With VS Code installed, let's create a new project and get started!
I've used npm init to start a new JavaScript project. At this point, we already get quite a bit from our IDE, which has JavaScript APIs (Math, String, etc.) and browser APIs (DOM, Console, XMLHttpRequest etc.) built in.
Here's some of what we get out of the box:

Nice! But we're more interested in Node.js annotations and sadly, VS Code does not ship with those. Type declarations for Node.js core APIs do exist, however, in the form of Typescript declaration files. We just need a way to add them to our workspace so VS Code can find them. Enter Typings.
Typings is a "Typescript Definition Manager", which means it helps us install the Typescript Definitions (or "Declarations") we need for our IDE to know what the JavaScript APIs we're working with look like. We'll look more at the format of Typescript Declarations later, for now we'll stay focused on our goal of getting our IDE to recognize Node.js core APIs.
Install typings thus:
$ npm install --global typings
With typings installed on our system, we can add those Node.js core API type definitions to our project. From the project root:
$ typings install dt~node --global --save
Let's break that command down:
- install the node package...
- ...from dt~, the DefinitelyTyped repository, which hosts a huge collection of typescript definitions
- the --global switch, because we want access to definitions for process and modules from throughout our project
- the --save switch causes typings to save this type definition as a project dependency in typings.json, which we can check into our repo so others can install these same types. (typings.json is to typings install what package.json is to npm install.)

Now we have a new typings/ directory containing the newly downloaded definitions, as well as our typings.json file.
We now have these type definitions in our project, and VS Code loads all type definitions in your project automatically. However, it identifies the root of a JavaScript project by the presence of a jsconfig.json file, and we don't have one yet. VS Code can usually guess if your project is JavaScript based, and when it does it will display a little green lightbulb in the status bar, prompting you to create just such a jsconfig.json file. Click that button, save the file, start writing some Node and...

It works! We now get "Intellisense" code hints for all Node.js core APIs. Our project won't just be using Node core APIs though; we'll be pulling in some utility libraries, starting with lodash. typings search lodash reveals that there's a lodash definition from the npm source as well as global and dt. We want the npm version, since we'll be consuming lodash as a module included with require('lodash') and it will not be globally available.
$ typings install --save npm~lodash
[email protected]
└── (No dependencies)
$ npm install --save lodash
[email protected] /Users/sequoia/projects/typehinting-demo
└── [email protected]
Now we can require lodash and get coding:

So far we've seen how to install and consume types for Node and third party libraries, but we're going to want these annotations for our own code as well. We can achieve this by using JSDoc comments, writing our own Typescript Declaration files, or a combination of both.
JSDoc is a tool that allows us to describe the parameters and return types of functions in JavaScript, as well as variables and constants. The main advantages of using JSDoc comments are:
There are many annotations JSDoc supports, but you can get a long way just by learning a few, namely @param and @return. Let's annotate this simple function, which checks whether one string contains another string:
function contains(input, search){
return RegExp(search).test(input);
}
contains('Everybody loves types. It is known.', 'known'); // => true
With a function like this, it's easy to forget the order of arguments or their types. Annotations to the rescue!
/**
* Checks whether one string contains another string
*
* @param {string} input - the string to test against
* @param {string} search - the string to search for
*
* @return {boolean}
*/
function contains(input, search){
return RegExp(search).test(input);
}
While writing this, we realized that this function actually works with regular expressions as the search parameter, as well as strings. Let's update that line to make clear that both types are supported:
/**
* ...
* @param {string|RegExp} search - the string or pattern to search for
* ...
*/
We can even add examples & links to documentation to help the next programmer out:
/**
* Checks whether one string contains another string
*
* @example
* ```
* contains("hello world", "world"); // true
* ```
* @example
* ```
* const exp = /l{2}/;
* contains("hello world", exp); // true
* ```
* @see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp
*
* @param {string} input - the string to test against
* @param {string|RegExp} search - the string or pattern to search for
*
* @return {boolean}
*/
...and away we go!

JSDoc works great and we've only scratched the surface of what it can do, but for more complex tasks or cases where you're documenting a data structure that exists e.g. in a configuration file, typescript declaration files are often the better choice.
A typescript declaration file uses the extension .d.ts and describes the shape of an API, but does not contain the actual API implementation. In this way, declaration files are very similar to the Java or PHP concept of an Interface. If we were writing Typescript, we would declare the types of our function parameters and so on right in our code, but JavaScript's lack of types makes this impossible. The solution: declare the types of a JavaScript library in a Typescript (declaration) file that can be installed alongside the JavaScript library. This is the reason we installed the lodash type definitions separately from lodash.
Setting up external type definitions for an API you plan to publish and registering them on the typings repository is a more involved task than we'll cover today, but you can read up about it here. For now, let's consider the case of a complex configuration file.
Imagine we have an application that creates a map and allows users to add features to that map. We'll be deploying these editable maps to different client sites, so we want to be able to configure, on a per-site basis, the types of features users can add and the coordinates to center the map on.
Our config.json looks like this:
{
"siteName": "Strongloop",
"introText": {
"title": "<h1> Yo </h1>",
"body": "<strong>Welcome to StrongLoop!</strong>"
},
"mapbox": {
"styleUrl": "mapbox://styles/test/ciolxdklf80000atmd1raqh0rs",
"accessToken": "pk.10Ijoic2slkdklKLSDKJ083246ImEiOi9823426In0.pWHSxiy24bkSm1V2z-SAkA"
},
"coords": [73.153,142.621],
"types": [
{
"name": "walk",
"type": "path",
"lineColor": "#F900FC",
"icon": "test-icon-32.png"
},
{
"name": "live",
"type": "point",
"question": "Where do you live?",
"icon": "placeLive.png"
}
...
We don't want to have to go read over this complex JSON file each time we want to find the name of a key or remember the type of a property. Furthermore, it's not possible to document this structure in the file itself, because JSON does not allow comments.* Let's create a Typescript Declaration file called config.d.ts to describe this config object, and put it in a directory in our project called types/.
declare namespace Demo{
export interface MapConfig {
/** Used as key to ID map in db */
siteName: string;
mapbox: {
/** @see https://www.mapbox.com/studio/ to create style */
styleUrl: string;
/** @see https://www.mapbox.com/mapbox.js/api/v2.4.0/api-access-tokens/ */
accessToken: string;
};
/** @see https://www.mapbox.com/mapbox.js/api/v2.4.0/l-latlng/ */
coords: Array<number>;
types : Array<MapConfigFeature>;
}
interface MapConfigFeature {
type : 'path' | 'point' | 'polygon';
/** hex color */
lineColor?: string;
name : string;
/** Name of icon.png file */
icon: string;
}
}
You can read more in the Typescript docs about what all is going on here, but in short, this file:
- declares the Demo namespace, so we don't collide with some other MapConfig interface
- defines the types property of the first interface as an array whose members are MapConfigFeatures
- exports MapConfig so we can reference it from outside the file.

VS Code will load the file automatically because it's in our project, and we'll use the @type annotation to mark our conf object as a MapConfig when we load it from disk:
/** @type {Demo.MapConfig} */
const conf = require('./config.json');
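To show what the editor can now infer, here's a self-contained sketch (with the config values inlined from the example above rather than loaded from disk) of working with the typed object:

```javascript
/** @type {Demo.MapConfig} */
const conf = {
  siteName: 'Strongloop',
  mapbox: { styleUrl: 'mapbox://styles/test/example', accessToken: 'pk.example' },
  coords: [73.153, 142.621],
  types: [
    { name: 'walk', type: 'path', lineColor: '#F900FC', icon: 'test-icon-32.png' },
    { name: 'live', type: 'point', icon: 'placeLive.png' }
  ]
};

// The editor knows conf.types is Array<MapConfigFeature>, so inside the
// callback it can complete feature.type, feature.icon, etc.
const points = conf.types.filter(feature => feature.type === 'point');
console.log(points[0].name); // "live"
```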
Now we can access properties of the configuration object & get the same code-completion, type info, and documentation hints! Note how in the following gif, VS Code identifies not only that conf.types is an array, but, when we call .filter on it, knows that each element in the array is a MapConfigFeature type object:

I have been very much enjoying the benefits of JSDoc, Typescript Declarations and the typings repository in my work. Hopefully this article will help you get started up and running with type hinting in JavaScript. If you have any questions or corrections, or if this post was useful to you, please let me know!
* There is in fact a way to document the properties of json files, I hope to write about it in the future!
"Well trust me, it works."
Live coding! The worst! We’ve all seen the previous scenario play out in a presentation, perhaps you’ve even been a victim. But what can a presenter do? The code demos have multiple steps, there are terminal interactions and the tool relies on an internet connection. How can we fit all that into our slides?? Fear not! I’ll go over some tips to keep your presentation running smoothly without sacrificing a calf to the Demo Gods beforehand. In this post, we’ll look at coding demos: demos where you’re writing code for attendees to see on a projector.
It’s not uncommon, when demoing a programming practice or tool, to do a “build-it-up” type demo, where you start with something simple & add functionality through the presentation. This is a great pattern for demoing frameworks & libraries, but many steps means many opportunities for mistakes. I lean on two main approaches to handling these issues.
The first solution for this one is the most simple: tagged steps in a git repo. By coding beforehand & tagging each step, you have the flexibility to live code if you want (you can always jump to the next step if you get off track), or don’t code at all and just check out step after step.
This approach requires some preparation! While you can retrofit this approach onto an existing repo, it’s much easier if you plan things out ahead of time. Here’s my approach:
When you’re done, you’ll have something like this:

Now you can still code in your demos, but if something goes awry, you have a safety net: git checkout the next step & you’re back to working code and ready to move on! Bonus: attendees can now peruse your examples at their leisure after the talk.
Example: in this talk, each of the titles in the “steps” slides are links to a tag on github.
Depending on the nature of your presentation, you may not have the ability or the time to actually step through the code and look at it in an editor. For fast-paced talks, it can be useful to have everything in your slides, code included.
Putting code in slides can go very wrong, however: if you dump a big block of code on the screen, attendees don’t know where to look and it can be hard to tell what’s important. This, for example, is a mess:

Where am I supposed to look? Which part of this is important? I can’t read all that!!!
My solution to this is two-fold:
By fading code in line by line or in blocks, it simulates “writing” the code, and it presents lines of code at a rate people can actually process. Highlighting simply calls attention to the important bits, so people know where to look. In the example above, which relates to adding progress bars to a file-download feature, the steps I was going over were:
With transitions in place, I was able to go thru each of the steps above one by one, “building up” the example in bite-sized pieces:

NB: This gif cycles thru the steps much more quickly than you would on stage.
This is one of my favorite approaches to live-coding (fake it! :p), but it can be difficult to set up. In the example above, there is a separate text-box for each of the portions to fade in, painstakingly positioned to appear as one big file. Another challenge was getting highlighted code into google slides at all: when you copy code from most editors or IDEs, they don’t bring syntax highlighting along. I found that PHPStorm did allow you to copy code with syntax highlighting, so I opened files there any time I needed to copy with highlighting. Yes, all this was time consuming. 🙂
This approach is also possible in tools like Reveal.js, but not using markdown. In order to achieve line-by-line fade-ins with Reveal.js, you’d need to first convert the code examples to HTML, then manually add the fragment class to elements you wish to fade in. I haven’t tried it but I believe it would work– if you have done this, please let me know!
That’s all for today! In part II of this post, we’ll look at how to demo command-line tools and interactions on stage… without opening a terminal. Stay tuned!
]]>Al’s Appliances is a retail chain that specializes in ACME products. Al’s has a website where customers can order products, but getting replacement parts for products is more complicated.
To wit:
Yikes! Al’s IT team would like to build an interface to support the following workflow:
Their team can build the web UI, but it’s up to us to tie the data together & expose it via an API.
We have the following assets to work with:
Parts lists (CSVs): ACME delivers CSV files, named by product number & containing a list of part names & SKUs. For example:
//mvwave_0332.csv//
door handle,8c218
rotator base,f74af
rotator axel,15b4c
...,...
These CSVs are the “single source of truth” for parts info and they’re sometimes updated or replaced. Business processes for other departments rely on them, so unfortunately we must do the same (we must use the CSVs, moving the data to a database is not an option).
Parts API: We’re in luck: ACME exposes a rudimentary API to access part information, so we don’t have to scrape the website! Unfortunately, it’s very simple and only exposes one endpoint to look up a single part at a time:
//GET api.acme.com/parts/f74af
{
"name": "rotator base",
"sku": "f74af",
"qty_avail": 0,
"price": "2.32"
}
IT has requested an API that exposes the following endpoints:
/v1/products → Array of products
/v1/products/{id} → Object representing a single product
/v1/products/{id}/parts → Array of parts for a product
/v1/parts/{sku} → Object representing a single part
Given this nonstandard, somewhat complicated data architecture, why not build a 100% custom solution instead of using LoopBack, which best shows its strengths with more structured data? Using LoopBack here will require us to go “off the beaten path” a bit, but in return we get…
That last point is important, as it will allow us to eventually replace the directory of CSVs with a database table, once the business is ready for this, without major rewrites. Plugging into the LoopBack ecosystem gives us access to ready solutions for auth, data transformation, logging, push notification, throttling etc. when our requirements grow or change. Broadly speaking we’ll be building a highly extensible, highly maintainable application that can serve as a foundation for future projects, and this is what makes LoopBack a good choice.
Setting Up
To get started we’ll install Strongloop tools
$ npm install -g strongloop
and scaffold a new LoopBack application in a new directory.
$ slc loopback als-api
Now we can switch to the new als-api directory and generate our models. We’ll keep them server-only for now; we can easily change that later.
$ cd als-api
$ slc loopback:model
? Enter the model name: Product
? Select the data-source to attach Product to: db (memory)
? Select model's base class PersistedModel
? Expose Product via the REST API? Yes
? Custom plural form (used to build REST URL): n
? Common model or server only? server
Let’s add some Product properties now.
? Property name: name
invoke loopback:property
? Property type: string
? Required? Yes
...etc...
NB: You can see a detailed example of this process here.
Once we finish this process, we have models for Product, Part, and PartsList, with corresponding js and json files in server/models/. The PartsList is a join model that connects a Product to its Parts. That model requires some custom code, so we’ll save that bit for last and start by wiring the Product and Part models to their datasources.
Our generated server/models/product.json:
{
"name": "Product",
"properties": {
"name": {
"type": "string",
"required": true
},
"description": {
"type": "string",
"required": true
},
"id": {
"type": "string",
"required": true
}
},
. . .
}
The products are in a SQL database (SQLite for our example). There are three steps to connecting the model to its data:
Install the appropriate connector. Loopback has many data connectors but only the “in memory” database is bundled. The list of StrongLoop supported connectors doesn’t include SQLite, but the list of community connectors indicates that we should install “loopback-connector-sqlite”:
$ npm install --save loopback-connector-sqlite
Create a datasource using that connector. To create a sqlite datasource called “products,” we add the following to server/datasources.json:
"products": {
"name": "products",
"connector": "sqlite",
"file_name": "./localdbdata/local_database.sqlite3",
"debug": true
}
In our local setup our sqlite database resides in ./localdbdata/; we can later add another configuration for the production environment.
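If I recall LoopBack's environment-specific configuration correctly, a server/datasources.production.json (picked up when NODE_ENV=production) can override the same datasource for that environment. A sketch, with a hypothetical production path:

```json
{
  "products": {
    "name": "products",
    "connector": "sqlite",
    "file_name": "/var/data/products.sqlite3",
    "debug": false
  }
}
```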
Connect the model to the datasource.
/server/model-config.json manages this:
"Product": {
"dataSource": "products",
"public": true
},
There is an additional step for this particular connector, specifying which field is the primary key. We do this by adding "id": true to a property in /server/models/product.json:
. . .
"properties": {
. . .
"id": {
"type": "string",
"id": true,
"required": true
}
},
. . .
Before we start our server to see if this works, let’s update the server configuration to expose the API on /v1/ rather than the default path (/api/) in server/config.json:
. . .
"restApiRoot": "/v1",
"host": "0.0.0.0",
. . .
The API will now be served from /v1/ per IT’s specifications. Now we can start our server…
$ npm start
and start querying products from http://localhost:3000/
//GET /v1/products
[
{
"name": "Microwelle Deluxe",
"description": "The very best microwave money can buy",
"id": null
},
{
"name": "Microwelle Budget",
"description": "The most OK microwave money can buy",
"id": null
},
. . .
]
Uh-oh! The ids are strings and idInjection makes LoopBack treat them as numbers. Let’s fix that in server/models/product.json:
. . .
"idInjection": false,
. . .
Now let’s try again:
//GET /v1/products
[
{
"name": "Microwelle Deluxe",
"description": "The very best microwave money can buy",
"id": "microwelle_010"
},
{
"name": "Microwelle Budget",
"description": "The most OK microwave money can buy",
"id": "microwelle_022"
},
. . .
]
//GET /v1/products/microwelle_010
{
"name": "Microwelle Deluxe",
"description": "The very best microwave money can buy",
"id": "microwelle_010"
}
That’s better! Our Products are now being served so Endpoints 1 (/v1/products) and 2 (/v1/products/{id}) are working. Now let’s configure our Parts datasource and set up Endpoint 4 (/v1/parts/{sku}).
Part
Our generated server/models/part.json:
{
"name": "Part",
"properties": {
"sku": {
"type": "string",
"required": true
},
"qty_avail": {
"type": "number",
"required": true
},
"price": {
"type": "number",
"required": true
},
"name": {
"type": "string",
"required": true
}
}
. . .
}
We’ll need to follow the same three steps to connect the Part model to its datasource, a remote server this time.
Install connector:
$ npm install --save loopback-connector-rest
Create Datasource: Because there’s no universal standard for what parameters REST endpoints take, how they take them (query, post data, or part of URL), or what sort of data they return, we must configure each method manually for a REST datasource.
//server/datasources.json:
. . .
"partsServer": {
"name": "partsServer",
"connector": "rest",
"operations": [{
"template": {
"method": "GET",
"url": "http://api.acme.com/parts/{sku}",
"headers": {
"accepts": "application/json",
"content-type": "application/json"
}
},
"functions": {
"findById": ["sku"]
}
}]
}
. . .
This will create a method called findById on any model attached to this datasource. That method takes one parameter (sku) that will be plugged into the url template. Everything else here is default.
We named the “operation” findById to conform to LoopBack convention. Because it has this name, LoopBack will know to expose the method on /v1/parts/{id} .
Connect the model to the datasource. /server/model-config.json:
. . .
"Part": {
"dataSource": "partsServer",
"public": true
},
. . .
Let’s restart the server and try it out:
//GET /v1/parts/f74af
{
"name": "rotator base",
"sku": "f74af",
"qty_avail": 0,
"price": "2.11"
}
Endpoint 4 (/v1/parts/{sku}) is now working! It’s just a passthrough to the ACME API right now, but this has advantages: we can set up logging, caching, etc., we don’t have to worry about CORS, and if ACME makes a breaking API change, we can fix it in one place in our server code and clients are none the wiser.
With the easy parts out of the way, it’s time to tackle our CSVs…
Although the part lists CSVs contain product names, we’re relying on the remote server for this, so the CSVs are being used as simple many-to-many join tables. Many-to-many tables don’t generally need their own model, so why are we creating one in this case? There are two reasons:
Instead of a single table of product_id, sku pairs, we have a bunch of files named like {product_id}.csv that contain lists of skus. This will require custom join logic, and,
If we stop using CSVs in the future we can delete this model and update the relationship configurations on Product, and that model can continue working without changes.
We’re going to use a hasManyThrough relationship to tie Products to their Parts, and because we’re not concerned with the part name in the PartsList, our partslist.json does not specify any properties:
{
"name": "PartsList",
"base": "PersistedModel",
"properties": {
},
. . .
}
We’re not exposing PartsLists directly via the API, just using them for Endpoint 3 (/v1/products/{id}/parts), so we’ll just set it up to support this relationship. The first step here is to add the relationship from Product to Part, which we can do using the relationship generator:
$ slc loopback:relation
? Select the model to create the relationship from: Product
? Relation type: has many
? Choose a model to create a relationship with: Part
? Enter the property name for the relation: parts
? Optionally enter a custom foreign key:
? Require a through model? Yes
? Choose a through model: PartsList
Now when we hit /v1/products/thing_123/parts, LoopBack will attempt to figure out what Parts are related to our Product by calling find on the join model, more or less like this:
PartsList.find(
{
where: { productId: 'thing_123' },
include: 'part',
collect: 'part'
},
{},
function callback(err, res){ /*...*/ }
);
How will we make this work? We’ll definitely need to read CSVs from the filesystem, so let’s get that configuration out of the way.
Configuration
Our PartsList CSVs exist in /vol/NAS_2/shared/parts_lists but of course we don’t wish to hardcode this path in our model. Instead, we’ll put it into a local config file where it can easily be overridden in other environments:
//server/config.local.json:
{
"partsListFilePath": "/vol/NAS_2/shared/parts_lists"
}
PartsList.find
We know that when querying related models, LoopBack will call find on the “through” model (aka join model), so we’ll override PartsList.find and make it:
read the relevant CSV (e.g. thing_123.csv)
extract the skus
call Part.findById on each sku
We’ll need to override the method in server/models/partslist.js. To override a data access method like this, we listen for the attached event to fire then overwrite the method on the model. We’ll be using two node modules to help: async to manage “wait for multiple async calls (calls to ACME API) to finish then call our done callback with the results,” and csv-parse to parse our CSVs:
//server/model/partslist.js:
var fs = require('fs');
var async = require('async'); //npm install!
var csvParse = require('csv-parse'); //npm install!
var path = require('path');
module.exports = function(PartsList) {
PartsList.on('attached', function(app){
PartsList.find = function(){
//variable arguments, filter always first callback always last
var filter = arguments[0];
var done = arguments[arguments.length-1];
//0. build the filename
var filename = filter.where.productId + '.csv';
var csvPath = path.join(app.get('partsListFilePath'),
filename);
//1. read the csv
fs.readFile(csvPath, 'utf-8', function getParts(err, res){
if(err) return done(err);
//parse the csv contents
csvParse(res, function(err, partlist){
if(err) return done(err);
//2. get the skus from ['part name', 'sku'] tuples
var skus = partlist.map(function getSku(partTuple){
return partTuple[1];
});
//3. call Part.findById on each sku
async.map(skus, app.models.Part.findById, function (err,
parts){
if(err) return done(err);
//4. pass an array of Parts to the callback
done(null, parts);
});
});
});
};
});
};
This could certainly be broken up into named functions for easier reading, but it works and for our purposes that’s good enough! One issue, however, is that the repeated calls to Part.findById are a “code smell”: we have Part logic (get all Parts by list of skus) in the PartsList model. It would be much better to pass our array of skus to a Part method and let it handle the details. Let’s change step (3) above so it looks like this:
//3. pass our list of SKUs and `done` callback to Part.getAll
app.models.Part.getAll(skus, done);
//4. pass an array of Parts to the callback
// ^-- this happens in Part.getAll
Now we add this new method to Part:
//server/model/part.js:
var async = require('async');
module.exports = function(Part) {
Part.getAll = function(skus, cb) {
async.map(skus, Part.findById, function (err, parts){
if(err) return cb(err);
cb(null, parts);
});
}
};
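The async.map pattern that Part.getAll relies on (run an async lookup for each item, collect the results in order, call the callback exactly once) can be sketched in plain Node with a stubbed lookup; the part data and the asyncMap stand-in below are hypothetical, and no LoopBack is required:

```javascript
// Minimal stand-in for async.map: apply an async iterator to each item,
// preserve input order in the results, and call done exactly once.
function asyncMap(items, iterator, done) {
  var results = new Array(items.length);
  var pending = items.length;
  var failed = false;
  if (pending === 0) return done(null, results);
  items.forEach(function (item, i) {
    iterator(item, function (err, res) {
      if (failed) return;
      if (err) { failed = true; return done(err); }
      results[i] = res;
      if (--pending === 0) done(null, results);
    });
  });
}

// Stub standing in for Part.findById (hypothetical data):
var fakeParts = { '8c218': 'door handle', 'f74af': 'rotator base' };
function findById(sku, cb) {
  setImmediate(function () { cb(null, fakeParts[sku]); });
}

asyncMap(['8c218', 'f74af'], findById, function (err, names) {
  console.log(names); // [ 'door handle', 'rotator base' ]
});
```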
Now our Parts logic is nicely encapsulated in the Part model & the logic in our PartsList model is a bit simpler. Let’s give our last API endpoint a try:
//GET /v1/Products/mvwave_0332/parts
[
{
"name": "door handle",
"sku": "8c218",
"qty_avail": 0,
"price": "1.22"
},
{
"name": "rotator base",
"sku": "f74af",
"qty_avail": 0,
"price": "8.35"
},
{
"name": "rotator axel",
"sku": "15b4c",
"qty_avail": 0,
"price": "2.32"
}
]
It works!
We managed to tie together a motley collection of data sources, represent them with LoopBack models, and expose them on an API built to IT’s specifications. That’s a good stopping point for now. Obvious next steps would be to disable unused methods (this API is read-only, after all), build a client to interact with our API, and to set up auth if needed. By using LoopBack to build our API, we’ve positioned ourselves to be able to complete these tasks easily. We can now answer my initial question with greater confidence: yes, LoopBack can do it!
Want to see all this stuff actually work? Check out the demo app!
]]>A higher order function is a function that does one or both of the following: takes one or more functions as arguments, or returns a function as its result.
The purpose of this post is not to convince you to adopt this new style right away, although I certainly encourage you to give it a try! The purpose is to familiarize you with this style, so that when you run into it in someone’s ES6-based library, you won’t sit scratching your head wondering what you’re looking at as I did the first time I saw it. If you need a refresher in arrow syntax, check out this post first.
Hopefully you’re familiar with arrow functions that return a value:
const square = x => x * x;
square(9) === 81; // true
But what’s going on in the code below?
const has = p => o => o.hasOwnProperty(p);
const sortBy = p => (a, b) => a[p] > b[p];
What’s this “p returns o returns o.hasOwnProperty…”? How can we use has?
To illustrate writing higher order functions with arrows, let’s look at a classic example: add. In ES5 that would look like this:
function add(x){
return function(y){
return y + x;
};
}
var addTwo = add(2);
addTwo(3); // => 5
add(10)(11); // => 21
Our add function takes x and returns a function that takes y which returns y + x. How would we write this with arrow functions? We know that the expression after an arrow is the function’s return value, and that a function is itself a valid expression…
…so all we must do is make the body of our arrow function another arrow function, thus:
const add = x => y => y + x;
// outer function: x => [inner function, uses x]
// inner function: y => y + x;
Now we can create inner functions with a value bound to x:
const add2 = add(2);// returns [inner function] where x = 2
add2(4); // returns 6: exec inner with y = 4, x = 2
add(8)(7); // 15
Our add function isn’t terribly useful, but it should illustrate how an outer
function can take an argument (x) and reference it in a function it returns.
So you’re looking at an ES6 library on github and encounter code that looks like this:
const has = p => o => o.hasOwnProperty(p);
const sortBy = p => (a, b) => a[p] > b[p];
let result;
let users = [
{ name: 'Qian', age: 27, pets : ['Bao'], title : 'Consultant' },
{ name: 'Zeynep', age: 19, pets : ['Civelek', 'Muazzam'] },
{ name: 'Yael', age: 52, title : 'VP of Engineering'}
];
result = users
.filter(has('pets'))
.sort(sortBy('age'));
What’s going on here? We’re calling the Array prototype’s sort and filter methods, each of which takes a single function argument, but instead of writing function expressions and passing them to filter and sort, we’re calling functions that return functions, and passing those to filter and sort.
Let’s take a look, with the expression that returns a function underlined in each case.
result = users
.filter(x => x.hasOwnProperty('pets')) //pass Function to filter
.sort((a, b) => a.age > b.age); //pass Function to sort
result = users
.filter(has('pets')) //pass Function to filter
.sort(sortBy('age')); //pass Function to sort
In each case, filter is passed a function that checks if an object has a property called “pets.”
This is useful for a few reasons:
Imagine we want only users with pets and with titles. We could add another function in:
result = users
.filter(x => x.hasOwnProperty('pets'))
.filter(x => x.hasOwnProperty('title'))
...
The repetition here is just clutter: it doesn’t add clarity, it’s just more to read and write. Compare with the same code using our has function:
result = users
.filter(has('pets'))
.filter(has('title'))
...
This is shorter and easier to write, and that makes for fewer typos. I consider this code to have greater clarity as well, as it’s easy to understand its purpose at a glance.
As for reuse, if you have to filter down to users with pets or users with job titles in many places, you can create functions to do this and reuse them as needed:
const hasPets = has('pets');
const isEmployed = has('title');
const byAge = sortBy('age');
let workers = users.filter(isEmployed);
let petOwningWorkers = workers.filter(hasPets);
let workersByAge = workers.sort(byAge);
We can use some of our functions for single values as well, not just for filtering arrays:
let user = {name: 'Assata', age: 68, title: 'VP of Operations'};
if(isEmployed(user)){ // true
//do employee action
}
hasPets(user); // false
has('age')(user); //true
Let’s make a function that will produce a filter function that checks that an
object has a key with a certain value. Our has function checked for a key, but to check value as well our filter function will need to know two things (key and value), not just one. Let’s take a look at one approach:
//[p]roperty, [v]alue, [o]bject:
const is = p => v => o => o.hasOwnProperty(p) && o[p] == v;
// broken down:
// outer: p => [inner1 function, uses p]
// inner1: v => [inner2 function, uses p and v]
// inner2: o => o.hasOwnProperty(p) && o[p] == v;
So our new function called “is” does three things: it takes a property name p, returns a function that takes a value v, which in turn returns a function that takes an object o and checks whether o has property p with value v.
Here is an example of using this is to filter our users:
const titleIs = is('title');
// titleIs == v => o => o.hasOwnProperty('title') && o['title'] == v;
const isContractor = titleIs('Contractor');
// isContractor == o => o.hasOwnProperty('title') && o['title'] == 'Contractor';
let contractors = users.filter(isContractor);
let developers = users.filter(titleIs('Developer'));
let user = {name: 'Viola', age: 50, title: 'Actress', pets: ['Zak']};
isEmployed(user); // true
isContractor(user); // false
Scan this function, and note the time it takes you to figure out what’s going on:
const i = x => y => z => h(x)(y) && y[x] == z;
Now take a look at this same function, written slightly differently:
const is = prop => val => obj => has(prop)(obj) && obj[prop] == val;
There is a tendency when writing one line functions to be as terse as possible, at the expense of readability. Fight this urge! Short, meaningless names make for cute-looking, hard to understand functions. Do yourself and your fellow coders a favor and spend the extra few characters for meaningful variable and function names.
What if you want to sort by age in descending order rather than ascending? Or find out who’s not an employee? Do we have to write new utility functions sortByDesc and notHas? No we do not! We can wrap our functions, which return Booleans, with a function that inverts that boolean, true to false and vice versa:
//take args, pass them thru to function x, invert the result of x
const invert = x => (...args) => !x(...args);
const noPets = invert(hasPets);
let petlessUsersOldestFirst = users
.filter(noPets)
.sort(invert(sortBy('age')));
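Putting the pieces together, here is a runnable sketch of the helpers from this post applied to the users array from earlier:

```javascript
// Helpers from the post:
const has = p => o => o.hasOwnProperty(p);
const invert = x => (...args) => !x(...args);

const users = [
  { name: 'Qian', age: 27, pets: ['Bao'], title: 'Consultant' },
  { name: 'Zeynep', age: 19, pets: ['Civelek', 'Muazzam'] },
  { name: 'Yael', age: 52, title: 'VP of Engineering' }
];

// users with pets AND titles:
const petOwningWorkers = users.filter(has('pets')).filter(has('title'));
console.log(petOwningWorkers.map(u => u.name)); // [ 'Qian' ]

// users with no pets:
const petless = users.filter(invert(has('pets')));
console.log(petless.map(u => u.name)); // [ 'Yael' ]
```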
Functional programming has been gaining momentum throughout the programming world and ES6 arrow functions make it easier to use this style in JavaScript. If you haven’t encountered FP style code in JavaScript yet, it’s likely you will in the coming months. This means that even if you don’t love the style, it’s important to understand the basics of this style, some of which we’ve gone over here. Hopefully the concepts outlined in this post have helped prepare you for when you see this code in the wild, and maybe even inspired you to give this style a try!
I think you did a very good (difficult to do a great) job, particularly with your examples. Your post motivates me to dive deeper into FP.
- Brent Enright,
Thanks Brent!! Feedback like this makes my day!
Thanks. This gave me a bit to think about and mull over. It's not something that I have spent a lot of time on, but it's pretty important as I try to better grok the functional programming thing.
- Brian,
Nice!! FP is fun, I hope you keep it up!
- I know this is an older article...
- Maybe you have seen this rule/advice? https://eslint.org/docs/rules/no-prototype-builtins
- I use this little helper (and similar ones for other Object.prototype methods.)
type IPropertyKey = string | number | symbol;
const h = Object.prototype.hasOwnProperty;
export const hasOwnProperty = (obj, property: IPropertyKey): boolean => h.call(obj, property);
Another reason to use Object.prototype.hasOwnProperty.call(): https://github.com/jquery/jquery/issues/4665
More and more people are using
const a = Object.create(null); a.something = 'abc';
There is no prototype... so hasOwnProperty is undefined there. But you can use Object.prototype.hasOwnProperty.call(a, 'something');
- Darcy,
That is a good point and one I had not considered. Your comment should serve as ample warning to those considering copy/pasting from this post pell-mell, methinks. Thank you for pointing this out! Here's the JavaScript version for readers not versed in Typescript:
const hasOwnProperty = (obj, propname) =>
Object.prototype.hasOwnProperty.call(obj, propname);
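The failure mode Darcy describes is easy to demonstrate in a few lines:

```javascript
// An object created with a null prototype has no inherited methods at all,
// not even hasOwnProperty:
const bare = Object.create(null);
bare.something = 'abc';

// bare.hasOwnProperty('something') would throw:
// TypeError: bare.hasOwnProperty is not a function

// The prototype-safe form works on any object:
console.log(Object.prototype.hasOwnProperty.call(bare, 'something')); // true
console.log(Object.prototype.hasOwnProperty.call(bare, 'missing'));  // false
```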
Darcy: why are "more and more people" using a = Object.create(null)? Is there some perceived performance benefit? Strikes me as a bit... well it would certainly be better if it weren't necessary.
To answer your question about why more and more people are using Object.create(null):
Object.create() is useful for some types of prototype composition. (ES6 class provides nicer syntax for many types of composition... but there is still use for Object.create() for special cases, mixins being one example.) One use case for Object.create(null) is for when you don't want OOTB methods from Object.prototype on your object. But I wouldn't do so without thinking about the consequences first. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/create has some discussion on it.
A popular use case is when the object is being used as a hash map where keys are strings (and ES6 Map is not available). This is what JQuery was doing (it was used as a cache). By having no prototype, the cache is safe for keys like 'constructor' and other values on the prototype chain. In some use cases the object may store user-generated keys and someone could create a key like 'hasOwnProperty' that blocks the 'hasOwnProperty' method on the prototype chain. In this case, if cache.hasOwnProperty() is called, it would throw an error. Of course it really depends on the use cases of the hash map. If you know there won't be any key collisions, it does not matter.
Performance is arguably better too because with a null prototype, there is one less prototype in the chain to resolve a key's value. And it could help prevent the need for even using hasOwnProperty in some cases. But I don't think performance is really the motivating reason for Object.create(null). JS engines are pretty fast (relative to other work) at resolving a value by looking up a key in the prototype chain.
Thank you for the follow-up, Darcy! As I suspected, Object.create(null) is being used for "something weird" (to wit: trying to create an object that behaves like Map where Map is not available). Regarding performance, I will quibble slightly here and insist that unless one has actually tested and measured the performance impact, this is not worth doing "for performance" (I know you're not strictly suggesting it but still).
.hasOwnProperty is not a function). Thank you for the suggestion and the follow-up!!
]]>github.io, I used Grunt. Besides building and distributing bookmarklets, I am sure there are other reasons to build to github pages (or another branch on your repo), so I'm sharing my workflow here.
The following example assumes you are familiar with bookmarklets, Github Pages, and that you've used Grunt. We'll use a modified version of my bookmarklet build as an example.
So, we have our bookmarklet written, linted, minified, and committed to the master branch, and we're ready to publish it to the web. From a high level, this means checking out the gh-pages branch, rebasing onto master to get the latest javascript, interpolating the javascript file into the template, committing the new index.html file, and checking out master again to wrap up. Switching branches and rebasing are peculiar tasks for a build, but it can be done (even if it shouldn't be) and the following Gruntfile snippet explains how.
We need a minified javascript file:
(function(){var a="@",b="_",c="sequoia";alert(a+b+c);})();
And a template file to serve the bookmarklet:
<!-- filename: index.html.tpl -->
<html><body>
<a href='javascript:void(<%= marklet %>);'>Bookmarklet!</a>
</body></html>
abridged; see full version here
//we'll need `fs` to read the bookmarklet file
fs = require('fs');
module.exports = function(grunt) {
// Project configuration.
grunt.initConfig({
/* ... */
gitcheckout: {
//note that (non-"string") object keys cannot contain hyphens in javascript
ghPages : { options : { branch : 'gh-pages' } },
master : { options : { branch : 'master' } }
},
gitcommit: {
bookmarkletUpdate : {
//add <config:pkg.version> or something else here
//for a more meaningful commit message
options : { message : 'updating marklet' },
files : { src: ['index.html'] }
}
},
gitrebase: {
master : { options : { branch : 'master' } }
},
template : {
'bookmarkletPage' : {
options : {
data : function(){
return {
//the only "data" are the contents of the javascript file
marklet : fs.readFileSync('dist/afonigizer.min.js','ascii').trim()
};
}
},
files : {
'index.html' : ['index.html.tpl']
}
}
}
});
/* ... */
grunt.loadNpmTasks('grunt-git');
grunt.loadNpmTasks('grunt-template');
//git rebase will not work if there are uncommitted changes,
//so we check for this before getting started
grunt.registerTask('assertNoUncommittedChanges', function(){
var done = this.async();
grunt.util.spawn({
cmd: "git",
args: ["diff", "--quiet"]
}, function (err, result, code) {
if(code === 1){
grunt.fail.fatal('There are uncommitted changes. Commit or stash before continuing\n');
}
if(code <= 1){ err = null; } //codes 0 & 1 are expected, not errors
done(!err);
});
});
//this task is a wrapper around the gitcommit task which
//checks for updates before attempting to commit.
//Without this check, an attempt to commit with no changes will fail
//and exit the whole task. I didn't feel this state (no changes) should
//break the build process, so this wrapper task just warns & continues.
grunt.registerTask('commitIfChanged', function(){
var done = this.async();
grunt.util.spawn({
cmd: "git",
args: ["diff", "--quiet", //just exits with 1 or 0 (change, no change)
'--', grunt.config.data.gitcommit.bookmarkletUpdate.files.src]
}, function (err, result, code) {
//only attempt to commit if git diff picks something up
if(code === 1){
grunt.log.ok('committing new index.html...');
grunt.task.run('gitcommit:bookmarkletUpdate');
}else{
grunt.log.warn('no changes to index.html detected...');
}
if(code <= 1){ err = null; } //code 0,1 => no error
done(!err);
});
});
grunt.registerTask('bookmarklet', 'build the bookmarklet on the gh-pages branch',
[ 'assertNoUncommittedChanges', //exit if working directory's not clean
'gitcheckout:ghPages', //checkout gh-pages branch
'gitrebase:master', //rebase for new changes
'template:bookmarkletPage', //(whatever your desired gh-pages update is)
'commitIfChanged', //commit if changed, otherwise warn & continue
'gitcheckout:master' //finish on the master branch
]
);
/* ... */
};
That's it! 😊
Grunt tasks used here were grunt-template and grunt-git (the latter of which I contributed the rebase task to, for the purpose of this build).
Why use rebase?: We're using rebase here instead of merge because it keeps all the gh-pages changes at the tip of the gh-pages branch, which makes the changes on that branch linear and easy to read. The drawback is that it requires --force every time you push your gh-pages branch, but it allows you to easily roll back your gh-pages stuff (roll back to the last version of your index.html.tpl e.g.) and this branch is never shared or merged back into master, so it seems a worthwhile trade.
Is it really a good idea to be switching branches, rebasing, etc. as part of an automated build? Probably not. :) But it's very useful in this case!
Please let me know if you found this post useful or if you have questions or feedback.
]]>I installed toilet on my (Ubuntu) system using the following command. Use your package manager or get the source.
sudo apt-get install toilet toilet-fonts
Toilet comes with a number of fonts by default, installed to /usr/share/figlet on my system. The following command, run from the directory containing the fonts, will create a file that contains the name of each available font followed by an example. Note that while the name of the font file with the extension is used in the following command, the extension is not necessary.
for font in *; do
echo "$font" && toilet Hello -f "$font";
done > ~/toilet_fonts.txt
Now you have a file with all the fonts, useful for reference.
$ head ~/toilet_fonts.txt
ascii12.tlf
mm mm mmmm mmmm
## ## ""## ""##
## ## m####m ## ## m####m
######## ##mmmm## ## ## ##" "##
## ## ##"""""" ## ## ## ##
## ## "##mmmm# ##mmm ##mmm "##mm##"
"" "" """"" """" """" """"
Toilet also comes with options to further transform or decorate your text, called "filters." The following command will output the name of each filter followed by an example, as above. This command outputs to the terminal rather than a file because the filters that add color may not come thru in the saved file.
while read -r filt;
do echo "$filt";
toilet -f mono12 $USER -F "$filt";
done < <(toilet -F list | sed -n 's/\"\(.*\)\".*/\1/p')
I like border, flip, and left, but the best filter is of course "gay".
Mix and match fonts and filters to come up with a combination you like. Note that the filter switch can take a colon separated list of filters e.g.
toilet "07734" -F gay:180 -f smblock
Excepting the "metal" and "gay" filters, Toilet does not add colors to your text. This is as it should be as there are already utilities to add color to text in the terminal. I know what you're thinking: "but those terminal escape sequences are a nightmare!" and I was thinking the same thing 'til the fine folks of #bash set me straight. To wit, there is a tool called tput which handles color more gracefully than escape sequences. I encourage you to check out the examples of using tput to color terminal text. If you just want to get started, use tput setaf x and tput setab x to color your foreground and background, respectively, substituting x with a number 0-9 for different colors. See man tput and man terminfo ("Color Handling" section) for more.

So as for what you can actually do with toilet... that will be an exercise left to the reader. A friendly greeting in your bashrc or a big red warning message are two uses that spring to mind. Drop me a line and let me know how you use it. Have fun!
//Horizontal swipe
{1, 1, 1, 1, 1, 1, 1, 1, 1} ,
{3, 3, 3, 3, 3, 3, 3, 3, 3},
{7, 7, 7, 7, 7, 7, 7, 7, 7},
{15, 15, 15, 15, 15, 15, 15, 15, 15},
{31, 31, 31, 31, 31, 31, 31, 31, 31},
{63, 63, 63, 63, 63, 63, 63, 63, 63},
{127, 127, 127, 127, 127, 127, 127, 127, 127},
{255, 255, 255, 255, 255, 255, 255, 255, 255},
{511, 511, 511, 511, 511, 511, 511, 511, 511},
{1023, 1023, 1023, 1023, 1023, 1023, 1023, 1023, 1023},
{2047, 2047, 2047, 2047, 2047, 2047, 2047, 2047, 2047},
{4095, 4095, 4095, 4095, 4095, 4095, 4095, 4095, 4095},
{8191, 8191, 8191, 8191, 8191, 8191, 8191, 8191, 8191},
{16383, 16383, 16383, 16383, 16383, 16383, 16383, 16383, 16383},
{16382, 16382, 16382, 16382, 16382, 16382, 16382, 16382, 16382},
{16380, 16380, 16380, 16380, 16380, 16380, 16380, 16380, 16380},
{16376, 16376, 16376, 16376, 16376, 16376, 16376, 16376, 16376},
{16368, 16368, 16368, 16368, 16368, 16368, 16368, 16368, 16368},
{16352, 16352, 16352, 16352, 16352, 16352, 16352, 16352, 16352},
{16320, 16320, 16320, 16320, 16320, 16320, 16320, 16320, 16320},
{16256, 16256, 16256, 16256, 16256, 16256, 16256, 16256, 16256},
{16128, 16128, 16128, 16128, 16128, 16128, 16128, 16128, 16128},
{15872, 15872, 15872, 15872, 15872, 15872, 15872, 15872, 15872},
{15360, 15360, 15360, 15360, 15360, 15360, 15360, 15360, 15360},
{14336, 14336, 14336, 14336, 14336, 14336, 14336, 14336, 14336},
{12288, 12288, 12288, 12288, 12288, 12288, 12288, 12288, 12288},
{8192, 8192, 8192, 8192, 8192, 8192, 8192, 8192, 8192},
{0, 0, 0, 0, 0, 0, 0, 0, 0},
{18000}
What was all this?! The numbers go up and down, and somehow this turns the lights on and off. I had to learn more. I looked into it a bit, and it turns out it's actually not that complicated: there are 9 rows of 14 lights, so each number represents one row, and each array of 9 numbers represents one shield state. Each row of lights on the shield is just a binary number with the least significant bit on the left! So to turn the first light on, flip the first bit (1); to turn the last light on, flip the 14th bit (8192); etc. (if the dec->bin conversion isn't clear, read here). Sequencing out these animations manually, row by row, light by light, was obviously impractical, so I set out building a tool to make it easier. I also wanted an excuse to use bitwise operators, which I never have occasion to use at work. :p
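You can verify the row values from the swipe animation above with shell arithmetic (a sketch; any shell with $(( )) expansion will do):

```shell
# Each row is a 14-bit number; the least significant bit is the leftmost light.
echo $(( 1 << 0 ))          # first light only
echo $(( 1 << 13 ))         # 14th (last) light only: 8192
echo $(( (1 << 14) - 1 ))   # all 14 lights on: 16383
# The swipe then "turns off" from the left by clearing the low bits:
echo $(( 16383 & ~1 ))      # everything but the first light: 16382
```

This is exactly the pattern in the data: the rows climb 1, 3, 7, ... 16383 as lights sweep on, then drop 16382, 16380, ... 0 as they sweep off.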
Please take a break now to look at the LoL Shield Sequencer, if you haven't already.
Well, I didn't get the diamond animations ready in time for HOPE, but I did get the sequencer working. I switched tack and decided to make the browser tool drive the physical shield directly, rather than requiring one to cut & paste into the sketch and load it onto the Arduino. This required transmitting the shield states to the Arduino via the USB port. I was following this tutorial, which showed how to read a file with Processing and transmit the data to the Arduino with the serial library. So I needed to write to the file, which obviously the browser can't do. My steps are now
That's a lot of steps! On a lark, I tried writing some text directly to the USB device (echo "1" > /dev/ttyUSB0) and, what's this? It worked! It turns out the Linux kernel writes to the USB port at 9600 baud by default (I'm not sure exactly where this default is set, possibly by U-Boot; see man termios for info on changing it). So I could cut step 3 and write directly from PHP to the USB port. The initial PHP script, in its entirety:
<?php
file_put_contents('/dev/ttyUSB0', $_POST['frame']);
?>
Much simpler! I let people mess around with it and they did! I was very happy.
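If the 9600-baud kernel default mentioned above isn't what your sketch expects, stty can reconfigure the port before you write to it (a sketch, assuming the Arduino shows up as /dev/ttyUSB0 as in the examples above):

```shell
# Set the serial port to 9600 baud in raw mode (no line editing or echo),
# so frame data passes through to the Arduino unmodified.
stty -F /dev/ttyUSB0 9600 raw -echo

# Then write a frame exactly as before:
echo "16383" > /dev/ttyUSB0
```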
What heartwarming hacker-con moments! Later I added a localStorage component so people could make little animations & save them, then see what others had done. I'll have that up and running next con for sure.