# Adam C. Conrad
_The website and home of engineering leader Adam C. Conrad — https://www.adamconrad.dev_

---

# Behavioral interviews
_Tue, 29 Mar 2022 — https://www.adamconrad.dev/blog/behavioral-interviews/_

> We had to fix a showstopper bug on our account settings billing page. We had to triage the bug and coordinate with the developers to find the root cause of the problem and fix it. Our on-call developer quickly resolved the bug and restored access to this crucial billing page. We estimate it saved the company $100k in lost revenue by implementing these reactive practices.

Why is this a poor answer? It follows the STAR method, it's short, and it recounts an actual success you experienced in a previous role. **The problem is it says nothing about what _you specifically did_ in this situation.** Were you the team lead? The engineering manager? What part of the triage process did you participate in? Did you perform the triage? Did you implement the triage process documentation for the team? Did you hold a post mortem or root cause analysis to learn from this mistake? Did you run the report to quantify the impact?

These questions will swirl in the interviewer's mind. They will have to spend several minutes picking apart your answer to discover your contribution. This will take away from the other questions they wanted to ask without capturing the signal they were looking for.

**Behavioral interviews are where it is okay to talk about yourself and how you succeeded in the past.** Do not be afraid to use the words "I," "me," or "mine." In the end, only _you_ are changing jobs, not your whole team.
---

# Default to gratitude
_Wed, 23 Mar 2022 — https://www.adamconrad.dev/blog/default-to-gratitude/_

---

# Changing jobs during the Great Resignation
_Fri, 18 Mar 2022 — https://www.adamconrad.dev/blog/changing-jobs-during-the-great-resignation/_

---

# Consistency
_Wed, 18 Aug 2021 — https://www.adamconrad.dev/blog/consistency/_

This article is part of the [Systems Design Series](/tag/systems-design/) following the book [_Designing Data-Intensive Applications_](https://dataintensive.net/) by Martin Kleppmann. If you missed the [previous article](/blog/fault-tolerance/), check that out first.

This chapter, and the last one we will cover in this book series, covers consistency and consensus. It focuses on the CAP theorem. We will explore all levels of consistency guarantees.

## Consistency guarantee levels

There are three basic grades of consistency guarantees, plus two stricter ordering-based guarantees:

* **Weak:** Reads may or may not see writes. Often seen in real-time (VOIP, chat) or in-memory solutions like Memcached.
* **Eventual:** Reads will _eventually_ converge to see the latest write. Data replication is asynchronous. BASE NoSQL databases use this model most often. **Highly available systems** like email use eventual consistency. When convergence occurs is arbitrary and one of the more tenuous aspects of this consistency guarantee.
* **Strong:** Reads always see writes instantly. Data replication is synchronous. ACID relational databases use this model.
* **Causality:** Something happens after another. Questions precede answers. Causal ordering is often a comparison of a limited set of operations rather than the total ordering of all operations in a system. **This is the practical limit of strict consistency in distributed systems**.
Timestamps in a replication log are an easy way to achieve a causally consistent dataset.

* **Linearizability:** The strictest definition of consistency as it relates to the CAP theorem. Applications do not worry about replicas and maintain the illusion of a single copy of the data. It's a recency guarantee that states there is an atomic operation that happens in the database where the data can never revert to the old value after a particular point in time. It is said to evolve data _linearly_. Apache ZooKeeper uses linearizability for leader election to ensure you cannot promote two nodes to be the same leader in [single leader replication](/blog/replication/) schemes.

One final note on linearizability: **single leader replication is your only practical strategy for achieving a highly available system with strong guarantees**. Other replication strategies, such as multi-leader and leaderless replication, utilize asynchronous replication. Because strong consistency requires synchronous replication, we can only rely on single leader replication, where the leader handles both reads and writes and the replicas simply act as fault-tolerant backups.

This limits our use cases for replicated, linearizable systems. Since multi-leader and leaderless replication are out, you cannot distribute your data across multiple data centers, because each data center would need its own leader. In theory, you could have multiple data centers that all access a central data center where your single leader lives, but if that central data center fails you risk halting all operations. This could crush performance and grind requests to a halt.

The book covers quite a bit more on ordering and linearizability than I would have liked, because Kleppmann admits that linearizable consistency is not practical in real-world distributed systems.
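The causal-ordering idea — questions precede answers, tracked via timestamps in a log — can be sketched with a Lamport-style logical clock. This is a minimal illustration of my own, not code from the book; the two-node question/answer scenario is made up:

```python
class LamportClock:
    """Logical clock: events get monotonically increasing timestamps
    that respect causality (a reply is always stamped after its request)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: advance the clock.
        self.time += 1
        return self.time

    def receive(self, sender_time):
        # On message receipt, jump past the sender's timestamp
        # so "happens-after" holds even if our local clock lagged.
        self.time = max(self.time, sender_time) + 1
        return self.time

# A question on node A must be ordered before the answer on node B.
a, b = LamportClock(), LamportClock()
question = a.tick()           # A asks
answer = b.receive(question)  # B answers after seeing the question
assert question < answer      # causal order preserved
```

Note this only orders causally related events; two concurrent writes on different nodes can still receive arbitrary relative timestamps, which is why causal ordering is weaker than a total order.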
And yet, there is quite a bit he covers on things like [total order broadcast](https://en.wikipedia.org/wiki/Atomic_broadcast) for such systems that we will likely never need to use.

## On consensus

To have a consistent view of the data across distributed systems, all of those nodes have to agree on the state of the data. This is known as _consensus_. One classical way of achieving consensus is through something called a _two-phase commit_. It's basically like marriage vows: a coordinator (officiant) asks each participant if they are willing to commit to the transaction (marriage). That is phase one, the _preparatory phase_. If every participant says yes ("I do"), then the _commit phase_ executes and the transaction has to complete.

These kinds of algorithms are often implemented in relational databases for distributed systems. You don't see this covered much in NoSQL solutions. And while this system increases atomicity and durability, it poses a great risk to performance. **If the coordinator goes down and the log record of the 2PC is lost, you could potentially lock the database forever**. As we've seen before, we never want a single point of failure. The two-phase commit algorithm rests heavily on the health of the coordinator application to ensure a smooth transaction across a distributed system.

How might we mitigate this? Better consensus algorithms have emerged over the years. The two most commonly-referenced algorithms, both in this book and externally, are [Paxos and Raft](https://blockonomi.com/paxos-raft-consensus-protocols/). They are total order broadcast algorithms, so they deliver messages exactly once, in order, to all nodes. In other words, this is a **repeated barrage of consensus decisions for each node to decide on what the current state of the data is**. A common use case for these algorithms is [leader election](/blog/replication/).
You need to gather a majority vote amongst nodes as to who is the new leader when the old leader dies or goes offline. While this helps ensure that two nodes can't claim to be the leader at once, these protocols do slow down the application significantly. Consensus protocols require synchronous replication of votes, which is the slowest way to provide consistency of decisions. Further, they require a majority voting structure. This means you can't have an even number of nodes, since you could have a tie. It also means you can't dynamically add or remove nodes in your cluster, because doing so flips the node count between odd and even. In the end, you have a very rigid and brittle system for consensus that does not support the dynamic nature of modern distributed systems.

### ZooKeeper to the rescue

Can you achieve a fault-tolerant total order broadcast with high performance? [Apache ZooKeeper](https://zookeeper.apache.org/) aims to do just that. Google Chubby, etcd, and [even Redis](https://en.wikipedia.org/wiki/Distributed_lock_manager#Other_implementations) can act as distributed lock managers. All of these systems act as in-memory databases that hold a small amount of information about all nodes. Think of them as an index card of emergency numbers distributed to every node.

These coordinator services ensure that a tiny amount of information is embedded on every node so that if a decision needs to be made quickly, consensus can be reached without a lot of heavy back-and-forth over the wire. In fact, tools like ZooKeeper will purposefully make small decisions across a tiny number of nodes to ensure the voting is fast. If a leader dies, for example, it will specifically only target three to five other nodes for a vote, rather than rely on quorum from the entire cluster.

The other benefit of a tool like ZooKeeper is that it provides a method for _service discovery_ in partitioned systems as well.
It acts sort of like a load balancer in that it can route requests to the correct node or cluster to reach a particular service. In doing so, it can also act as a _heartbeat service_ to check for the membership status of the nodes in its jurisdiction.

If there's anything I've learned over the last several weeks, it's that **ZooKeeper is the Swiss Army knife of distributed system coordination**. If you're looking for a tool that handles consensus, failure detection, membership, and fault tolerance, this should be the first one for you to grab in your toolchain.

ZooKeeper is not a silver bullet, though. **Multi-leader and leaderless replication don't require global consensus, and thus don't need ZooKeeper.** As we discussed in a previous chapter, these replication schemes are the most common in distributed systems, and they are also the most common among NoSQL databases. Thus, you won't see ZooKeeper being used much outside of distributed RDBMSs.

## Closing thoughts

This wraps up [my series](/tag/systems-design/) on distributed systems following this book. There are quite a few more chapters that are not covered here.
I'd encourage you to check out more resources if you're interested:

* [Grokking the system design interview](https://www.educative.io/courses/grokking-the-system-design-interview) is a way to practice the things learned in this series
* [Grokking the advanced system design interview](https://www.educative.io/courses/grokking-adv-system-design-intvw) offers even more practice and more closely resembles the first principles tools used throughout this series
* [This system design primer](https://github.com/donnemartin/system-design-primer) includes flashcards to provide spaced repetition learning of the concepts we've discussed
* [This channel](https://www.youtube.com/channel/UC9vLsnF6QPYuH51njmIooCQ) and [this series](https://www.youtube.com/watch?v=xpDnVSmNFX0&list=PLMCXHnjXnTnvo6alSjVkgxV-VH6EPyvoX) offer an excellent selection of videos on distributed systems design
* [High scalability](https://highscalability.com) is the gold standard for cutting-edge news and research in practical distributed systems. Consider subscribing!

---

# Fault tolerance
_Wed, 04 Aug 2021 — https://www.adamconrad.dev/blog/fault-tolerance/_

This article is part of the [Systems Design Series](/tag/systems-design/) following the book [_Designing Data-Intensive Applications_](https://dataintensive.net/) by Martin Kleppmann. If you missed the [previous article](/blog/partitioning/), check that out first.

The self-proclaimed pessimistic chapter of the book, chapter 8 covers fault tolerance in distributed systems. Let us briefly introduce why things break in an internet-scale system:

* **Commodity machines break.** Since internet applications are about [scaling out instead of up](/blog/replication/), each node in the cluster is cheap. Cheap things break more frequently.
* **The Internet Protocol is unreliable.** There is no guarantee that things will move as intended.
Everything is sent over the wire in _packets_, whereas traditional telephony networks send everything over _circuits_. This lack of a reliable, consistent connection presents problems. The request could be _lost_ or _queued_. The node accepting the request could _stop responding_ or _fail_ altogether. Worse, your request could have made it all the way to the node and back but was _lost on the way back to you_ or will arrive _later than you want it to_.

* **We add in TCP to make it more reliable.** The downside is there are variable network delays, since that queuing and reliability slows the system down.
* **You can trade TCP for UDP if you need more speed.** If you cannot afford delays in data and can afford less reliability, consider using UDP. A Zoom call is a good example of when UDP is a better protocol than TCP.

This is in contrast to vertically-scaled systems that rely on beefy supercomputers. These systems often act as if they were single-computer applications, so we will not be covering them in the book or in the notes.

The crucial thing to internalize is that **faults happen in the real world all of the time**. The best anecdote from the book is how a shark bit some network cables and took down parts of Amazon EC2. Even a company as powerful as Amazon cannot prevent the almighty shark from screwing up network connectivity.

## How to deal with faults

If we know faults will happen, then it is vital to be proactive in dealing with them.

**Error handling** is a simple and effective way of staying on top of faults. Keeping your customers aware of issues may mean the problem only requires a browser refresh to correct.

**Heartbeats** help detect faulty or down nodes. A _load balancer_ (like HAProxy or Nginx) will periodically ping nodes to ensure they are up before sending them traffic.

**Leader election** is a strategy to elect new write replicas when leaders are down in replicated clusters.
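The heartbeat idea can be sketched as a toy failure detector. This is my own illustration with a fixed timeout — unlike, say, Cassandra's adaptive phi-accrual detector — and the node name and threshold are made up:

```python
import time

class HeartbeatMonitor:
    """Suspect a node is down if no heartbeat arrives within `timeout` seconds."""

    def __init__(self, timeout=5.0):
        self.timeout = timeout
        self.last_seen = {}  # node name -> timestamp of last heartbeat

    def heartbeat(self, node, now=None):
        # Record a heartbeat; use the monotonic clock so wall-clock
        # adjustments (NTP syncs) can't make a node look dead or alive.
        self.last_seen[node] = time.monotonic() if now is None else now

    def is_alive(self, node, now=None):
        now = time.monotonic() if now is None else now
        return node in self.last_seen and now - self.last_seen[node] <= self.timeout

monitor = HeartbeatMonitor(timeout=5.0)
monitor.heartbeat("web-1", now=100.0)
assert monitor.is_alive("web-1", now=103.0)      # within the window
assert not monitor.is_alive("web-1", now=106.0)  # missed heartbeats: suspect down
```

The `now=` parameter exists only to make the example deterministic; a real monitor would let the clock default to `time.monotonic()`.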
**Timeouts are the gate to try all of the above strategies.** For customers, the only way to reliably detect faults is to inform them that too much time has elapsed to process their request. In that time, you will have tried the things above. Once the timeout has been exceeded, it is safe to assume that a fault has occurred. How do you choose a sensible timeout? **Measure your response times and variability.** Then, choose a number. Unfortunately, **there is no standard timeout interval that will work for everyone.** Apache Cassandra uses the [Phi failure-accrual algorithm](https://www.slideshare.net/srisatish/cache-concurrency-cassandrajvmapachecon2010) to adjust gossip protocols when they check with nodes to determine if they are down. TCP does its own [retransmission](http://blough.ece.gatech.edu/4110/TCPTimers.pdf), but there is not much here for you to take away for your own applications. ### Timing [Timezones are hard](https://www.freecodecamp.org/news/synchronize-your-software-with-international-customers/). Dealing with time, in general, is difficult. All computers have clocks, but they are rarely in sync with each other. How do you ensure that the time or duration of an activity is accurate? **Sync them.** Things called NTP servers help ensure that the clock you see on your computer is actually correct. If it isn't, the NTP server will send back the correct time and update your clock. The problem is, your clock could hop backward. If you sent a request at `10:00:01` and it was returned at `10:00:00`, you have a problem. That is why these kinds of clocks, called _time-of-day clocks_, are not reliable for measuring time duration. **Count monotonically.** To safeguard against a standard clock, a _monotonic clock_ is also available on a computer. It is basically a giant timestamp counter that always counts upward, regardless of time syncing issues. You can still use an NTP server to adjust the monotonic clock, but it does not count backward. 
If there is a syncing issue, you simply _delay when the next value is incremented_. This is generally the preferred method for distributed systems.

Even with both mitigation strategies in place, **no system is perfect**. There is a network delay from the NTP server to your local clock, so they will never fully be in sync. Firewalls, leap seconds, virtual machines, and physical hardware clock malfunctions all contribute to perpetual inaccuracies between the actual time and the time posted on your machine. As long as you accept this premise, you can achieve reasonably high accuracy for the times you use.

### Strategies around timing issues

With all of these issues around keeping track of time, there are really only three solutions that are worth implementing:

* **The last write wins.** Cassandra and Riak use this strategy to determine the most recent write to a database. But this suffers from the same basic problem: _what is recent_? If it is all relative anyway, then you could still have two nodes that wrote to the DB with the same timestamp, but one node's clock is off.
* **Provide times with a delta.** Google Spanner uses this with its [TrueTime API](https://cloud.google.com/spanner/docs/true-time-external-consistency), but good luck using it outside of Google.
* **Limit certain functions that would cause time delays and pauses.** Garbage collection is a notably slow process that can hang a system. Ensuring there is enough memory to handle other functions when garbage collection grows unwieldy ensures some fault tolerance.

The message you must remember is that **time is relative, and you cannot achieve _real-time_**. Real-time, in the truest sense of the word, is reserved for embedded systems so things like your airbag can deploy when it needs to. Real-time, as it is used for web-scale, is a relative term for feeling instant even if it isn't actually an instant operation.
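The last-write-wins hazard described above can be shown in a few lines. A sketch of my own, with made-up values and timestamps:

```python
def last_write_wins(versions):
    """Keep the value with the highest timestamp (the LWW strategy)."""
    return max(versions, key=lambda v: v["ts"])["value"]

# Node B really wrote *after* node A, but B's clock runs a few
# seconds slow, so A's timestamp is higher and the newer write
# from B is silently discarded -- the classic LWW failure mode.
versions = [
    {"value": "from-A", "ts": 1000.0},  # wall clock roughly correct
    {"value": "from-B", "ts": 995.0},   # actually newer; clock is slow
]
assert last_write_wins(versions) == "from-A"
```

This is exactly why "what is recent" is the hard question: the resolution rule is trivial, but it is only as trustworthy as the clocks that produced the timestamps.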
## Trusting a faulty system

Now that we know systems are unreliable, we need to know that the actors can be trusted. The book assumes they can be, and that each player in the distributed system acts with honest intention. This is where the old security saying of **trust but verify** comes into play.

_Quorums_, as we discussed in the [data replication chapter](/blog/replication/), are a way of obtaining votes from all nodes to figure out what the majority believes about the system during a discrepancy. If you have 3 children and ask, “who stole the cookies from the cookie jar,” you're likely to believe it is child 2 if both 1 and 3 point to 2.

What if one of them is lying? This is the basis for the [Byzantine Generals' problem](https://www.microsoft.com/en-us/research/uploads/prod/2016/12/The-Byzantine-Generals-Problem.pdf). Systems that safeguard against lying actors are _Byzantine fault-tolerant_, which usually requires a two-thirds supermajority vote rather than a simple majority vote. Byzantine fault tolerance is only needed for mission-critical systems like aerospace or the blockchain. Internet web-scale systems do not require it, and we can improve fault tolerance with a few simple tricks:

1. **Require checksums.** TCP and UDP offer these and ensure that corrupted packets must be retried so data is transferred correctly.
2. **Permission, sanitize, validate, error-handle, and output-escape your data.** This [blog post](https://alexkrupp.typepad.com/sensemaking/2021/06/django-for-startup-founders-a-better-software-architecture-for-saas-startups-and-consumer-apps.html) beautifully summarizes a recipe to eliminate bugs from your code when ingesting data. Bugs are a source of lying, since the system behaves unexpectedly through faulty code.
3. **Add redundancy to your NTP servers.** Replication of time synchronization server checks ensures that the majority timestamp is the accurate and correct one.
4.
**Build systems around crash recovery and partially synchronous system models.** System models take many forms and shapes. You will not see the strictness of a completely synchronous, Byzantine fault-tolerant system in the real world. But you also don't want a fully asynchronous model where even the slightest perturbation shuts down the system. Striking a middle ground with your system model is a sensible and useful real-world approach.

The next chapter will investigate algorithmic approaches to handling real-world fault tolerance with system models that handle crash recovery and strive for partially synchronous updates.

This quote at the end of the chapter summarizes how I would frame the litany of strategies in this book for a more junior engineer:

> If you can avoid opening Pandora’s box and simply keep things on a single machine, it is generally worth doing so.

It is easy to look at a book like this and think of all the neat technologies you could use on your current project. Resisting the urge to do so and finding ways to say no is arguably a better approach. Most companies and most systems will never require scale beyond a simple Postgres server. Barring safeguards around data replication, a simple RDBMS can scale to millions and millions of records without much else to power it.

Until we meet again in the next chapter, here is some additional reading material I enjoyed with this week's chapter:

* [A system design question around the "top-k" heavy hitters](https://www.youtube.com/watch?v=iJLL-KPqBpM).
* If you didn't watch any of the previous episodes from this channel, you'll also want to study their implementation of a [distributed messaging queue](https://www.youtube.com/watch?v=iJLL-KPqBpM).
* [How to beat the CAP theorem](http://nathanmarz.com/blog/how-to-beat-the-cap-theorem.html) is the inspiration for the system design in the first link.
* [This article on the Kappa architecture](https://www.oreilly.com/radar/questioning-the-lambda-architecture/) challenges the merits of the Lambda architecture with a simpler model that uses fewer frameworks and drops batch processing as a requirement.
* [Comparing modern stream processing frameworks](https://www.youtube.com/watch?v=ZWez6hOpirY) is a great video that helped me make sense of way too many Apache streaming systems and why they all seemed redundant at first glance.

---

# Partitioning
_Wed, 28 Jul 2021 — https://www.adamconrad.dev/blog/partitioning/_

This article is part of the [Systems Design Series](/tag/systems-design/) following the book [_Designing Data-Intensive Applications_](https://dataintensive.net/) by Martin Kleppmann. If you missed the [previous article](/blog/replication/), check that out first.

This chapter covers another central tenet of distributed systems: partitioning. Also known as sharding, partitioning is how we split up large data sets into logical chunks.

## Strategies for partitioning

If _replication_ is about copying all data to different databases and datacenters, then _partitioning_ is about slicing up that data across those databases and datacenters. There are a few ways you can split up that data:

* **By key.** The key can split the rows of data. Imagine 26,000 names evenly split alphabetically. You could put 1,000 entries in a database for the _A_ user names, 1,000 more for the _B_ names, and so on.
* **By hashed key.** You could hash the same key and then split up the data that way as well. The easiest way to ensure information is evenly split is to use a modulo (`%`) operator (hint: just because it is easy doesn't mean it is helpful). If you run the modulo over the number of datacenters, you will always have a key mapped to a datacenter ID.
* **Add randomness for active keys.** The above assumes that all keys are read and written in an even distribution. For the ones that are not, you can break up keys further with prefix or suffix IDs. Now the same key can be segmented across shards.
* **By a secondary key.** Like the prefix/suffix keys, you can use an additional key with a secondary index. This will make searching faster for the data that is partitioned.
* **For local document search.** You could make this secondary index the key, with the value being the original primary key that you added into the shard in the first place.
* **For global document search.** Also called a _term index_, you could segment the secondary keys and grab all of the keys across all shards (and shard the term). This is how full-text search databases work.

## Strategies for rebalancing

Over time, these partitioning strategies will skew unevenly. The easiest way to visualize this is to imagine the old hardcover encyclopedias. They are not 26 volumes split by letter. Instead, letters like `Q` and `X` are combined while `A` and `S` get their own book. While you may have an even balance of data to start, it will not necessarily evolve that way. You will need methods to rebalance your data from time to time.

Remember how I suggested that we use the modulo operator to segment our keys against the datacenters? This works until you add more datacenters. Then your modulo value needs to change, and you need to rebalance _all_ of your data, which creates a lot of rewriting as your data scales. This is not ideal. Here are a few better ways:

* **With a fixed number of partitions.** Rather than use the physical shards as segments, create arbitrary partitions within each shard node. That way, as your shards grow or shrink, the number of partitions stays the same. Then, you simply adjust how many partitions you allocate to each node.
_Riak and ElasticSearch use this method._

* **With a dynamic number of partitions.** The previous example is sensible but suffers from the same problem as earlier: your data can evolve in very skewed portions. If your partitioning segments are chosen poorly, your data will be lopsided. Dynamic rebalancing uses something like a B-tree to change the structure as partitions are added and removed. The data evolves, and so too do the partitions. _HBase and MongoDB use this method._
* **With a proportional number of partitions to nodes.** This is somewhat of a blended approach. The number of partitions is not strictly fixed; it changes as the number of nodes grows and shrinks. This is not the same as dynamic rebalancing, because the dynamism depends solely on the number of nodes and not on arbitrary boundaries that you define. _Cassandra uses this method._

These strategies all depend on the configurations you make as a developer. You can choose how you want to deploy this rebalancing: _manually_ or _automatically_. With manual rebalancing, a developer assigns partitions to nodes. This requires more effort and time, but it allows you to respond to the needs of your data. Conversely, automatic rebalancing enables the software to do this work for you. Tools like Couchbase and Riak suggest this approach (with approval), but the downside is the unpredictable and possibly suboptimal choices for segmenting and rebalancing your data. A wrong move by the software could be slow and costly.

## How to find data

With all of this data split up amongst so many physical computers, how do you go about finding it?

* **You can keep track of all of them on the client.** The obvious solution is to keep track of each shard in the same way that you have your encyclopedias face out on your bookshelf. You have to see which letter you want to access when you approach your bookshelf. The downside is that this is tedious and must be reconfigured every time you add or remove nodes.
* **You can have your request routed for you.** A middle layer can interpret your query or mutation and figure out where your data is for you. This is like in the old days of telephones when operators connected you to the person on the other end. You did not need to know their phone number. You just needed to remember the number of the operator so they could route the call for you. * **You can throw a request to any node, and that node will route it.** This is like calling up a corporation and going through their touchtone service. You know you want to call up Dell customer support, so all calls get routed to a 1-800 number, but it is up to the service to find the exact extension for you. The default number _might_ get you who you are looking for if you know your party's extension number, but you may have to navigate the tree of options. Tools like [Apache ZooKeeper](https://zookeeper.apache.org/) handle this configuration management in many popular databases like HBase and Kafka. Engines like Riak and Cassandra use a [gossip protocol](https://en.wikipedia.org/wiki/Gossip_protocol) to chain requests from node to node until the data is found. --- Partitioning is straightforward and not nearly as dense as [replication](/blog/replication/). There are only so many ways you can split up data. In the [first post of this series](/blog/scalability-reliability-maintainability/), I noted I would be skipping the next chapter on transactions. The last sentence of the chapter explains why: > In this chapter, we explored ideas and algorithms mostly in the context of a database running on a single machine. Transactions in distributed databases open a new set of difficult challenges, which we’ll discuss in the next two chapters. Since I care about educating this audience on distributed systems and systems design, it seems only fair to focus on chapters that tackle transactions in a distributed nature. 
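To close with something concrete, the modulo pitfall from the rebalancing section is easy to demonstrate. A sketch of my own — the key names and node counts are arbitrary:

```python
import hashlib

def node_for(key, num_nodes):
    """Hash the key, then take modulo over the node count -- the 'easy' scheme."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % num_nodes

keys = [f"user-{i}" for i in range(10_000)]

before = {k: node_for(k, 4) for k in keys}
after = {k: node_for(k, 5) for k in keys}  # one datacenter added

moved = sum(1 for k in keys if before[k] != after[k])
# With hash-mod-N, going from 4 to 5 nodes remaps roughly 4 out of
# every 5 keys -- the mass rewrite the chapter warns about. A fixed
# pool of partitions (or consistent hashing) would move only ~1/5.
assert moved > len(keys) // 2
```

Running this, `moved` comes out around 8,000 of the 10,000 keys, which is why hash-mod-N is easy but not helpful once your cluster size changes.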
---

# Replication
_Tue, 20 Jul 2021 — https://www.adamconrad.dev/blog/replication/_

This article is part of the [Systems Design Series](/tag/systems-design/) following the book [_Designing Data-Intensive Applications_](https://dataintensive.net/) by Martin Kleppmann. If you missed the [previous article](/blog/storage-and-retrieval/), check that out first.

This chapter is the real introduction to distributed systems. We are shifting into the second part of the book here. You will notice that I intentionally skipped chapter 4 based on a suggested curriculum I outlined in the [first post of this series](/blog/scalability-reliability-maintainability/).

## Horizontal versus vertical scaling

Before the chapter gets started, Kleppmann briefly discusses how we've collectively arrived at a _shared-nothing architecture_ of horizontal scaling. Distributed systems design focuses on scalability, so it is important to know the main types of scaling you'll see in real-world systems:

1. **Vertical scaling.** Also known as _shared-memory architectures_ or _scaling up_, this involves adding more RAM, CPU, and resources to a single machine. The key observation is that **doubling resources will likely more than double your costs**. As such, vertical scaling is recommended when heavy computation is required at the expense of fault tolerance and reliability. There is a variant of this called _shared-disk architecture_, which pertains mostly to data warehouse applications that provide additional fault tolerance with multiple machines accessing the same disks on a centralized network. This still suffers from the same issues when you colocate all data in one data center, but has limited use.
2. **Horizontal scaling.** Also known as _shared-nothing architectures_ or _scaling out_, this refers to adding more machines with commodity resources.
This is the primary strategy used when companies scale out their services and hence is the primary focus of this section of the book.

## Replication 101

Why do we replicate?

1. **Reduce latency.** Keeping data close to your users reduces the time it takes to physically travel over the wire to your customers.
2. **Increase availability.** The server goes down? You've got a backup on a different machine and in a separate data center.
3. **Increase throughput.** Drive-thru windows at your bank are convenient, but if the line is long, you know there are lots of bank tellers available inside to help you out and keep the line moving.

Those are the reasons _why_ we replicate. The rest of the chapter offers strategies for the central problem: **data changes all of the time, and we have to deal with it**. If data never changed, replication would be easy. Since data grows and evolves over time, we employ various strategies based on our system's specific needs.

### Leader replication

The first two methods involve **leader replication**. Each replica, or copy, has a role: `leader` or `follower`. Some schemes invoke a single leader while others leverage multiple leaders. The responsibilities of each are straightforward:

1. **Leaders ingest write data.** Only the leader can modify the database.
2. **Followers copy changes from the leader.** Followers must wait for a log file from the leader with changes they will make in the same order the leader made them.
3. **Anyone can offer read data.** All replicas can return data to a customer that asks for it. This is called _read-scaling architecture_.

That last step is the interesting one. What if you ask for data from a replica that has not copied all of the latest changes yet? This is where replication strategies offer tradeoffs in their approach. From here, you can focus on two ways of getting changes to replicas:

1. **Synchronous replication.** As the name implies, you send changes to followers immediately.
You then only inform customers when all changes are completed successfully. Practically, this doesn't work well because if a follower replica hangs, the write cannot complete, which hurts availability. To be fair, **synchronous replication is the best strategy to eliminate the risk of data loss**. In reality, when systems use synchronous replication, they will follow a hybrid _semi-synchronous replication_ strategy where one follower is synchronous while the others are asynchronous. Read-scaling architectures don't work with this scheme because followers have to wait for leaders to synchronize data, which keeps them too busy to accept additional reads. 2. **Asynchronous replication.** In this scheme, changes are sent to followers immediately. The difference is you can inform customers when _some_ changes complete successfully. You are not beholden to every follower sending a successful response. This is the most common scheme in production, high-scale applications. It trades durability guarantees for speed, leaving you with eventual consistency. **What if you want to add a new follower to the gang?** 1. **Periodically take snapshots of the data.** This is good anyway for backups. Send the latest backup to your new follower. 2. **Poll the leader again for changes since the snapshot.** Now you can assume the new follower is asking for changes as if this were an asynchronous replication strategy. Each database engine has its own scheme for issuing catch-ups such as log sequence numbers or binlog coordinates. **What if things fail?** * **If the follower fails** then you execute the same catch-up strategy as for a new follower. Just read the last changes since the failure from the logs and then play catch-up. * **If the leader fails** then you have to elect a new leader. You can either do this _manually_ or _automatically_. Manual promotion benefits from careful failover to a verified follower. 
The downside is that you have to invest developer time into failovers, which can become cumbersome if they occur with any frequency. Automatic promotion relies on selecting a new leader, usually the follower with the closest representation of the leader's data. Reaching consensus with the automatic strategy is fraught with problems: * **There might be a gap in the data, even with the best follower chosen.** In this case, you violate strict durability and you incur data loss, which is never good. * **Multiple followers might believe they are the leader.** This can also lead to data loss if both start accepting writes to propagate to the other followers. * **The heartbeat interval may be too long and miss the latest changes.** Even if you have one follower with a perfect copy of the leader's data, the leader may have been down long enough to miss some new writes. If the heartbeat that checks its health runs at a long enough interval, new writes could have been accepted while the leader was down, and those will be lost for good. * **The old leader may not relinquish its throne**. Assuming everything else was correct, you still might see a variant on the second issue where the old leader starts back up and still thinks it is the leader. Now you are in a similar scenario where two leaders are consuming writes and you incur data loss. Logs help with restoring the data in addition to helping new followers join in on the fun. Each database engine utilizes one of four replication log schemes: 1. **Statement-based.** These copy raw SQL commands. They are simple but require deterministic commands (nondeterministic functions like `NOW()` or `RAND()` would produce different values on each replica), which can be too strict for most databases. 2. **Write-ahead.** These are logs written before changes to the main disk are stored. This is a common append-only log that is used with things like [SSTables and B-trees](/blog/storage-and-retrieval/) but requires all engines to follow the same format. If the format changes with breaking changes to an engine, then it will require downtime to upgrade. 
This is what plagues things like major version upgrades in tools like Postgres. 3. **Row-based.** This focuses on the values of rows instead of the commands like the previous two. This decouples the data from the storage engine used, which solves the problems from the previous strategy. New versions of Postgres use this, also known as _logical log replication_. The book doesn't seem to see a downside to this approach, but [this Stack Overflow thread](https://stackoverflow.com/questions/33621906/difference-between-stream-replication-and-logical-replication) goes into this strategy in greater detail, which I appreciated. In short, things take more time because one database is updated at a time, methodically. 4. **Trigger-based.** This strategy leverages SQL triggers and stored procedures. While this benefits from functionality native to the language instead of the database engine, these triggers can be complex. Further, this log strategy does not work for NoSQL solutions or any database that doesn't leverage SQL, such as graph databases. #### Eventual consistency As mentioned earlier, a read-scaling architecture can provide additional performance for applications that are heavy on reads. The downside of this strategy is that you may read outdated information from a follower node. When a leader node sends data asynchronously to a follower node, it does not wait to find out whether the follower replication succeeded. This is what is known as _eventual consistency_. At some point, _eventually_, the data will be synced across all replicas. There is no guarantee when that eventual time will be. But it is meant to be a reasonably short amount of time so that there is no real compromise in reliability. Nonetheless, there are occasional complications with eventual consistency: * **Reading right after writing.** Say you update your email. If you make those changes and refresh the page then you'll want to see your new email address saved. 
_Read-after-write consistency_ is a strategy for this but is complicated in distributed systems. * **Reading into the past.** Say you deposit a few checks into your bank account. Your bank account starts at $5, then you add two checks of $5 each. In theory, if the replication lag is long enough between replicas, the leader could show $15, one replica could show $10, and yet another replica could show the original $5. _Monotonic reads_ ensure you only read from one replica so this problem doesn't occur. Of course, if the replica fails, you're back to square one. * **Answers to questions you haven't asked.** In the previous scenario, you could see a later state before you see a previous state. Not only are you going into the past but you are messing up the causality of the data: you wouldn't expect an addition to read like a subtraction from the balance. _Consistent prefix reads_ ensure that writes are read in the order they happened. If you have a sharded database, you can't guarantee this fix unless all of the related data is written to the same shard. This becomes more complicated but does solve the issue. All of these problems are difficult to solve with a single-leader replication strategy. They become less difficult when you invoke a multi-leader replication strategy. #### Multi-leader replication As the name implies, multi-leader replication allows you to assign multiple leaders at once. This solves nearly all of the problems you see above: * **No need for manual promotion.** With multiple leaders, you don't have to worry about being without a leader at all. Restarting leaders assumes there is at least one leader already able to receive writes. * **No gap in writes.** Unless you get incredibly unlucky, some leader will always be available to accept writes. Write downtime can happen with a single leader; with multiple leaders, you guarantee there is always a place to accept writes. 
* **No leader conflict resolution.** If you know who the leaders are, you can always create a backup scheme so you either fall back to an alternative leader _or_ you have a simple election to keep things smooth. * **Heartbeat intervals no longer matter (to a point).** Waiting a long time to see if the leader is still online is not a problem because you have multiple leaders to fall back on. As long as their heartbeats aren't synced and are checked in a staggered fashion, you can ensure that you're constantly aware of at least one online leader. * **Multiple leaders don't necessarily require follower promotion at all.** One problem with a single-leader scheme is the leader won't give up its role as leader when it comes back online. If you have other leaders available, there is no need to promote them. You always have leaders available to do all the work you need. If a particular leader goes offline, you just wait for it to return and push the additional load to your remaining leaders. In addition to solving problems from single-leader replication, there are a few use cases where multi-leader replication really shines: * **Datacenter replication.** Have multiple data centers? Assign a leader for each data center to ensure all locations have write coverage. The challenge is that you are back to single-leader replication per data center if you don't elect multiple leaders per site. Further, writes become far more complicated since you have multiple data centers that have to resolve data changes. That requires costly RPC calls to copy logs from one data center to the other. * **Offline applications.** What if your mobile phone were a data center? When you're offline, you store changes to your local database. When you get back online, you sync your changes with a leader at a data center for persistence and backups. This is multi-leader replication taken to the extreme where every device acts like a leader. 
* **Collaborative applications.** Tools like Google Docs allow you to edit concurrently with other users. Think of your browser as a leader where you publish changes locally and see them instantly in the browser. In the background, you take those changes and push them to a remote leader replica. This replica syncs your changes with the changes of other collaborators and ensures they all match nicely. Everything sounds great - so what's the catch? It should be obvious at this point, but the big issue with multi-leader replication is **write conflicts**. With multiple leaders come multiple nodes where you can write changes to your database. Who has the correct answer? Is there a leader of leaders? The easiest way to mitigate write conflicts is to simply **avoid having them in the first place**. This may seem like a cop-out answer, but you can design your systems toward conflict avoidance. Having one leader per data center and routing users to specific data centers is a way to avoid conflicts. Specific updates in a particular range of data go to dedicated leaders. Of course, this reverts to being a microcosm of single-leader replication, so if a data center goes down, you are forced to reconcile concurrent writes on different leaders. If you can't avoid conflicts, you can use some _tie-breaker heuristic_ such as assigning UUIDs to data centers or records so that the highest numbered ID wins. Simple, but prone to data loss. In theory, this also allows you to merge conflicting values, but then you'll need some sort of user intervention to reconcile the concatenated data. Of course, you could implement your own custom conflict resolution scheme. Amazon has done this, with some surprising effects, such as having items reappear in users' shopping carts after they have purchased items. While a suboptimal user experience, it is still likely better than losing data. How do leaders communicate with each other? 
The book identifies three specific _topologies_ for communication, though I see them as really only two: * **Every leader talks to every other leader.** This ensures everyone gets the message from everyone else with a direct line of communication between every leader. The problem here is there is no causality to the writes because there is no order with which leaders are updated. As opposed to... * **Leaders update each other in a chain.** An example of this would be a circular or star-shaped topology. In these schemes, there is an ordering to how leaders communicate with each other, which solves the causality problem. The challenge is that if a replica fails during the update chain, it breaks the chain, and all downstream updates will stall out. If a leader is early enough in the chain, most other leaders could suffer from data loss and stale information. As mentioned before, the best course of action is generally to avoid conflicts in the first place. Many database systems support some kind of conflict resolution scheme, but none are perfect. ### Leaderless replication The last major strategy with replication is leaderless replication, where **every node can accept reads and writes.** Amazon Dynamo is the most famous example of leaderless replication and has ushered in a resurgence of this strategy. Cassandra, Riak, and Voldemort have also adopted leaderless replication schemes. This scheme is ideal for applications that require **high availability and low latency** at the expense of a very loose definition of **eventual consistency and stale reads**. When every database replica can write data, you open yourself up to lots of staleness. There are two primary strategies for dealing with this stale data: 1. **Use subsequent reads to detect stale data.** The next time a user reads data, read it from multiple replicas. If any of those are stale you can issue follow-up writes to ensure the stale nodes are brought back up to the latest versions of the data. 
This is commonly called _read repair_. 2. **Run a heartbeat to detect stale data.** Every update comes with a version number. Run a background process that pings every node to see what version of the data it has. If any of the nodes are less than the latest version number, update the databases. Continue to ping all databases in regular intervals to ensure staleness is mitigated promptly (the book calls this background process an _anti-entropy process_). Leaderless replication schemes often implement _quorums_ on reads and writes to create consensus when there is a discrepancy. This is a tuneable threshold that allows developers to configure at exactly which point a vote passes to reconcile when replicas argue over which version of the data is the latest. **Quorums really only work if you have sufficient replica nodes to break a tie.** If a write requires w acknowledgments and a read requires r responses, those must together outnumber the n replicas holding the value (w + r > n). As long as you have that overlap between reads and writes, you can proceed with quorum voting. Further, **monitoring is difficult with leaderless replication**. This is because there is no universal order to how data is consumed. If a node goes down, or worse, several nodes go down, you risk destroying the quorum. A concept known as **sloppy quorums** was designed to mitigate against this. It states it is better to **write the data even if a quorum is not achieved** when the required nodes for a quorum vote are down or offline. They still ask for the designated number of votes, but they may not come from the approved pool of original nodes that belong to the quorum. Without the approved set of nodes, you have fewer guarantees about the latest state of the data since these nodes are not as closely monitored. This method _increases write availability_ but requires further mitigation like _hinted handoff_ to restore the quorum node team. It also requires backup plans like read repair to ensure that restored nodes eventually receive the latest updates. 
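The quorum arithmetic above reduces to a one-line check. Here is a minimal sketch of the w + r > n rule (a toy illustration with function and parameter names of my own, not any particular database's API):

```python
def quorum_overlaps(n: int, w: int, r: int) -> bool:
    """True when every read is guaranteed to overlap at least one
    replica that acknowledged the latest successful write."""
    return w + r > n

# A typical configuration: 5 replicas, 3 write acks, 3 read responses.
assert quorum_overlaps(n=5, w=3, r=3)

# With only 2 acks and 2 read responses out of 5, a read can land
# entirely on replicas that never saw the latest write.
assert not quorum_overlaps(n=5, w=2, r=2)
```

The same check also shows why sloppy quorums weaken the guarantee: the w and r votes still arrive, but they may come from nodes outside the original n, so the overlap argument no longer holds.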
Another issue you can run into with leaderless replication is **concurrent writes**. A single change is propagated across multiple replicas. Each replica is subject to its own network latency and IO. There is no guarantee that any write will arrive at all replicas at the exact same time. How do you reconcile this case? **Last write wins** is a simple strategy that dictates that the latest write is the one that will persist, and all other writes for the same data are considered stale duplicates that can be dropped. This sounds great in theory, but in practice, this is _bad for durability_. Say your bank account reads $1. If you somehow issue two concurrent updates, one adding $2 and the other adding $4, you may run into trouble. Both updates will say they were completed successfully. But if the $2 change arrives after the $4 change, your bank account will update to $3 even though $7 is ultimately the correct answer. Cassandra mitigates this with a UUID attached to every write update to ensure seemingly concurrent operations are indeed differentiated. There are numerous algorithms discussed in the book to determine concurrency. **Concurrency is not concerned with things happening at the same _time_ but happening _without knowing about each other_**. It's extremely difficult to say with certainty that two events occurred at the _exact_ same time. But they could occur in a close enough interval to not know about each other based on the state of the application. Riak knows how to merge concurrent values with its _sibling merge_ functionality. Riak also uses _version vectors_ to issue read repair, so that clients can retrieve them on reads and return them back to the databases on writes. My takeaway here is if you are looking for **robust leaderless replication, look no further than Riak**. --- That wraps up the chapter on replicating distributed data. Another way to distribute data is through partitioning. This is the topic of the next chapter so stay tuned! 
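As a coda to the concurrent-writes discussion above, the last-write-wins bank example fits in a short sketch. The `LWWRegister` class is hypothetical (not a real client API), using a timestamp plus a unique tiebreaker the way Cassandra attaches one to every write:

```python
import uuid

class LWWRegister:
    """Hypothetical last-write-wins register: the highest
    (timestamp, tiebreaker) pair wins; every other write is dropped."""

    def __init__(self, value):
        self.value = value
        self.stamp = (0, "")

    def write(self, value, timestamp):
        stamp = (timestamp, str(uuid.uuid4()))
        if stamp > self.stamp:
            # The losing concurrent write is silently discarded.
            self.value = value
            self.stamp = stamp

account = LWWRegister(1)            # balance starts at $1
account.write(1 + 4, timestamp=10)  # concurrent deposit of $4 read $1 first
account.write(1 + 2, timestamp=11)  # concurrent deposit of $2 also read $1
assert account.value == 3           # $3 persists; the $4 deposit is lost
```

Neither write knew about the other, so both "succeeded" from the client's point of view, yet the register ends at $3 rather than the $7 the two deposits imply.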
]]> 2021-07-20T23:30:00+00:00 Storage and retrieval https://www.adamconrad.dev/blog/storage-and-retrieval/ Mon, 12 Jul 2021 22:33:00 +0000 https://www.adamconrad.dev/blog/storage-and-retrieval/ This article is part of the [Systems Design Series](/tag/systems-design/) following the book [_Designing Data-Intensive Applications_](https://dataintensive.net/) by Martin Kleppmann. If you missed the [previous article](/blog/data-models-and-query-languages/), check that out first. This chapter is all about getting and sending data. There are two main kinds of data stores: 1. OLTP (online transaction processing). This includes your traditional relational and NoSQL databases. They comfortably store terabytes of data, are highly available, and are designed for the end-users of your applications. 2. OLAP (online analytical processing). This includes data warehouses from names like Oracle and IBM Db2. They store petabytes of data, are primarily read-only copies of several OLTP databases, and are designed for business analysts. One thing to note about OLAP data stores is the simplicity of their data models. While the entire [last chapter](/blog/data-models-and-query-languages/) was dedicated to the variety of OLTP data models, there are really only two models in use for OLAP systems: 1. _Star schemas_, named after the star-like shape of a central fact table that points out to supporting dimension tables 2. _Snowflake schemas_, which add sub-dimensional tables to the above star schema in a more complex, snowflake-like pattern The seemingly obvious comparison is that snowflake schemas are more complex because of their additional normalization with sub-dimensional tables. Conversely, star schemas are simpler to conceptualize with less branching. Either way, both support lots (100+) of columns. ## Column-store databases for big data Wide-column fact tables benefit from column-store databases. As the name suggests, you're inverting the way data is stored and retrieved. 
Rather than store data in rows, where each entry corresponds to a single object and various properties of that object in columns, you store all of the values for a single property together. Since a massively wide fact table likely only utilizes several columns in a given query, data organized by column lends itself well to these kinds of analytics queries. You can partition and shard data by column, which makes searching against particular columns even more efficient. Since all data in a given column adheres to the same type, you can think of a column as one giant array. This makes column-store databases effective at compressing data with techniques such as bitmap encoding. With more data able to fit in memory, column stores can make effective use of CPU caches as they iterate through data. You can further improve efficiency when grouping or filtering by sorting your columnar data. Finally, column stores can leverage highly specialized materialized views called data cubes. Data cubes make it easy to reference a lot of aggregated data. This is a way to precompute data and display it for fast retrieval. All of these make reads incredibly efficient for even the largest data sets. The clear downside is the cost to write data. A table with 500 columns might now be spread out across 500 files, one for each column. That means adding a new entry means writing 500 properties to 500 files. If that entry happens to max out the file size on disk, you'll have to perform even more writes to partition and segment out your data. In other words, writing is expensive. As we'll see later on in this post, there are tools like LSM-Trees that can effectively speed up data storage. ## Data retrieval techniques Storing data is straightforward. Write data to an append-only log and save the file to disk. Retrieving data, on the other hand - that's where things get interesting. The book presents a progressively complex way of retrieving data efficiently. 
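The append-only baseline described above fits in a few lines. A minimal sketch (the file name and function names are my own, echoing the simple key-value store the book opens the chapter with):

```python
LOG_FILE = "kvlog.txt"  # arbitrary file name for this sketch

def db_set(key: str, value: str) -> None:
    # Writes are cheap: append one line to the end of the log.
    with open(LOG_FILE, "a") as f:
        f.write(f"{key},{value}\n")

def db_get(key: str):
    # Reads are O(n): scan the whole log and keep the last match,
    # since the most recent write for a key is the current value.
    result = None
    with open(LOG_FILE) as f:
        for line in f:
            k, _, v = line.rstrip("\n").partition(",")
            if k == key:
                result = v
    return result

db_set("user:1", "alice")
db_set("user:1", "alicia")  # supersedes the earlier entry
assert db_get("user:1") == "alicia"
```

Every technique that follows exists to fix that O(n) read without giving up the cheap append-only write.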
### Hash index _Indexing_ is the general strategy for retrieving data quickly. By writing only the data you search against into memory, you trade space for speed. The downside, besides the additional space requirements, is the cost to write the data. Data stored in multiple places requires multiple writes to disk and takes more time. One way to index is with a hash code, just as hash maps/tables relate keys to values. An example would be Bitcask, the storage engine powering Riak. As we saw from the [YouTube](https://www.youtube.com/watch?v=bUHFg8CZFws) example, we need to count video views. The video offers Cassandra as the data storage mechanism of choice. The book might suggest that Riak with Bitcask is the ideal choice. Bitcask uses an in-memory hash table kept separate from what is stored on disk, mapping each key to a location in the data file. This, in turn, keeps reads fast even with frequent writes. Everything about this sounds great, right? We have a method that is: * _Scalable_: we can shard the data on disk while keeping the indexes in memory * _Fault-tolerant_: the database can go down in a separate data center that stores the index * _Eventually consistent_: indexes can now be updated independently of the data stored on disk * _High throughput_: the right hashing strategy limits collisions, which allows for a lot of conflict-free storage There are two limitations with hash indexes: 1. _Slow range queries on indexes._ Since the index key is a code and the codes are unique, you don't gain any speed searching a range of values. 2. _Size limitations._ If you run out of memory the whole thing falls apart. Extreme-scale applications that fill a hash table index completely are the one exception to scalability. ### SSTable The next evolution of hash indexes is Sorted String Tables, also called SSTables. SSTables provide some additional benefits: * They only store keys once, saving on storage space. * They sort the keys to further speed lookups. 
* They use a variant of merge sort to merge and compact segments. * The organization of that data can be implemented as a red-black or AVL tree, allowing you to write in any order and read in sorted order. This makes it easier to store in memory. I didn't touch on segments earlier. A _segment_ is just **a means of sectioning off a data log**. To prevent running out of space you need a means of separating your indexes across files. To reduce the space requirements we _compact_ the segments by removing duplicate keys and only keeping the most recent write to that key. This implementation of an in-memory balanced tree for keeping a cascading tier of SSTables is aptly called a **Log-Structured Merge-Tree** or LSM-Tree for short. Lucene, the [Apache search library](https://lucene.apache.org/) that powers ElasticSearch and Solr, uses something like an LSM-Tree to store its dictionary of terms and SSTable-like files to map those terms to postings. More specifically, if you're dealing with a dictionary of words you'd want to use a structure that tailors itself to strings instead of numbers. We've already evolved past a hash table so that is out. A self-balancing binary tree is out, too. Suffix/prefix trees like [tries](https://en.wikipedia.org/wiki/Trie) are specific to strings but suffer from the same problem of taking up lots of space and eating away at RAM. So what's left? That's where the mighty [Bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) comes to the rescue. A Bloom filter is a probabilistic data structure that is sort of an inversion of other data structures. Rather than using the information to find if data is contained within a structure, a Bloom filter tries to determine if something is definitely not in the set or if it might be in a set. The "might" part here is key because it's not guaranteeing validity. There is a chance for a false positive. 
With a sufficient number of hashes, you can achieve a low false-positive rate and extremely accurate results with a very small amount of indexing space. This solves our biggest problem with indexes when the data is really large and you're worried you will run out of in-memory storage for your index. A Bloom filter can handle extreme scale and still perform well, with the tradeoff that your results aren't always 100% accurate. ### B-tree We've talked about efficiently retrieving data in memory. What if you want to retrieve data on disk? The [B-tree](https://en.wikipedia.org/wiki/B-tree) is the most widely-used and heavily studied database index lookup tree. I had to build one of these for my relational database class in college. It's another self-balancing tree structure used in all of your major RDBMSs like Postgres and MySQL. While both LSM-Trees and B-trees contain key-value pairs sorted by key, the strategy for separating data is completely different. Log-structured indexes can be of variable size and can be quite large, spanning multiple megabytes. B-tree indexes, by contrast, are always fixed blocks of typically 4 KB or more. Since disk pages are also aligned in fixed blocks, the B-tree is better suited for sorting and retrieving data on disk instead of in memory. B-trees are structured in layers so that the lower levels represent the range of values between two values in the parent layer. As we saw earlier, in-memory indexes suffer from poor performance on range queries. B-trees do not. The _branching factor_ determines how wide each layer of the tree is before the range must be split down to a lower layer. For conventional databases with these 4 KB pages, you could have a branching factor of 500 and need to traverse down only 4 levels before you have stored 256 _terabytes_ of data. That's a flat and efficient tree! In terms of system design, how does this compare in performance to the in-memory solutions? 
* It is also scalable because the partitions are small, 4 KB pages. * The data structure inherently supports range queries as a result of the branching factor. * Fault tolerance is achieved through a write-ahead log (WAL), which records data changes before they are written to the B-tree (hence, write-_ahead_). * Locks on the B-tree called latches can be introduced to preserve consistency in an ACID-compliant manner, achieving a higher level of consistency than with a log-structured index. When comparing log-structured against B-tree indexes, the main rule of thumb is to **consider performance on reads or writes**. Though performance testing will confirm this for your implementation, log-structured indexes are thought to be better for writes. Conversely, B-trees are considered better for reads. This is because LSM-Trees turn writes into fast sequential appends but must frequently segment and compact their SSTables to stay efficient. B-trees, on the other hand, offer a more reliable indexing scheme because the keys are only written once and do not suffer from the extra writes that compaction incurs. The data is separated into small, predictable pages and does not require sizable computation for write compaction and segmentation. This is particularly useful for transactional databases where ACID guarantees are a must. That would explain why B-trees have stood the test of time for so long. ### Other indexes The book offers a few more indexing suggestions, such as secondary indexes for joins, clustered indexes where the value is a heap file rather than the raw row data, multi-column indexes like [R-trees](https://en.wikipedia.org/wiki/R-tree), and fuzzy-search indexes like those we talked about with ElasticSearch and Solr. Then some indexes are colocated with the whole database in memory, as is the case with Memcached or Redis. Now the entire thing, both the index and the raw data, is stored in memory. You only need a hard disk for fault tolerance and durability. 
Further, in-memory databases allow for more flexibility in the data structures they utilize. It is not uncommon to interface with a priority queue or set with something like Redis. ]]> 2021-07-12T22:33:00+00:00 Data models and query languages https://www.adamconrad.dev/blog/data-models-and-query-languages/ Thu, 08 Jul 2021 22:33:00 +0000 https://www.adamconrad.dev/blog/data-models-and-query-languages/ This article is part of the [Systems Design Series](/tag/systems-design/) following the book [_Designing Data-Intensive Applications_](https://dataintensive.net/) by Martin Kleppmann. If you missed the [previous article](/blog/scalability-reliability-maintainability/), check that out first. This chapter focuses on how you can organize and retrieve data from a database. It primarily focuses on the three primary data modeling paradigms: 1. Relational 2. Document 3. Graph ## Relational model This is the incumbent. The relation is the _table_. The individual datum within the table is the tuple called the _row_. Properties of a table exist on the _column_. You know, like any other table you would see in the real world. I grew up with these database systems. They are what I studied in college. My master's thesis was on RDBMS. They have been the dominant force in data management for most of the last 40 years. The most cited open-source examples of relational databases are **MySQL** and **Postgres**. They're still useful and are the first choice for many frameworks like Ruby on Rails or Django. It's just that they suffer from a few problems: * **They cannot scale without considerable effort.** They are much harder to replicate or partition than they should be. * **Limited expressiveness.** Querying and organizing relational schemas fits a specific mold. You have to be okay with that or else you'll have to seek alternatives. So what are the alternatives? 
## Document model and the rise of NoSQL NoSQL (which people would now call "Not Only SQL") describes any kind of database that doesn't follow the relational model. One thing I want to highlight is the term _polyglot persistence_. This refers to the idea that **you can have both relational and nonrelational databases working together**. It's not like you have to choose one paradigm over the other. In fact, it might be advantageous to use both for your application to serve different needs. I think this is important to call out because NoSQL implies that you're specifically _not_ choosing SQL. It's as though they can't work together in an application but in fact, they can and do. Consider a resume: in the relational model it is a series of joins between a user and their job experiences, while in the document model you nest those joins into one cohesive JSON model. That buys you: * **Just one thing to fetch.** The whole resume comes back in a single read. * **A natural representation.** It actually looks like the thing you're representing (a resume _is_ a document), though this also means many-to-one and many-to-many relationships don't work so well. * **Speed.** It requires no joins since it is self-contained (which also means poor support for combining disjointed data). * **High flexibility.** There is no federated schema (which also means no referential integrity). ### How do you know the document model is right for you? Are you working with documents like profiles, resumes, or HTML pages? It's great. If you can populate that whole document it's very efficient. If you have a lot of nested relationships or links to other documents it starts to become cumbersome. A list of jobs might be two separate tables in a relational model. In the document model, they will all be listed out. The exception is if you have created a separate document for all job types. At that point, you're heading in the wrong direction. One-to-many, tree-like data structure? Good. Many-to-many, graph-like data structure? Bad - stick with a relational or graph database. So when does something look like a document in the real world? 
The two often-cited examples I see are: * **CMS**. The primary object of a blog engine is a blog post, which is a document. The important thing is that it is _unstructured_. The content of the blog doesn't need to fit a particular format. * **E-commerce product catalog**. Imagine a service like Amazon. They have _tons_ of products. Other than the SKU, there is really no mandatory format for structuring how you display a product. This falls into the _large volumes_ of _unstructured_ data category which could be a good fit for a document model database. ### Mongo is to Python as Postgres is to Java This analogy just kept swimming in my brain. One of the key benefits (and drawbacks) of a document database is its loose constraints on schemas. The book makes an interesting analogy in that it's like runtime compilation: no schemas are validated on writes on the way into the database, but you can validate a schema on read. This means you can only be assured of the structure of the document when it is being read from the database. #### SQL is to CSS as JSON is to the DOM In a similar but not nearly as clean analogy, we can further think of the paradigms of database programming languages. Relational databases use declarative SQL like CSS. They optimize around _patterns_ rather than _algorithms_ because you just _declare_ how you want to filter out your data and you let the language choose the algorithm. This allows for terse, parallelizable code where the language can handle performance optimization and you focus on just getting your data. Since NoSQL databases don't necessarily use SQL, their method of storage and retrieval revolves around their documents, commonly stored in formats like XML and JSON. In theory, you can use any old imperative programming language like JavaScript to interact with the database. This is nice since you are programming UIs and business logic in the same programming paradigm.
The downside is that your code is likely more verbose, uglier, and requires a specific algorithmic sequence to execute. Think of it like styling a page with JavaScript; you could - but why would you do that to yourself? ### Writing is not my strength... Reads are the name of the game here. **Small, unmodified documents are best.** Anything else and you risk heavy performance penalties. Documents get rewritten as a whole, so minor updates to big documents are a huge drain on performance. The benefit of having all of the data shoved into one document is to remove the need for complicated things like joins. The drawback is that without joins you have to colocate data. This can become expensive if the document becomes large. ### ...and it may not matter Postgres supports JSON columns. Mongo supports foreign references in a weird sort of join operation. The truth is, the document and relational models may not be all that different soon so you aren't locked into one idiom between the major database providers anymore. The book goes into an example of this with the MapReduce extension on MongoDB. I didn't think the example was worth stating here but the last sentence on this topic is a great summary of what I took away from this section: > The moral of the story is that a NoSQL system may find itself accidentally reinventing SQL, albeit in disguise ## Graph models Remember how earlier I said one-to-many relationships were a good fit for document databases and many-to-many were good for graph or relational databases? Now let's pick that apart further. If your many-to-many relationship is simple, you can probably get away with a relational database. **For complex many-to-many relationships, a graph database is a good choice.** What's an example of a complex many-to-many relationship? You already know canonical examples: * **Social networks.** People know lots of other people. Just ask [Kevin Bacon](https://en.wikipedia.org/wiki/Six_Degrees_of_Kevin_Bacon).
* **The web.** Web crawlers are designed to demystify the tangled mess of connecting and indexing every website. * **Transportation.** Roads, railroads, and airports all connect points across the globe. This is a [hard problem](https://online.stanford.edu/courses/soe-ycs0012-shortest-paths-revisited-np-complete-problems-and-what-do-about-them) and one I've [explored before](/blog/np-completeness/). To be honest, I found this section to be a bit dense on specific kinds of graph databases that may not be relevant today, so I'll gloss over each section with the critical parts I took away. ### Property graphs and Cypher [Neo4j](https://neo4j.com/) is _the_ example I see all over the web on graph databases. I think if you just knew this one and [Amazon Neptune](https://aws.amazon.com/neptune/) (which seems to support all kinds of graph database models) you'd be fine. **This is how I would imagine you would design a graph data structure at scale**. There are two main objects in a property graph DB: the `vertices` and the `edges`. Like a database, each of those things has a UUID. They also have information as to where they go and who uses them. And as the _property_ name suggests, each of these objects also has metadata in the form of key-value pairs. The power here is in the _flexibility to build graphs_ and the _tagging you can store to find your vertices and edges_. Property graphs don't restrict how you construct your graph. **Graphs don't require a schema other than the basic vertex and edge**. So you're free to design it however you like. The power of the model is in _how you relate things_. These key-value tags make it easy to find information in the graph _without ruining your data model_. It also means **you don't have to traverse a graph to find information**. Tags and IDs allow you to quickly look up vertices or edges without having to go through all of the graph connections. They're like the shortcuts to traveling across the network.
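To make the vertex/edge/property structure concrete, here is a toy sketch in Python. All names are illustrative (plain data structures, not Neo4j's actual API); the `find` method shows the tag-index idea described above, jumping straight to tagged vertices without traversing any edges:

```python
from dataclasses import dataclass, field
from uuid import uuid4

# Toy property graph: vertices and edges each get a UUID plus free-form
# key-value properties; a (key, value) index lets us jump straight to
# tagged objects instead of walking the graph.

@dataclass
class Vertex:
    properties: dict
    id: str = field(default_factory=lambda: str(uuid4()))

@dataclass
class Edge:
    source: Vertex
    target: Vertex
    label: str
    properties: dict = field(default_factory=dict)
    id: str = field(default_factory=lambda: str(uuid4()))

class PropertyGraph:
    def __init__(self):
        self.vertices, self.edges = [], []
        self._by_tag = {}  # (key, value) -> list of vertices

    def add_vertex(self, **props):
        v = Vertex(props)
        self.vertices.append(v)
        for kv in props.items():
            self._by_tag.setdefault(kv, []).append(v)
        return v

    def connect(self, a, b, label, **props):
        e = Edge(a, b, label, props)
        self.edges.append(e)
        return e

    def find(self, key, value):
        # Index lookup -- no graph traversal needed
        return self._by_tag.get((key, value), [])

g = PropertyGraph()
alice = g.add_vertex(name="Alice", likes="jazz")
bob = g.add_vertex(name="Bob", likes="jazz")
g.connect(alice, bob, "FRIENDS_WITH", since=2012)
print([v.properties["name"] for v in g.find("likes", "jazz")])  # ['Alice', 'Bob']
```

Adding a new property key (say, `tv_show`) later requires no schema change at all, which is exactly the evolve-without-ruining-the-data point.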
**If you need to evolve your features it doesn't ruin the data**. The key-value pairs allow you to create new associations between your core objects without your graph fundamentally having to change. Sounds like Facebook would be a good example of this: people are your vertices and their connections are the edges. In the old days of Facebook, you could list out your favorite bands or movies. A property graph database would allow you to add something like TV shows without having to totally screw up your existing graph or its associations. **Cypher is to property graphs as SQL is to relations**. That's all you need to know here. It's a declarative programming language for Neo4j in the same way SQL is a programming language for databases like MySQL, Postgres, MariaDB, and SQLite. ### Triple-store graphs and SPARQL This is another graph database variant. Instead of vertices and edges as the primary objects, a triple-store graph uses, you guessed it, three objects: `subjects`, `predicates`, and `objects`. This one intuitively makes less sense to me than a property graph because I don't think of a graph as I think of an English sentence. Here is how you make sense of these three things: * **Subjects are always vertices**. They are nouns: things like _people_ and _fruit_. * **Objects and predicates have 1 of 2 states**. 1. **The object is another vertex**. Then the _predicate is the edge_ connecting `subject` and `object` together. 2. **The object is a primitive type**. Then the _predicate is a property_ and the `object` is its value on the `subject`. The book lists [Datomic](https://www.datomic.com/) as an example of a triple-store graph, but it doesn't market itself as a graph database; rather, it's an amalgamation of many flexible data models. For the practical purposes of this audience, you’re unlikely to run into a scenario like this. **SPARQL is to triple-store graphs as Cypher is to property graphs**.
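The subject/predicate/object breakdown above can be sketched with plain tuples. This is a toy illustration, not RDF or Datomic's actual model; the `query` function loosely mimics a SPARQL-style basic graph pattern, with `None` acting as a wildcard:

```python
# Toy triple store: every fact is a (subject, predicate, object) tuple.
# When the object is another entity, the predicate acts as an edge;
# when the object is a primitive, the predicate acts as a property.

triples = [
    ("alice", "knows", "bob"),   # object is another vertex -> edge
    ("alice", "age", 34),        # object is a primitive -> property
    ("bob", "knows", "carol"),
]

def query(s=None, p=None, o=None):
    """Match triples against a pattern; None is a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

print(query(s="alice", p="knows"))  # [('alice', 'knows', 'bob')]
```

The same `knows` predicate serves as an edge for every person, which is why triple stores read like a pile of English sentences rather than an explicit graph structure.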
SPARQL is the programming language for triple-store graph databases using the [RDF data model for the semantic web](https://www.w3.org/RDF/). Given that the Semantic Web never really took off, it's all the more reason not to invest more than a few paragraphs into understanding these things. ## Further reading and study Since this was a much meatier chapter, I found I had some additional notes from the previous [walk-through](https://www.youtube.com/watch?v=bUHFg8CZFws) that were relevant here. There is another video [in that same series](https://www.youtube.com/watch?v=iJLL-KPqBpM) I used for practice this week. I found myself referencing a few articles which may be helpful in understanding NoSQL databases: * https://shopify.engineering/five-common-data-stores-usage - Shopify did a nice job of simply explaining the use cases for five different kinds of data stores. The segmentation is a bit odd here and it doesn't align with what I've read in the book but it's still factual information even if the categorization is faulty. * https://www.prisma.io/dataguide/intro/comparing-database-types#relational-databases-working-with-tables-as-a-standard-solution-to-organize-well-structured-data - This Prisma article was great because it expanded my knowledge even further beyond NoSQL databases to NewSQL, multi-model, time series, and more. * https://blog.nahurst.com/visual-guide-to-nosql-systems - I like that this post has a diagram that revolves around the CAP theorem. It helped me place where NewSQL fits and how to actually tell the difference between HBase and Cassandra. * https://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis - Another mega list of NoSQL solutions which categorizes based on popularity rather than storage type. I actually really liked this because it simplified my thought process to a pretty rock-solid list: 1. **Want relational?** Use Postgres. 2. **Want an in-memory key-value store?** Use Redis. 3.
**Want relational without the schema?** Use MongoDB. 4. **Want web analytics?** Use Cassandra. 5. **Want a full-text search?** Use ElasticSearch. 6. **Working with documents, specifically a CMS or CRM?** Use CouchDB. 7. **Building a search engine or log analysis?** Use HBase. 8. **Working with graphs?** Use Neo4J or OrientDB. ### System design tradeoffs between SQL and NoSQL solutions A few questions you have to ask yourself when comparing SQL and NoSQL solutions as they relate to scalability: #### How do you scale reads? Writes? For SQL solutions, you'll want to shard servers when the database runs out of space. Use a cluster proxy to abstract away the shards. Use a configuration service like ZooKeeper to ensure all shards are online and be sure to toggle backups in the event of a downed shard. For NoSQL solutions, you'll also want to perform sharding. However, there is no need for a cluster proxy because each shard knows about the others. Thus, no configuration service is needed. #### How do you make reads/writes fast? For SQL solutions, you'll want to use a shard proxy to cache results, publish metrics, and terminate long-running processes. For NoSQL solutions, no shard proxies are required because all nodes know about each other. Quorum nodes are used to ensure you don’t need to talk to every node to get a consensus on whether or not an operation succeeded. #### How do you prevent data loss? For both SQL and NoSQL solutions, you'll want to create read replicas of shards in different data centers. #### How do you achieve consistency? For SQL solutions, you'll want to sync leader data to follower replicas. This yields consistent data, but it takes longer to process. For NoSQL solutions, data is pushed to the read replicas asynchronously. This provides eventual consistency and is fast.
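The quorum idea mentioned above can be illustrated with a toy sketch: with N replicas, requiring W write acknowledgments and R read responses such that R + W > N guarantees the read set and write set overlap on at least one replica, so a read sees the latest acknowledged write without talking to every node. This is illustrative only, not how Cassandra or Dynamo actually implement quorums:

```python
# Toy quorum: N replicas, W write acks, R read responses, with R + W > N
# forcing at least one replica to appear in both sets.

N, W, R = 5, 3, 3
replicas = [{"value": None, "version": 0} for _ in range(N)]

def quorum_write(value, version):
    # Real systems accept acks from any W replicas; writing to the
    # first W is enough to illustrate the overlap argument.
    for r in replicas[:W]:
        r.update(value=value, version=version)

def quorum_read():
    # Read the *last* R replicas (the worst case for overlap) and
    # keep the highest version seen.
    return max(replicas[-R:], key=lambda r: r["version"])["value"]

quorum_write("v1", version=1)
print(quorum_read())  # prints "v1": the overlap replica has the write
```

Drop R or W so that R + W <= N and the guarantee disappears: a read quorum could then consist entirely of replicas that never saw the write.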
#### Other questions worth asking * **How do you recover data in case of an outage?** Data recovery is much harder for a relational system that sacrifices partition tolerance, so you'll need to focus more on cold storage backups, while NoSQL solutions make it much easier to replicate data across data centers. * **How do you ensure the data is secure?** I only saw mentions of Accumulo and Riak specializing in security. Otherwise, every database needs to worry about this regardless of storage type. * **How extensible is the DB to an evolving data model?** Evolving data models are better suited for a graph database than a document database. * **How easy is it to run on/off-prem?** This question was asked in the systems design video from last week but I don't think most companies really need to worry about on or off-premises solutions anymore. * **How much will it cost?** Most solutions these days are open-source. The cost will come down to your strategy for your horizontal scaling and partitioning.
]]> 2021-07-08T22:33:00+00:00 Scalability, Reliability, and Maintainability https://www.adamconrad.dev/blog/scalability-reliability-maintainability/ Mon, 28 Jun 2021 15:28:00 +0000 https://www.adamconrad.dev/blog/scalability-reliability-maintainability/ * **Relational:** good for OLTP, random reads, random writes, and write-heavy interactions (Postgres, MySQL) * **Column:** good for OLAP, heavy reads on few columns, and few writes (HBase, Cassandra, Vertica, Bigtable) * **Document:** good for data models with one-to-many relationships such as a CMS like a blog or video platform, or when you're cataloging things like e-commerce products (MongoDB, CouchDB) * **Graph:** good for data models with many-to-many relationships such as fraud detection between financial and purchase transactions or recommendation engines that associate many relationships to recommend products (Neo4J, Amazon Neptune) * **Key-Value:** essentially a giant hash table/dictionary, good for in-memory session storage or caching e-commerce shopping cart data (Redis, DynamoDB) 2. **Search indexes** (ElasticSearch, Solr) 3. **Caches** (Memcached, Redis) 4. **Batch processing frameworks:** good if data delay can be several hours or more and you can store all the data and process it later (Storm, Flink, Hadoop) 5. **Stream processing frameworks:** good if data delay can only be several minutes and you need to store data in aggregate while processing data on-the-fly (Kafka, Samza, Flink) ### Systems design in a nutshell Approaching systems design falls into 5 steps: 1. **Functional requirements** are designed to get us to think in APIs where we translate sentences into verbs for function names and nouns for input and return values. 2. **Non-functional requirements** describe system qualities like scalability, availability, performance, consistency, durability, maintainability, and cost. 3. **High-level designs** present the inbound and outbound data flow of the system. 4.
**Detailed designs** are for the specific components you want to focus on. Focus on the technologies you want to use for the data you want to store, transfer, process, and access. 5. **Bottlenecks & tradeoffs** ensure we know how to find the limits of our designs and how we can balance solutions since there is no one singular correct answer to an architecture. ### Requirements gathering When asking questions regarding a product spec for a large-scale system, focus on these 5 categories of questions: 1. **Users:** Who are they? What do they do with the data? 2. **Scale:** How many requests/sec? Reads or writes? Where is the bottleneck? How many users are we supporting? How often/fast do users need data? 3. **Performance:** When do things need to be returned/confirmed? What are the tolerance and SLAs for constraints? 4. **Cost:** Are we optimizing for development cost (use OSS) or operational/maintenance cost (use cloud services)? 5. **CAP theorem:** Partitioning is something you know you'll have to account for with highly-scalable systems. So it may be easier to ask what is more valuable: consistency or availability? **If consistency is most important**, consider an ACID database like Postgres or even a NewSQL database like CockroachDB or Google Spanner. **If availability is most important**, consider a BASE database like an eventually consistent NoSQL solution such as CouchDB, Cassandra, or MongoDB. Even better is to [use this diagram](https://blog.nahurst.com/visual-guide-to-nosql-systems) to map your concerns onto a pyramid. Given that you can only ever expect 2 of the 3 parts of the CAP theorem to be satisfied it might actually be better to ask _which property is least important?_ If it's... 
* **Consistency** - most NoSQL solutions like Cassandra, CouchDB, and Amazon Dynamo will work * **Availability** - some NoSQL solutions and some NewSQL solutions like Bigtable, HBase, MongoDB, Google Spanner, and Redis * **Partition tolerance** - any relational or graph solution like Postgres or Neo4j will work since these are notoriously difficult to partition compared to the other solutions That said, everyone likely [misunderstands the CAP theorem](http://pl.atyp.us/wordpress/?p=2521), so I would read this a few times and internalize the example. ### The three system qualities in 1 line This chapter can effectively be summarized in 3 sentences: 1. **Scalability** determines if this system can _grow_ with the growth of your product. The best technique for this is _partitioning_. 2. **Reliability** determines if this system produces _correct results_ (nearly) each and every time. The best techniques for this are _replication_ and _checkpointing_. 3. **Maintainability** determines if this system can _evolve_ with your team and is easy to understand, write, and extend. ### Further reading and study As I said before, this is a pretty simple chapter. I also watched [this systems design walkthrough](https://www.youtube.com/watch?v=bUHFg8CZFws). This video extended these concepts and informed some of these notes. I like to accompany learnings with practice to seed new questions for our own [interview process](https://indigoag.com/join-us#openings). [This article](http://highscalability.com/youtube-architecture) on YouTube's architecture further reinforces the sample problem on the YouTube video (how meta). You can check your solution against the one that was really used by YouTube. Finally, you can rifle through a bunch of [these videos](https://www.youtube.com/playlist?list=PLMCXHnjXnTnvo6alSjVkgxV-VH6EPyvoX) fairly quickly as each touches on a small subset of system design techniques.
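The partitioning technique called out under scalability can be sketched as simple hash-based sharding. This is a toy illustration (function and shard names are mine); production systems typically use consistent hashing so that changing the shard count doesn't reshuffle nearly every key:

```python
import hashlib

# Toy hash-based sharding: each key deterministically maps to one of
# num_shards partitions, spreading load across servers as data grows.

def shard_for(key: str, num_shards: int) -> int:
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

users = ["alice", "bob", "carol", "dave"]
for u in users:
    print(u, "-> shard", shard_for(u, 4))
```

The same hash always lands a key on the same shard, which is what lets any node route a request without a central lookup table.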
Check in next week for a summary of Chapter 2 of the book: _Data Models & Query Languages_! ]]> 2021-06-28T15:28:00+00:00 Effective messaging for candidates https://www.adamconrad.dev/blog/effective-messaging-for-candidates/ Wed, 16 Jun 2021 20:28:00 +0000 https://www.adamconrad.dev/blog/effective-messaging-for-candidates/ 2021-06-16T20:28:00+00:00 How to manage remotely and stay sane https://www.adamconrad.dev/blog/how-to-manage-remotely-and-stay-sane/ Mon, 14 Jun 2021 12:34:00 +0000 https://www.adamconrad.dev/blog/how-to-manage-remotely-and-stay-sane/ 2021-06-14T12:34:00+00:00 Becoming an engineering manager https://www.adamconrad.dev/blog/becoming-an-engineering-manager/ Wed, 09 Jun 2021 21:00:00 +0000 https://www.adamconrad.dev/blog/becoming-an-engineering-manager/ **Schedule a handover meeting with the current team manager**. Items covered should include recent feedback, projects, individual goals, and any shared documents or performance reviews most recently given. * **Schedule 1:1 meetings with team members ([Template](https://getlighthouse.com/blog/one-on-one-meetings-template-great-leaders/)) ([First 1-1 Questions](https://larahogan.me/blog/first-one-on-one-questions/))** * **Update your team and organization wiki pages to reflect the new reporting structure** * **Request access to any team chat channels you should be in** * _Create a README of how you work_ * _Read [Transitioning to management as a Technical Lead Manager](https://adamconrad.dev/blog/technical-lead-management/)_ ### Your First Month * **Focus on 1-1s: ask questions, build rapport, and discuss career development plans with each direct report ([Questions Reference](https://getlighthouse.com/blog/one-on-one-meeting-questions-great-managers-ask/))** * **Review your hiring process** * _Define or review measurable goals for your management role with your manager in a 1-1_ * _Modify this document to incorporate learnings you’ve had as a new engineering manager_ #### Recommended Reading List -
_The Effective Manager_ by Mark Horstman (and the accompanying _Manager Tools_ podcast) - _Resilient Management_ by Lara Hogan - _High Output Management_ by Andy Grove - _The 27 Challenges Managers Face_ by Bruce Tulgan (particularly for conflict resolution) - _Accelerate_ by Nicole Forsgren, Jez Humble, and Gene Kim - _An Elegant Puzzle_ by Will Larson - _Extreme Ownership_ by Jocko Willink (and his Jocko Podcast) - _The Making of a Manager_ by Julie Zhuo - _The Manager’s Path_ by Camille Fournier - _The Talent Code_ by Daniel Coyle ### Days 31-60 * _Schedule a coffee chat with your recruiting team_ * _Begin incorporating feedback into your 1-1s ([podcast on 1-1s & feedback](https://www.manager-tools.com/2009/10/management-trinity-part-2))_ ### Days 61-90 * _Begin incorporating coaching & delegation into your 1-1s ([podcast on coaching & delegation](https://www.manager-tools.com/2009/10/management-trinity-part-3))_ * _Work with your manager on decision-making techniques to effectively delegate work and get more out of your direct reports ([Template](https://www.groupmap.com/map-templates/urgent-important-matrix/))_ ### 3 Months and Beyond * _For Technical Lead or Interim Managers: Decide with your manager if you want to continue down technical leadership (e.g. Staff engineer, Principal engineer) or transition to professional leadership (e.g. 
Engineering Manager)_ * _Walk through your first quarterly goal setting for your directs with your manager ([Template](https://docs.google.com/document/d/1p7-Jo45VAw-RTUK97p6r9G0y0RtbmRxIlRntrmcfwHM/edit#))_ * _Learn about the performance review process at your company in your manager 1-1_ ]]> 2021-06-09T21:00:00+00:00 How to find great engineers https://www.adamconrad.dev/blog/how-to-find-great-engineers/ Fri, 04 Jun 2021 23:29:00 +0000 https://www.adamconrad.dev/blog/how-to-find-great-engineers/ The best way to convince developers to join is to be genuinely interested in them and show them that the role you are hiring for is __perfect for them__. ## First contact * **Set up a 30-minute phone call ASAP**. Since you’ve posted to so many communities, it can be difficult to respond and check in on them. _Encourage folks to email you to schedule a time to chat on the phone and engage with them personally_. * **Keep the phone call informal**. You really only need to focus on 4 things: * What your company is about * What _your team_ is about and how it fits into your company's grand vision. * Learn more about the candidate, what they’re looking for, and why they’re interested in you * A brief overview of the hiring process (this leads them into the next step by encouraging them to start your interview process). ## Interview unconventionally You’ve invested a lot of effort into hiring. You may not have implemented all of those strategies but **the most important thing about your interview process should be to stand out from the crowd and wow your candidates with a pleasant & fair interview experience**. **People _loathe_ contrived interview problems** (e.g. reverse a linked list in memory, traverse a 2D array in a spiral). They especially hate when those problems have nothing to do with the actual work they will be doing. So don’t ask them! **People _love_ practical questions that test the skills they use**. Interviewing someone on the UI layer?
Ask them to implement a mockup. Need an architect? Have them design Twitter. If you engage candidates with problems they will really encounter, you’ll learn their true capabilities and they, in return, will feel properly evaluated (read: happy with the interview experience). **People are _impressed_ by non-traditional approaches**. I've used tiny take-home pull request exams to engage an often overlooked part of every developer’s core job: reading code. Most interviews require developers to write code, but few ask them to read code. ## Related articles * https://slack.engineering/refactoring-backend-engineering-hiring-at-slack-b53b1e0e7a3c * https://angel.co/blog/how-to-hire-world-class-engineers * https://www.freecodecamp.org/news/hiring-right-an-engineers-perspective-on-tech-recruiting-7ee187ded22d/ * https://triplebyte.com/blog/companies/outcompete-faang-hiring-engineers * https://www.intercom.com/blog/how-we-hire-engineers-part-1/ * https://www.intercom.com/blog/how-we-hire-engineers-part-2-culture-contribution/ * https://www.intercom.com/blog/inbound-recruiting-engineers/ ]]> 2021-06-04T23:29:00+00:00 Job hunting as an engineering manager https://www.adamconrad.dev/blog/job-hunting-as-an-engineering-manager/ Mon, 31 May 2021 11:04:00 +0000 https://www.adamconrad.dev/blog/job-hunting-as-an-engineering-manager/ How do you decide when and how to leave your company and be confident that you’re making the right decision? As I've mentioned in a previous article, you don't just wake up one day and say to yourself "I think I'll [leave my job](/blog/how-to-do-an-exit-interview) today!" It's a long, painful process. It's the culmination of a lot of smaller irritations, frustrations, and setbacks that leave you longing for something more on a greener pasture. And that doesn't even cover the interviews. Interviewing sucks. I won't deny that. 
But changing your mindset on [how to approach technical interviews](/blog/why-hiring-is-broken-and-how-im-dealing-with-it) can make all of the difference. Gergely wrote a [great article](https://blog.pragmaticengineer.com/preparing-for-the-systems-design-and-coding-interviews/) recently on what to study and how to study for the range of engineering management interview questions. It's a complete, exhaustive list of resources, and it covers exactly the references I would list here, so there's no reason to duplicate those efforts. Just read that article and actually do all of the stuff he recommends. Yes, it will take a long time (like, months). And yes, it is difficult, painful, and arguably a giant waste of time. But as I've told lots of people before: interviewing is a skill just like anything else that you can master. **All that passing an interview proves is that you are good at interviewing**. I like to compare the technical interview process to the SAT/ACT/IB standardized exams. The best way to "win" at the SATs is to answer a lot of SAT questions. And all that the SATs really prove is that you're good at the SATs. They [don't predict success in college](https://www.forbes.com/sites/nickmorrison/2020/01/29/its-gpas-not-standardized-tests-that-predict-college-success/) and there's an argument to be made for getting rid of them altogether. See the similarities now? Regardless, neither system is going away any time soon. So you have two choices: don't play the game and limit your options and upside, or play the game even if you don't like the rules. With companies paying [outlandish salaries now](https://www.levels.fyi) for a field that doesn't require formal credentials or schooling, this could not be a better time to get out there and find your [perfect role](https://indigoag.com/join-us#openings). I would argue that even if you don't like the system we have, you're leaving hundreds of thousands of dollars per year on the table by not playing along.
Study hard, get your name out there, and buy me a coffee when this post convinces you to get that job that's going to make you an extra $80k per year. ]]> 2021-05-31T11:04:00+00:00 It's not hard to care https://www.adamconrad.dev/blog/its-not-hard-to-care/ Fri, 21 May 2021 11:04:00 +0000 https://www.adamconrad.dev/blog/its-not-hard-to-care/ 2021-05-21T11:04:00+00:00 On business value https://www.adamconrad.dev/blog/on-business-value/ Mon, 17 May 2021 11:04:00 +0000 https://www.adamconrad.dev/blog/on-business-value/ How do you showcase the business value of the work that your team is doing? There's something that gives me pause about this question having the adjective `business` in there. If you're building a product, you're adding value to the business. What value is not business value in this context? Or to put it another way: **what is valuable that _isn't_ business value for professional code?** So I will answer this question with the broader aim to **demonstrate the value of the work you do**. The easiest way to do this is to [showcase your impact](/blog/how-to-demonstrate-impact) on the organization. The best way to automate this is to set up regular OKRs every quarter that relate to business KPIs (e.g. revenue, monthly active users, customer retention). Showing impact with numbers and figures against business targets is the easiest way to bridge the gap with business folks. The more difficult, laborious way to do this is to compare the value you and your teams have generated against how much [you and your teams cost](/blog/how-to-justify-your-cost-to-a-client). Really big companies like Facebook know **how much a developer contributes to the company bottom line** - both in revenue and the number of users they impact. That's a huge draw for some developers: imagine if you alone could make a change that affects _millions_ of people!
Now what they don't tell you is that your changes probably amount to altering a button from blue to a slightly more [convincing shade of blue](https://blog.ultimate-uk.com/googles-41-shades-of-blue). So if you want to work on something that will actually move the needle on fixing the planet we live on, [we're hiring](https://indigoag.com/join-us#openings). ]]> 2021-05-17T11:04:00+00:00 Are middle managers useful? https://www.adamconrad.dev/blog/are-middle-managers-useful/ Mon, 10 May 2021 11:04:00 +0000 https://www.adamconrad.dev/blog/are-middle-managers-useful/ As a middle manager, you have several teams with dedicated managers. If you have good managers, what is your value? As a middle manager, I hope I'm useful! But I see where this question is coming from: _if you have line managers coaching the developers, what do you need a manager of managers for_? I think this question suffers from an assumption that there are only two kinds of managers: those who lead people and executives who call the shots. In reality, there are three kinds of managers and we know this to be true because it exists in the military: junior-grade, field-grade, and general-grade officers. To relate this to engineering management: * Engineering Manager is a base-level manager who runs a team (military company). This is a junior officer (e.g. Lieutenant) * Senior Engineering Manager is an experienced manager who runs larger (or multiple) teams/companies. This is a senior officer (e.g. Captain) * Director is a manager of managers (and thus teams). This is a field-grade officer (e.g. Major) * Senior Director is an experienced director. This is a senior field-grade officer (e.g. Colonel) * Vice President manages directors (and thus whole divisions). This is a general-grade officer (e.g. Brigadier General/Rear Admiral) * Senior Vice President is an experienced VP. This is a senior general-grade officer (e.g.
Major General/Lieutenant General/Vice Admiral) * Executive Vice President is a top VP, and is often interchangeable with CTO. This is the highest-ranked officer (e.g. General/Admiral) **Middle managers exist to _set direction_ and _coach line managers_.** For larger organizations, the gap is too large between the executives/generals and the line managers/lieutenants. So there need to be folks who can relate to the junior managers who aren't so busy building visions and setting budgets with many commas after them. There are [lots of great books](/blog/teach-yourself-engineering-management) on being a great director. The first two I would recommend are [An Elegant Puzzle](https://www.amazon.com/gp/product/B07QYCHJ7V/ref=x_gr_w_glide_sout?caller=Goodreads&callerLink=https://www.goodreads.com/book/show/45303387-an-elegant-puzzle&tag=x_gr_w_glide_sout-20) and [The Manager's Path](https://www.amazon.com/Managers-Path-Leaders-Navigating-Growth/dp/1491973897). The first one has been my playbook as a director while the second book covers the full manager progression all the way up to VP and has a great section on directors. Check out both of those books and remember to **coach managers** and **set direction for all of your teams** and you will be extremely valuable to any organization! ]]> 2021-05-10T11:04:00+00:00 On delegation https://www.adamconrad.dev/blog/on-delegation/ Mon, 03 May 2021 11:04:00 +0000 https://www.adamconrad.dev/blog/on-delegation/ I was recently promoted to an Engineering Manager role. How do I turn off my technical know-how to move to a leadership role? First of all: congratulations! It's hard to break into the role of engineering manager, so great job and welcome to the club! Second of all: know that you are not alone. This is a _very common_ refrain from new managers. In fact, one of the most memorable pieces of writing I've ever seen on the topic was an Ask HackerNews thread about switching to management.
The very top comment was warning new managers to **delegate more and code less** and how this will not come instinctively to you as a new manager. It's really easy to revert to what you know; to roll up your sleeves, fire up your terminal and editor, and get to coding the problem when everyone else cannot. **Resist the urge to code whenever possible.** It's extremely counter-intuitive. You've trained, possibly for your entire career, to solve problems with code. Now I'm telling you to take all of that training and throw it in the trash. How is that possibly a good idea? I've made this analogy [before](/blog/technical-lead-management) with a whole section on delegation. To summarize: the higher up the food chain you go, the rarer your skill sets become. A CTO can do all of the things you can do but needs to focus on only the things she can do, like hiring executives and setting executive direction and vision for the engineering teams. By extension, you as an engineering manager now have a set of skills that separate you from your previous self as a developer. Therefore, let your developers do development things because only you can do certain manager things, like hiring, firing, promoting, setting technical direction, and resolving conflicts. **Focus on the things that make you valuable and differentiated so others can do what they are best at**. It may be uncomfortable at first but it is _necessary_ for your role as you transition into management. ]]> 2021-05-03T11:04:00+00:00 Selling ideas https://www.adamconrad.dev/blog/selling-ideas/ Wed, 28 Apr 2021 11:04:00 +0000 https://www.adamconrad.dev/blog/selling-ideas/ When you introduce a new project or idea to a team of developers, how do you sell that to the team to get them excited about it? Believe it or not I used to work in sales. For a little while I ran my own [consulting firm](https://anonconsulting.com) full-time (I still accept clients but for small, one-off consulting reports and advice). 
Running your own services business is arguably more sales than it is actual service development. So when folks ask me about sales I point them to the classic book [SPIN Selling](https://www.amazon.com/SPIN-Selling-Neil-Rackham/dp/0070511136). It's for sales people by sales people. But Adam, I'm an engineering manager. Why would I need to read a book about traditional sales to convince people to buy things? **Because everything is sales**. You sell yourself when you interview for a job or go on a date. You sell your products or [services](https://anonconsulting.com) if you've ever run a business. And most commonly, you sell _ideas_ to your teammates and stakeholders to convince and influence people to believe in the things [you believe in](/blog/dont-call-it-tech-debt). So who better to learn sales from than actual sales people? If you ever have the chance to shadow sales people at your current company, **do it**. You will not only build rapport with folks in other departments that you normally wouldn't, you'll also learn a ton about how to influence and persuade people. **Selling ideas is all about persuasion and influence.** There are books for those as well but I've got a [whole curriculum for that](/blog/teach-yourself-engineering-management/). To get you started, you really only need to remember what the SPIN acronym actually stands for: * **Situation**: where are you today? What's the current "state of the union"? * **Problem**: what are the current pain points and frustrations that your new project is aiming to solve? * **Implication**: what is the value of your solution? What would it look like if these frustrations and problems went away as a result of your solution being implemented? * **Need-Payoff**: how important or urgent is it for you to solve this problem? What are the benefits of building this solution for customers?
If you can **construct a narrative** that incorporates all four of those kinds of questions into the new project or idea you're bringing to your teams then you are much more likely to get them energized and excited to make a real difference for your customers. ]]> 2021-04-28T11:04:00+00:00 Can you grow too quickly? https://www.adamconrad.dev/blog/can-you-grow-too-quickly/ Wed, 21 Apr 2021 11:04:00 +0000 https://www.adamconrad.dev/blog/can-you-grow-too-quickly/ If you have a small team with exponential growth, how do you make sure that the quick growth doesn’t kill the product or team? When I think of small engineering teams and hyper-growth companies I think of [Instagram](https://review.firstround.com/how-instagram-co-founder-mike-krieger-took-its-engineering-org-from-0-to-300-people). I've actually been thinking about this company a lot lately. ## A brief aside on premature optimization We've been struggling with organizational (and architectural) complexity at my job recently. Everything takes too long to build and we have seemingly too many folks tackling a few too many things at once. One of the ways we were able to convince stakeholders to simplify our architecture (and hopefully simplify our org design) was to reference the story of Instagram's engineering team. TL;DR: Instagram scaled to 30,000,000 users and sold to Facebook for $1B with only **six engineers**. That's ludicrous. That story is now burned into my brain. From this point forward, if anyone tells me we need to scale up or build another microservice I'm going to reference this article on how Instagram could do it on far less. **So to answer the actual question: the only way I could see a fast-growing, successful product failing is to pre-optimize and scale it before it really needs to**.
]]> 2021-04-21T11:04:00+00:00 On productivity https://www.adamconrad.dev/blog/on-productivity/ Wed, 14 Apr 2021 11:04:00 +0000 https://www.adamconrad.dev/blog/on-productivity/ As leaders, how do you calculate the productivity of team members? This one is a slippery slope. I've tried using things like the [DORA metrics](https://cloud.google.com/blog/products/devops-sre/using-the-four-keys-to-measure-your-devops-performance). I've tried more insidious calculations like lines of code, pull requests made, or bugs solved, but those all optimize for the wrong thing: more stuff. More stuff doesn't make you more productive; it just means you're doing more stuff with your day. In a recent bout of 1-1s I asked every direct this question: **Who do you think I'm more likely to promote: the person who spent 2 hours making the company $3MM with a single code change or the person who spent all year working hard to get all of our tests to 100% code coverage?** Everyone picks the first person because they know how productivity is really measured: **in impact**. Lines of code, PRs merged, bugs squashed...these are all vanity metrics that are easy to track but don't really measure what you're after. Instead, encourage folks to write something like a [promotion packet](https://staffeng.com/guides/promo-packets/) on the impact they've made over the last year. A bit of self-reflection will make it easy to see whether or not a person is truly productive by any reasonably objective measure. Any other kind of measure of productivity is a red herring and a bad road to travel down that will only lead to micromanagement and missing the point. ]]> 2021-04-14T11:04:00+00:00 Building teams https://www.adamconrad.dev/blog/building-teams/ Wed, 07 Apr 2021 11:04:00 +0000 https://www.adamconrad.dev/blog/building-teams/ How do you build teams and how do you differentiate building a team from 0 to 1 vs. 1 to 100?
Full transparency: I've never hired a first engineer for a company. But I _have_ hired the first engineer for a team. Hiring the first engineer is easier if you have **other teams they can join.** Even if that team is building something completely different, just having a sense of **camaraderie** is a huge win for people to not feel isolated when they are the first to do something. It also helps to hire **a more senior engineer** because they are more capable of handling big, ambiguous projects from scratch. All of my new teams start with the most senior person and work their way down. It also makes it easier for newer, more junior developers to learn from more experienced folks when you are onboarding and training people to learn the ways of your new team(s). Full transparency, deux: I've never hired 100 engineers for a company either. I've hired 18 engineers from scratch for my teams and referred another 10 to other teams. I would say not much changes after the first squad of 4-6 is built. Getting those folks is key and once you have that central unit in place it's just a matter of **maintaining and expanding that culture with the same squad blueprint.** So if I'm looking at budget for the next year I try to **never hire a fractional or incomplete squad**. Obviously for some teams, budgets are too tight to fit into neat buckets like squads. But if you have the luxury of building out full units of teams then be sure to build out a whole and not a partial unit or else you're going to have trouble giving that squad **a central mission and objective to complete.** Each squad should have a vision and a purpose. If you don't have a complete squad to assemble then you don't have all the pieces you need to accomplish an objective. This will leave folks floundering between teams just to keep resources utilized on all of the things your folks still don't have time for on their own.
]]> 2021-04-07T11:04:00+00:00 Values https://www.adamconrad.dev/blog/values/ Wed, 31 Mar 2021 11:04:00 +0000 https://www.adamconrad.dev/blog/values/ What skills (as leaders) do you value the most? I'd say it's a different answer depending on the kind of person I'm working with. I mostly interact with 4 kinds of people in my day-to-day: * Developers * Engineering managers * Product managers * Designers Sure, there are a few fundamental skills that I covet regardless of profession: **problem solving ability**, **clear communication**, **tact**, and **logic** are on the top of that list. There are other things that aren't strictly skills, like kindness and empathy, but they are worth calling out because they are attributes of people I love to work with. For developers I'd add **coding** and **systems design**. We hire against those skills, and software development is a meritocratic profession that covets those core skills. Respect is held for those who can do, not just say. For other engineering managers I value **deduction** and **intuition**. Management is more art than science so I really appreciate managers who just _get it_. A lot of that comes from experience and humility. I appreciate all of those things. A product manager's best skill is **negotiation** and dare I say **charisma**. They have so many stakeholders and so many hats to wear that everyone vies for their attention and priority. A great product manager can make everyone feel like they are the most important person in the room even if they have to make their needs a lower priority. With designers I'd double down on **empathy**. Much of a designer's job is based on user research and conceptualizing things for other humans. Empathy is even more important if your job is all about researching what your end customer needs, wants, and desires. For everyone else, fall back on the fundamental skills. Kind, independent thinkers will never go out of style.
]]> 2021-03-31T11:04:00+00:00 You only need to be in two kinds of meetings https://www.adamconrad.dev/blog/two-kinds-of-meetings/ Wed, 24 Mar 2021 11:04:00 +0000 https://www.adamconrad.dev/blog/two-kinds-of-meetings/ 2021-03-24T11:04:00+00:00 Are exit interviews a waste of time? https://www.adamconrad.dev/blog/how-to-do-an-exit-interview/ Wed, 17 Mar 2021 11:04:00 +0000 https://www.adamconrad.dev/blog/how-to-do-an-exit-interview/ 2021-03-17T11:04:00+00:00 Hitting deadlines https://www.adamconrad.dev/blog/hitting-deadlines/ Tue, 09 Mar 2021 11:04:00 +0000 https://www.adamconrad.dev/blog/hitting-deadlines/ 2021-03-09T11:04:00+00:00 On culture fit https://www.adamconrad.dev/blog/on-culture-fit/ Sun, 28 Feb 2021 11:04:00 +0000 https://www.adamconrad.dev/blog/on-culture-fit/ 2021-02-28T11:04:00+00:00 Things to remember with annual reviews https://www.adamconrad.dev/blog/things-to-remember-with-annual-reviews/ Mon, 15 Feb 2021 21:54:00 +0000 https://www.adamconrad.dev/blog/things-to-remember-with-annual-reviews/ 2021-02-15T21:54:00+00:00 How I applied the engineering management curriculum https://www.adamconrad.dev/blog/how-i-applied-the-engineering-management-curriculum/ Fri, 05 Feb 2021 22:28:00 +0000 https://www.adamconrad.dev/blog/how-i-applied-the-engineering-management-curriculum/ 2021-02-05T22:28:00+00:00 Teach yourself engineering management https://www.adamconrad.dev/blog/teach-yourself-engineering-management/ Sun, 31 Jan 2021 21:09:00 +0000 https://www.adamconrad.dev/blog/teach-yourself-engineering-management/ 2021-01-31T21:09:00+00:00 Start today https://www.adamconrad.dev/blog/start-today/ Thu, 31 Dec 2020 21:30:00 +0000 https://www.adamconrad.dev/blog/start-today/ 2020-12-31T21:30:00+00:00 A quick and dirty guide to Kanban https://www.adamconrad.dev/blog/quick-and-dirty-guide-to-kanban/ Mon, 30 Nov 2020 16:18:00 +0000 https://www.adamconrad.dev/blog/quick-and-dirty-guide-to-kanban/ 2020-11-30T16:18:00+00:00 Don't call it tech debt 
https://www.adamconrad.dev/blog/dont-call-it-tech-debt/ Mon, 16 Nov 2020 23:31:00 +0000 https://www.adamconrad.dev/blog/dont-call-it-tech-debt/ > If you run a commercial kitchen and you only ever cook food, because selling cooked food is your business -- if you never clean the dishes, never scrape the grill, never organize the freezer -- the health inspector will shut your shit down pretty quickly.
>
> Software, on the other hand, doesn't have health inspectors. It has kitchen staff who become more alarmed over time at the state of the kitchen they're working in every day, and if nothing is done about it, there will come a point where the kitchen starts failing to produce edible meals.
>
> Generally, you can either convince decision makers that cleaning the kitchen is more profitable in the long run or you can dust off your resume and get out before it burns down.

On the flip side, if you ever have the luxury of getting ahead of your technical debt then you can contribute things called [technical credits](https://www.stillbreathing.co.uk/2016/10/13/technical-credit). As you can imagine, technical credits are the opposite of tech debt: things that _pay dividends_ to your code rather than incur a cost and drag it down. [Progressive enhancement](https://gomakethings.com/progressive-enhancement-graceful-degradation-and-asynchronously-loading-css/) is touted as an example of technical credit. Marrying these two concepts, I actually think the horrible marketing of tech debt can be resolved with Agile terminology that is already familiar to our product-positive counterparts: **engineering-driven work.** ## Engineering-driven work over tech debt At the end of the day, whether it be debt or credit, the work engineers want to accomplish to make their code better is something personal to them and them alone. Similarly, your product team has their own vision of what your code can become for your customers.
So if we segment on the _owner_ instead of the _outcome_ we remove the negative connotations of the work involved. On my teams, we make this distinction painfully simple in Jira: 1. **Was the work engineering-driven?** Create a task. 2. **Was the work product-driven?** Create a story; use sub-tasks to create the work the engineers will do to complete that story. Everything lives on the same playing field and is up for grabs to assign priority. So rather than bucket 20% time for debt, if all engineering-driven tasks (whether they be credits, debts, or something else) are the most important work to be done this week, they take up 100% of the sprint. Similarly, if product's work is the most important, it can dominate or completely own a sprint. The important thing is that **the owner of the ticket sells why their work should be as important as it is**. If you can't sell why a critical security upgrade is vital to the health of your application then you need to learn how to sell and influence others. But then again, you might think that switching from Formik to React Hook Forms would be awesome, but if it's based on personal bias and no tangible value creation for developer productivity, you're dead on arrival. So I encourage all of my engineers to read sales books. Learn how to sell and influence others. Convincing someone that what you want to do matters is a much better message than starting things off on the wrong foot by calling your work debt that needs to be repaid. Debt makes people wince. Creating value makes people smile. Paying down tech debt adds value, so lead with that and focus on why your work is important instead of giving it the Scarlet Letter from the outset.
]]> 2020-11-16T23:31:00+00:00 Tips to level up as a new engineer https://www.adamconrad.dev/blog/tips-to-level-up-as-a-new-engineer/ Fri, 30 Oct 2020 18:02:21 +0000 https://www.adamconrad.dev/blog/tips-to-level-up-as-a-new-engineer/ 2020-10-30T18:02:21+00:00 Don't be clever https://www.adamconrad.dev/blog/dont-be-clever/ Wed, 23 Sep 2020 22:09:22 +0000 https://www.adamconrad.dev/blog/dont-be-clever/ 2020-09-23T22:09:22+00:00 Architecture for managers https://www.adamconrad.dev/blog/architecture-for-managers/ Mon, 24 Aug 2020 22:22:22 +0000 https://www.adamconrad.dev/blog/architecture-for-managers/ 2020-08-24T22:22:22+00:00 Accountability is not responsibility https://www.adamconrad.dev/blog/accountability-is-not-responsibility/ Wed, 19 Aug 2020 00:05:00 +0000 https://www.adamconrad.dev/blog/accountability-is-not-responsibility/ 2020-08-19T00:05:00+00:00 ABF - Always Be Failing https://www.adamconrad.dev/blog/abf-always-be-failing/ Tue, 18 Aug 2020 00:19:06 +0000 https://www.adamconrad.dev/blog/abf-always-be-failing/ 2020-08-18T00:19:06+00:00 3 basics of TypeScript you probably do not know https://www.adamconrad.dev/blog/3-basics-of-typescript-you-probably-dont-know/ Sun, 26 Jul 2020 12:19:06 +0000 https://www.adamconrad.dev/blog/3-basics-of-typescript-you-probably-dont-know/ Consider these two function declarations:

```TypeScript
const noReturnFn: () => undefined;
const noReturnFn2: () => void;
```

What's the difference? The answer is that to JavaScript, there is no difference, but to TypeScript, it's all about whether you `return` or not.

```TypeScript
const noReturnFn: () => undefined = () => {
  console.log('This must return');
  return;
};

const noReturnFn2: () => void = () => {
  console.log('This must not');
};
```

As you can see, a function return type of `undefined` requires that you use the `return` keyword while `void` does not return anything explicitly. If we go back to our [compiler options](https://www.typescriptlang.org/docs/handbook/compiler-options.html#compiler-options) we see a suggested code quality option of `noImplicitReturns`.
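Before turning that flag on, it's worth confirming that the two annotations really are identical at runtime. A small sketch you can run (function names here are illustrative, not from the examples above):

```typescript
// The compile-time annotations differ, but the compiled JavaScript
// behaves identically: calling either function yields `undefined`.
const returnsUndefined: () => undefined = () => {
  return; // the explicit `return` satisfies the `undefined` annotation
};

const returnsVoid: () => void = () => {
  // no `return` statement needed at all
};

console.log(returnsUndefined() === undefined); // true
console.log(typeof returnsVoid());             // "undefined"
```

In other words, the distinction only exists for the type checker; the emitted JavaScript is the same either way.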
If we turn this on, the compiler reports an error whenever some code paths in a function return a value and others fall through implicitly; you can no longer rely on an implicit return. In those situations you'll want to annotate the function `undefined` and `return` explicitly rather than lean on `void`. Personally, I think this is better because you are being more intentional with your code, but now you at least know why `void` exists in TypeScript and when you would use it. ]]> 2020-07-26T12:19:06+00:00 Don't nickel and dime offers https://www.adamconrad.dev/blog/dont-nickel-and-dime-offers/ Mon, 20 Jul 2020 22:23:58 +0000 https://www.adamconrad.dev/blog/dont-nickel-and-dime-offers/ 2020-07-20T22:23:58+00:00 How to practice frontend engineering https://www.adamconrad.dev/blog/how-to-practice-frontend-engineering/ Sun, 21 Jun 2020 11:38:34 +0000 https://www.adamconrad.dev/blog/how-to-practice-frontend-engineering/ The `<form>` is the purest representation of interactivity between your browser and the rest of the internet. Between documents and forms, that's really all we need. Everything else is just a pretty distraction designed to keep you clicking and engaged. If we remember the roots of what we are building, we can tune out the noise of the latest SPA framework or hip new bundler and instead just focus on information architecture and user experience. Don't lose sight of that. ## Additional resources Beyond hobby projects, I tell most people they should just build a real SaaS app and scratch their own itch. Push something real to production that you can charge customers for. This will have you not only practicing frontend, but backend engineering as well. More than you bargained for, but there is truly no better way to practice your craft.
To create front ends that scale, I'd recommend a few more resources that I've found incredibly valuable over the years that stand the test of time: * _[Front End Architecture for Design Systems](https://www.amazon.com/Frontend-Architecture-Design-Systems-Sustainable/dp/1491926783)_ - This book is the reference I use to cover architecture for even the biggest production apps I work on. Though some of the tools and technologies are dated, the tenets of architecture remain indelible. * [MDN Web Docs](https://developer.mozilla.org/en-US/docs/Web) - The gold standard for understanding everything about web technologies. This is easily the most comprehensive guide and reference for all things HTML, CSS, and JavaScript. They have tutorials, reference material for every property and function in all three specs, as well as a host of great examples and polyfills. I probably reference this at least once a week. * [Google's Web Fundamentals](https://developers.google.com/web/fundamentals) - These may be fundamental but they are by no means easy or rudimentary. I still find plenty to learn even with more than a decade of experience. These documents always update to include the latest and greatest web technologies as well. * [Nielsen Norman Group](https://www.nngroup.com/articles/) - There is no better team to learn user experience from. This UX firm has authored some of the best-selling books on the topic and its people are considered the foremost experts in their field. * [Refactoring UI](https://refactoringui.com/) - This book, and their [free collection of tweets](https://refactoringui.com/), has done more to make me a better designer than any other resource. If you want to create beautifully-designed sites with the tools you already use, you _have_ to check out this book and site.
]]> 2020-06-21T11:38:34+00:00 Organizational design for marketplace companies https://www.adamconrad.dev/blog/engineering-orgs-for-marketplaces/ Tue, 21 Apr 2020 14:09:19 +0000 https://www.adamconrad.dev/blog/engineering-orgs-for-marketplaces/ 2020-04-21T14:09:19+00:00 Can you turn around a dysfunctional team? https://www.adamconrad.dev/blog/can-you-turn-around-a-dysfunctional-team/ Tue, 21 Apr 2020 14:09:19 +0000 https://www.adamconrad.dev/blog/can-you-turn-around-a-dysfunctional-team/ When you reply with a one-letter "k" you make people really uncomfortable. Like it or not, people assume the worst and will analyze the crap out of a vague message. Was it dismissive? Does "k" actually mean acknowledged or go away? To fix this, you probably need a page in your wiki on something to the effect of [Slack etiquette](https://slackhq.com/etiquette-tips-in-slack). For some, it may seem ridiculous that in this day and age we still need to train adults on how to play nicely with each other. But we have to eat our own dog food here; you have to assume positive intent yourself. Maybe some people just don't realize they look mean and nasty even if they don't intend to, and they genuinely have never had issues communicating in such a Spartan fashion. If that's the case, your etiquette guide serves as a friendly primer. ## All is not lost If we look at all of these habits above, they apply to high-functioning teams and can be used to transform dysfunctional teams. **It just takes work.** You need to put the effort in to transform an under-performing team. You also have to have the patience to see it through. Not everyone has the stomach to [turn the ship around](https://www.youtube.com/watch?v=ivwKQqf4ixA) but if you're in this position there is hope for you yet!
]]> 2020-04-21T14:09:19+00:00 Working Remotely When You Didn't Plan To https://www.adamconrad.dev/blog/working-remotely-when-you-didnt-plan-to/ Thu, 27 Feb 2020 23:00:19 +0000 https://www.adamconrad.dev/blog/working-remotely-when-you-didnt-plan-to/ 2020-02-27T23:00:19+00:00 Technical Lead Management 101 or How to Try Out Management https://www.adamconrad.dev/blog/technical-lead-management/ Thu, 27 Feb 2020 23:00:19 +0000 https://www.adamconrad.dev/blog/technical-lead-management/ I wrote this for my tech leads who needed a crash course in engineering management. The advice you see in this article is the advice I used to help level up my team leads. At some point in your technical career you'll have to make a choice: stay on the individual contributor (IC) track or move into management. Many seem to believe that the latter is the only way to move up in your career. It's not! Plenty of people want (and deserve) to grow their careers as software developers without having to manage others. **Regardless, at some point you'll have to test the waters to see which is right for you.** To ease this transition and minimize the risk of inundating developers with too much change, we've created a hybrid role called _Technical Lead Manager_. In this article, we're going to describe what this role is, why it is needed, and the basics for becoming a technical lead with management responsibilities. ## What is a Technical Lead Manager? While there is no official definition of this role, in my mind **a Technical Lead Manager (TLM) is a tech lead with minimal engineering management responsibilities.** More specifically, we define "minimal" to mean **no more than 3 direct reports.** If you're brand new to engineering management, a _direct report_ means someone (e.g. a software engineer) who reports to you in your company's organizational hierarchy. In other words, you write code _and_ are the boss of up to 3 people. ### Why not call them a Junior/Associate Engineering Manager?
You could, but the distinction here is that **we expect an engineer with this role to still perform his/her technical duties (i.e. coding, debugging, code review), albeit in a more limited capacity.** Without the _Technical_ adjective, we leave open the possibility that those technical skills atrophy in certain organizations. ### Why the need to combine both duties? Won't that overwhelm the individual? At some point in your career, you'll be faced with the decision of considering roles in management. In many organizations, the leap to engineering manager is a large one. Make no mistake: **engineering management is a completely different skillset from software engineering and should be considered hitting the "reset button" on your skills.** With such a stark contrast in roles and responsibilities, the transition to management at the most junior levels could dissuade people from considering this career move in the first place. Nonetheless, leaders need to groom the next generation of leaders underneath them. To ease the transition and best prepare their future leaders with the necessary skills, consider looking at this hybrid role. ## What does a Technical Lead Manager do? At its core, a Technical Lead Manager codes _and_ manages the career of up to 3 direct reports. Since management is generally the newer of the two skills, we won't focus too much on how to develop the technical skills. For further reading on this topic, consider [this primer](https://www.getclockwise.com/blog/the-new-tech-leads-survival-guide) if you've never been a tech lead before. Instead, let's dive into the core skills you're going to have to build from scratch. ### The Four Dials of Engineering Management When I first became an engineering manager, my CTO gave me this analogy: think of engineering management skills _like knobs on a guitar amp_. For EMs, there are four dials: **people, process, product, and prowess**.
Each dial is turned up somewhere between 0 and 10 depending on the organization you work in. Some companies highly value technical prowess, while other companies see EMs as people managers with a previous technical background. Regardless of your company's organizational design, you're going to want to know what all four knobs are so you can dial in the exact volume of each skill. #### People Let me cut right to the chase: **successful engineering management is all about maximizing the effectiveness of your people.** If you take nothing else away from this guide, know that _simply caring about your people will get you 80% of the way there_. To effectively care for your people, this dial itself has 4 core aspects: ##### One-on-ones (O3s) These are the bread and butter of effective management. **Consider them the most sacred time of your entire week.** O3s build trust, shared understanding, and rapport. From all of the materials I've read on this subject, here's a formula for the most effective O3s: * Have them every week with all of your directs for 30 minutes. * If you have a conflict, move your O3 but _never cancel it_. * Start each O3 with "how are you?" or "what's on your mind?" * At first, allot all 30 minutes for your directs to say whatever is on their agenda. They may have a lot on their mind in getting to know you so indulge them. * At some point (around a month in), you'll want to shift it to be 15 minutes for their agenda followed by 15 minutes for your agenda. * Finally, another month or so in, you'll want to trim a few minutes off of your agenda to bake in coaching & goal setting for your directs' career development. I still use this formula for all of my O3s and it has served me very well. 
**Be an [active listener](https://www.youtube.com/watch?v=saXfavo1OQo) during their agenda.** If you're looking for what to ask during your agenda, I've referred to [this list](https://larahogan.me/blog/first-one-on-one-questions/) for seminal O3 questions to ask your direct reports. For coaching and goal setting, keep scrolling. But first, you're going to want to make feedback a central tenet of your agenda. ##### Feedback People crave feedback. **Being an effective manager means being an astute observer.** There are lots of ways of observing your team without conjuring up an image of the nosy micromanager. You're probably doing a few of these already: * Active listening in meetings * Addressing blockers in stand-ups * Reviewing code * Pair programming Plus countless more. The difference now is you're not just observing through the lens of a software engineer but also as a manager with a chance to provide feedback. **Feedback is about behavior.** Let me repeat myself: feedback is _not_ about changing people but about changing behaviors. "You're bad at your job" is not feedback. It is an ad hominem attack. Feedback is either about _encouraging a behavior (positive)_ or _stopping a behavior (constructive)_. Feedback, too, has a formula for being given effectively. The one I like to use is based on [nonviolent communication](https://en.wikipedia.org/wiki/Nonviolent_Communication). The formula, like seemingly all of my formulas, occurs in four steps: 1. **Describe the current behavior.** Example: "When you test your code..." 2. **Describe the outcome of that behavior.** Example: "...it increases the reliability of our product." 3. **Describe the effect that behavior has on you or others.** Example: "This instills trust in our customers that we build software that works." 4. **If the feedback is positive, use encouraging language to reinforce the behavior.
If the feedback is constructive, describe how you would like the behavior changed in the future.** Example: "Keep it up!" Stitching this together, our positive feedback example looks like this: > When you test your code, it increases the reliability of our product. This instills trust in our customers that we build software that works. Keep it up! Conversely, an example of constructive feedback would look very similar, with the largest difference being the fourth and final step: > When you are late to meetings, it reduces the time we have to get through all of our agenda items. This inconveniences the team and reduces their trust in you to be reliable and timely. Please be on time from now on, thanks! Study this last example. There are a few things at play here that you should pay attention to: * The format for both positive and constructive feedback is the same. This conditions your team to see all feedback as a focus on behavior and not a personal attack on who they fundamentally are as human beings. If you keep all feedback formats the same, they won't cower at the thought of receiving negative feedback, and they won't have to worry about whether you're going to praise them or bring them down. * For constructive feedback, it's about the future, not the past. We can't change the past, so there is no reason to dwell on it. Instead, accept that the undesired behavior happened; don't berate your teammates for their shortcomings. All we can do is stop undesired behavior from happening again by steering the rudder in the right direction. If you follow these guidelines, your team will think of all feedback, positive or constructive, as a window into effective reinforcement or behavior change. **Finally, I would recommend you wait to introduce feedback until you've had about 8 O3s (and start with only positive feedback).** You need enough time to build rapport with your directs before you begin to give them feedback. And even then, you want it to start positive.
Add on another 8 weeks (16 in total) before introducing constructive feedback so they are used to your communication style. ##### Coaching Once you are having regular meetings with your directs and giving them regular feedback, it's time to start coaching them in their careers. Coaching can take many forms, some of which I've already mentioned: * Pair programming * Code review * Mentoring * Career development plans I use all of these tools regularly, and you should feel free to use them too. **The point of coaching is to get more out of your directs.** O3s build trust. Feedback is the transparency that results from that trust. Once you have transparency, you now know what your directs are capable of. To be a next-level manager, you need to strive to extract more value out of your directs by pushing them to be their best selves. How do you get more out of your directs? **Reinforce their strengths and shore up their weaknesses.** Do they write amazing tests? Have them review others' code for sufficient test coverage or run a workshop on effective testing strategies. Conversely, if they constantly forget to test their code, consider introducing a pull request checklist to remind them to include tests, and enforce that all code must come with tests before it merges into production. You're going to have to get creative at this step because there are [boundless ways to coach someone](https://knowyourteam.com/blog/2020/02/19/how-to-coach-employees-ask-these-1-on-1-meeting-questions/) through their career. With all of this new work to do, how are you going to find time to still code? ##### Delegation The only way to find time for everything is to lean on your directs. **Delegation is the skill of pushing down responsibility to others.** I like to think of this with an analogy: Imagine the CEO of your company. He or she is likely the only person in the company with knowledge of the entire ecosystem of how your company operates. 
They're also likely the only person who can both sell the product to investors and create a grand vision of how your company can wow your customers. Let's say the value of these two skills is $500/hour. Now imagine a paralegal at your company. Paralegals often review documents, perform research, and write legal papers on behalf of the lawyers they work with. Let's say these skills are valued at $35/hour. Your CEO may have a law background and be quite capable of performing the duties of the paralegal. But if the CEO can do the legal work _and_ raise money from investors, while the paralegal can only do the legal work, why would you ever have the CEO do legal work? It's in the best interest of the company for the CEO to focus solely on the work he/she is uniquely suited to perform. **In your case, you are uniquely suited to manage and tech lead. If you are doing _anything_ outside of those two things, you are reducing the effectiveness of your position and should delegate that responsibility to others.** Delegation is the hardest part of being a TLM (or any manager, for that matter). You are naturally going to want to do what you're comfortable doing: coding. Unfortunately for you, coding is no longer your highest-value skill. **At every possible moment, you should evaluate whether the task you are performing could be done by someone else.** If it easily can, document the work involved and share it with your team to perform. If it cannot, then it's probably something you need to continue to do yourself. This self-evaluation is the basis for the next dial of engineering management: process. #### Process If engineering management centers around people, then Process is the mechanism for making the People dial as efficient as possible. **Process is all about documentation.** If you have a repeatable, verifiable way of doing something, you should document it for your team or others to perform. Fun fact: this post is a manifestation of my process. 
Teaching others how to be engineering managers is a process. Reading this post is the process by which engineering managers can learn how to manage for the first time. I know this is all getting pretty meta, so let's focus on a few ways to institute process: ##### Leverage a wiki tool Tools like Confluence or GitHub Wikis are a great way to codify processes. The question I always ask myself is: > Do other people need to know this, and will they forget it? If so, I should write it down in the wiki. And when should you write things down? All of the time. Even the most seemingly mundane tasks are documented in our Confluence at work. **Never assume a task or process is too dumb to spell out in your documentation tool.** As we gain experience in the field we assume a lot of things about our work. Not everyone has the same experiences or background as us, so what may seem trivial to you could be quite complex to others. That's why it's always a safe bet to write it down in the docs. At worst, you took some time to become a better writer and communicator. At best, your documentation is shared with the whole company and adds value to everyone around you. So why wouldn't you write it down? ##### Use it as a teaching moment Looking for ways to incorporate coaching into your workday? Describe a process to a peer or direct. Gather feedback on your process: what works? What doesn't? What could be improved? Processes should help make everyone around you more effective. If you can teach something to others, you're building your skills as a coach and adding value to the organization. ##### Simplify your life The beauty of a process is that once it is set in stone, you pretty much never need to think about it again. Processes turn the algorithmically intense into the automatic. For example, if you didn't have a process for running effective meetings, you probably spent a lot of cognitive effort figuring out how to set the agenda. Once you have a process, though, planning a meeting becomes quite rudimentary and painless. 
You have a plan for how you're going to get through them. Processes make your life easier, and your life is already challenging enough. #### Product One of the cool parts of moving into management is the cross-disciplinary skills you'll develop. **Get ready to become best friends with your product manager.** I naturally consider myself to be a product-minded engineer. If this doesn't come naturally to you, it will be thrust upon you whether you like it or not. As the engineering manager, **you are now the liaison to other departments about the technical capabilities of your team.** Think of yourself as the head consultant and advisor to the rest of the company about your team. What is the makeup of the skills on your team: are they front-end or back-end focused (or full-stack)? Who loves tackling bug tickets and who loves fixing tech debt? How will you balance tech debt and bugs with new features? Who decides whether a bug is more important than a feature? These and countless other questions will arise when you turn up the Product dial. **Your best defense in dialing up Product is to be on the offensive, and that means getting to know your product and your product managers.** Just like your directs, set up O3s with your product manager. Get to know them and what makes them tick. Build trust. The only way you can build great software without you and your PM at each other's throats is to create an authentic relationship. **There will usually be a healthy tension between product and engineering, and that's okay.** Sometimes you'll want to address mounting tech debt in your end-to-end tests. Meanwhile, your PM will wonder why Widget X wasn't delivered yesterday. You will each have reasonable justifications for why your stance is correct. Guess what? At the end of the day, you're going to have to come to a consensus. **See my earlier points about active listening, nonviolent communication, and building trust.** Oh, and by the way, this extends far beyond Product. 
You'll have conflicts to resolve with architecture, design, marketing, sales, legal, and virtually anyone who comes in contact with your team: _including the team itself_. One engineer will say that Vim is the best IDE. Another will say it's Emacs. Spaces vs tabs. Linux vs Mac. It's pronounced "gif" (no, it's "jif"). You name it, someone will have an opposing opinion. **You are mediator first, moderator second, and arbiter last.** What I mean is that in any conflict, your first job is to _help each side empathize with the other_. Empathy, often confused with sympathy (which is about feeling someone else's distress), is about walking a mile in another person's shoes. Can you get Person A to see where Person B is coming from, and vice versa? If not, move into the moderation phase. Moderation is about guiding the discussion to help the two parties address their concerns and air their grievances. This guidance should hopefully lead to automatic mediation: by moderating the discussion, you're allowing both parties to see where the other is coming from in a way you couldn't immediately achieve in your first pass. But what if consensus cannot be reached? As a last resort, it's time to be the judge, the jury, and the executioner. Ideally, people come to their conclusions themselves. The advice I've always latched onto is: > Your best ideas seem like they never came from you at all and instead came from the person you were trying to convince in the first place. Sometimes you don't have that luxury. In that case, you have to make the call for others. An often-quoted maxim of the military is that **a poor decision is generally better than indecision.** Dwelling on whether you are making the right call and coming up with nothing is worse than just picking one side. This Product dial discussion morphed into conflict resolution, but conflict resolution is an important aspect of the natural tension that arises when you are no longer interacting solely within the discipline of engineering. 
#### Prowess Finally, we can't forget what got you here in the first place: your technical prowess. For TLMs, coding, tech leadership, and code review should still be anywhere from 50 to 80% of your duties, depending on the organization you are a part of. In my first formal engineering management role, I also played the role of a tech lead and spent nearly 80% of my time coding. The further up the food chain you go, the less you will be coding. **Do whatever you can to keep that number above 0%, even if that means coding after work.** Why? **No one wants to work for someone who can't remember what it's like to be in the trenches.** The engineering managers I've respected the most are those who can still tango with the code. Like the CEO and paralegal analogy from above, you _could_ code, but you're choosing to delegate some of those tasks because a portion of your time is better spent managing. I find that my technical prowess is best utilized through code review and coaching. Writing out a folder structure for React components myself is suboptimal compared to teaching a room full of engineers how to organize their React projects. **You only have two hands. Every teaching opportunity is a chance for you to give yourself more hands.** **Systems Design becomes the primary skill** you'll be building in your technical dial. Now that you can't spend all of your time programming, you need to be thinking at a higher level about how your systems will connect to operate in an efficient, scalable, and reliable manner. [I love this Systems Design playlist on YouTube](https://www.youtube.com/watch?v=quLrc3PbuIw&list=PLMCXHnjXnTnvo6alSjVkgxV-VH6EPyvoX). Not only is it a good source of Systems Design interviews to implement in your hiring program, but it will also build your skills for when you inevitably interview somewhere else and want to demonstrate you know how to think logically about systems. That playlist will help you get the reps in and build this skill quickly. 
Do yourself a favor and watch all of the videos on this list. ### Other duties of the TLM you should know about If you stick to mastering the Four Dials of Engineering Management and maintain your skills as a tech lead, you'll find the Technical Lead Manager role to be very rewarding and easy to transition into. Depending on your company's needs, there are a few other duties EMs also perform that I want to call out so you aren't blindsided: * **Hiring.** When the money is rolling in, you're inevitably going to be growing your team. Not only are you going to be on the other side of the interview, but you may not be interviewing for just technical skills. Oftentimes engineering managers assess things like culture fit and soft skills. I'll save this one for another post, but for now, I would just Google how to run a cultural interview because there are plenty of dos and don'ts you need to consider to keep your interview above board. * **Firing.** With great power comes great responsibility. As much fun as it is to bring in great people, it is equally crushing to let go of underperformers. Performance management is all about retaining great talent and coaching your direct reports to be the best they can be. Sometimes that's just not enough. The key here is to **document everything and ensure you have a strong case for why this person cannot be saved.** This cannot be a gut feeling. You need plenty of documentation of continued underperformance and failure to adopt feedback before you should consider things like PIPs (Performance Improvement Plans). When in doubt, check with your HR or People team on how to handle these scenarios. * **Performance Reviews, Compensation, and Promotion.** If you have great people and you are following the Four Dials of Engineering Management to get the most out of your team, you're putting them on the path to upskilling. They will recognize their metamorphosis and demand more pay or a title bump. 
Like the previous point, these merits cannot be based on gut feelings alone. Instead, _help your directs build an ironclad case for why they should be recognized_. Someone will ask you to justify why your direct deserves a promotion and a bonus. If you can delegate the rationale to your direct, not only will they be able to justify their raise, but so will you. ## Conclusion and Further Reading This post is long enough (and I'm tired enough) that you should be well on your way to a successful trial at engineering management. I want to thank [this post](https://randsinrepose.com/archives/meritocracy-trailing-indicator/) for inspiring me to consider Technical Lead Managers as a great bridge to delegating management responsibilities to your team. All of these tools and strategies were tried and tested by me over the last several years as an engineering manager, and all of them are backed up by the amazing engineering management books, podcasts, and blogs I've read over the years. An exhaustive reading/listening list can be found [here](https://leadership-library.dev), though I'd like to call out a few pieces I've found especially useful over the years: * _The Effective Manager_ by Mark Horstman (and the accompanying _Manager Tools_ podcast) * _Resilient Management_ by Lara Hogan * _High Output Management_ by Andy Grove * _The 27 Challenges Managers Face_ by Bruce Tulgan (particularly for conflict resolution) * _Accelerate_ by Nicole Forsgren, Jez Humble, and Gene Kim * _An Elegant Puzzle_ by Will Larson * _Extreme Ownership_ by Jocko Willink (and his _Jocko Podcast_) * _The Making of a Manager_ by Julie Zhuo * _The Manager's Path_ by Camille Fournier * _The Talent Code_ by Daniel Coyle I've read all of these books, and they are the basis for how this guide was built. If you want to learn more about anything discussed in this guide, I would encourage you to read the source material above. 
Let me know what you found most helpful or if you'd like clarity on any of this [on Twitter](https://twitter.com/theadamconrad) and I'd be happy to walk you through any of the finer points. Best of luck and remember that **management is a skill just like anything else**. With a combination of study and practice, you'll be well on your way to greatness! ]]> 2020-02-27T23:00:19+00:00 The teamwork analogy I use nearly every week https://www.adamconrad.dev/blog/the-teamwork-analogy-i-use-nearly-every-week/ Sun, 12 Jan 2020 22:06:12 +0000 https://www.adamconrad.dev/blog/the-teamwork-analogy-i-use-nearly-every-week/ No one ever partially IPO'd. The NYSE isn't going to have you ring the bell because you're a rock star programmer while the rest of the company fails. Either we all cross the finish line, arm-in-arm, or we all fail to complete the race. So many people fail to see that their petty arguments with their fellow employees are completely fruitless. What is the point of winning your argument if it fractures your team? Companies are more like baseball than tennis; it's not an individual sport. One or two stellar performers cannot possibly make up for a losing team. That's why I don't hire divas or prima donnas. Sure, the rock star may be exceptionally talented and even a "10x" programmer. But if their 10x causes everyone else to drop 50% in productivity because their code is exceptionally clever or they are overly critical in code review, then the team overall is no better off in terms of productivity. **If you want to be selfish, it's in your best interest to be a team player.** You read that right: if you want to make that sweet IPO money, the only way you are going to get there is if your whole company gets there. It's in your best interest, therefore, to help others out so you can all get to that exit event sooner rather than later. You only get there after the last person crosses the finish line, and you're all linked arm-in-arm. 
_That's_ why you stand to benefit if others benefit. The next time you're ready to rip someone's throat out for a bad move or you think your marketing team is useless, remember that every piece of the puzzle has to fit if you want to win. Otherwise you'll stay stuck in the same rut and no one wants to work with someone who is difficult and not a team player. ]]> 2020-01-12T22:06:12+00:00 Always have the Beginner Mindset https://www.adamconrad.dev/blog/always-have-a-beginners-mindset/ Fri, 13 Dec 2019 12:19:06 +0000 https://www.adamconrad.dev/blog/always-have-a-beginners-mindset/ 2019-12-13T12:19:06+00:00 Running Effective Planning Meetings https://www.adamconrad.dev/blog/running-effective-planning-meetings/ Mon, 04 Nov 2019 13:17:19 +0000 https://www.adamconrad.dev/blog/running-effective-planning-meetings/ 2019-11-04T13:17:19+00:00 My most referenced articles https://www.adamconrad.dev/blog/my-most-referenced-articles/ Tue, 08 Oct 2019 22:51:57 +0000 https://www.adamconrad.dev/blog/my-most-referenced-articles/ 2019-10-08T22:51:57+00:00 Building psychological safety from scratch https://www.adamconrad.dev/blog/building-psychological-safety-from-scratch/ Tue, 03 Sep 2019 22:51:57 +0000 https://www.adamconrad.dev/blog/building-psychological-safety-from-scratch/ 2019-09-03T22:51:57+00:00 Anticipating needs is a great way to manage up https://www.adamconrad.dev/blog/anticipating-needs-is-a-great-way-to-manage-up/ Mon, 19 Aug 2019 12:19:06 +0000 https://www.adamconrad.dev/blog/anticipating-needs-is-a-great-way-to-manage-up/ Extreme ownership is owning up to everything within your control; the good _and_ the bad. Your boss is concerned your team is going to miss the deadline. You believe it is because there is a developer who isn't pulling their weight. Wrong. When you do a root cause analysis of why the developer isn't pulling their weight, you can always trace it back to _you_. You didn't set expectations for delivery on the deadline. 
You didn't check in every week to see if all team members were up to speed and unblocked. You didn't provide training or education for those falling behind who needed more assistance. If something goes wrong within your org, there is always more you could have done to improve the outcome. I have heard this advice paraphrased many times since becoming a manager: your team gets the credit for all of their successes, and you take the blame for all of their failures. When your team launches a big app, you congratulate your team's tireless efforts in front of the whole company. When a security vulnerability loses you a customer, you take one on the chin in front of your senior vice president. This is the price you pay for moving up the management ladder. ## The key to moving up Beyond staying humble, the real reason comes down to your position. If you run a team, by definition you're already celebrated for being a leader with your title and salary. What more do you need? **You need stellar employees who like working for you.** In order to maintain that title, salary, and status, you need to be seen as a leader worthy of more directs and more responsibility. The only way to do that is to enable successful teams who like working for you. No one wants to promote a leader to a senior leader who can't retain their employees or get the job done. And the only way to do that is to make your teams awesome. One _really_ easy way to help do that is to take flak when things go wrong and prop them up when they succeed. ## An example: organizational design One night I was curious about how my teams would [scale](https://lethain.com/sizing-engineering-teams/) over the next year or two, so I played around with Excel for maybe 20-30 minutes. It wasn't a lot of time, and I had a mental itch I needed to scratch. The next day in my 1-1 with my manager, I decided to show him both my short-term team plan over the next 3 months as well as the 1+ year plan. 
We had nothing on our agenda about this, and we won't even begin budgetary talks until the end of the year. Even with all of those considerations, my manager was happy to see it. The real eye-opener came a week later: in standup with his direct reports (all directors), he announced all teams would begin to work with him on how their teams would evolve over the next several months and years. So you tell me: who has the leg up in the conversation? Who spurred this idea? If you're thinking it's me because of my proactive work in the 1-1, I'd say you're right. While I can't confirm it for sure, it is awfully telling that a week later my topic of long-term organizational design became a central topic between all of my peers and my manager over the next several weeks. That was the singular moment when I realized that anticipating needs and being ahead of the pack is a powerful force in your career. ## Armchair quarterbacking and moving forward Would I do anything differently? **I would have started sooner**, so hopefully I can make that happen for you. Moving forward, this will continuously be a part of my duties, particularly when I have more room on my plate for additional capacity. Even if it doesn't revolve around your manager's needs, you should try to help others because it's the right thing to do. If you're floating around the pool and someone else is drowning, it's only instinct to jump in and help them out; this should be no different while you're working. We should all be doing more to help others when we are in a position to assist, so go ahead and do it! 
]]> 2019-08-19T12:19:06+00:00 A 90 day plan for managers https://www.adamconrad.dev/blog/a-90-day-plan-for-managers/ Wed, 24 Jul 2019 12:19:06 +0000 https://www.adamconrad.dev/blog/a-90-day-plan-for-managers/ 2019-07-24T12:19:06+00:00 How to justify your cost to a client https://www.adamconrad.dev/blog/how-to-justify-your-cost-to-a-client/ Thu, 06 Jun 2019 21:50:40 +0000 https://www.adamconrad.dev/blog/how-to-justify-your-cost-to-a-client/ **TLDR: Price yourself in a way that _feels_ low to the client even if it is high in absolute terms.** If you're new to freelancing or [consulting](https://anonconsulting.com/), the advice that gets passed around a lot is to not charge by the hour, because then you're making yourself a commodity; instead, you should use [value-based pricing](https://training.kalzumeus.com/newsletters/archive/consulting_1). In software, this is flawed for two reasons: 1. **Most software gigs are longer-term.** Marketers and designers can easily charge value-based pricing because their projects or campaigns are extremely finite. A campaign or a logo is just that: you might iterate a few times to get what you want, but once you get the deliverable, that's it. Even if you do a shorter 4-6 week engagement as a software developer, you will inevitably have a bug or security update that requires maintenance after the engagement is over; it just doesn't "end" when you deliver the code. 2. **Most companies don't price software products; they price developers.** Companies don't hire freelance designers, they buy logos. Companies don't hire marketers, they buy campaigns. But _everyone_ hires software developers. They don't want to buy a SaaS app from you. They want to buy your time. So until the industry moves away from buying your time, the vast majority of clients will still value you as a dollar-per-hour commodity. I know this, having tried to win plenty of clients over with this approach. 
I also know this now, having worked at a company that hires lots of contractors. So let's dive into the 5 Whys to see if we can find a root cause and a solution. * **Why doesn't value-based consulting work?** Because companies still buy a software developer's _time_ and not what they _produce_. * **Why can't we sell software and not ourselves?** Because software is a blurred line. Unless you are expected to write the entire application for a client, it is very hard for the client to define what is your code versus someone else's. * **Why can't we clearly define the products we build?** Because software is built with teams. If you are trying to find clients, chances are you are not their first (or their last) developer. Software is very difficult to get done as a solo project and is rarely done in a vacuum. Software is a team sport. * **Why is software a team sport?** Because it's super hard! * **Why is software hard?** Because we are trying to map the decisions of the human mind onto a flat glowing metal screen. How easy could it possibly be to describe the roughly 86 billion neurons of the human brain? So software is really hard to do on your own, and it is therefore hard to distinguish your software product from that of the rest of the team you are joining as a contractor. Does that mean you should throw in the towel and give up? **No way!** Yes, freelancing and consulting are super hard, and unless you have the stomach to sell yourself and absorb lots of rejections, you are going to have a bad time. But there is one thing you can do to overcome the hurdle of a client justifying the cost of going with a contractor: **convert your desired value-based outcome into an hourly rate.** ## Estimates are your saving grace The key to making this all work is to work backwards from your ideal price to a realistic estimate. This is best illustrated with an example. Suppose you have a project that you think is worth $100,000. You want to tell the client the charge is $100,000 because of blah blah something valuable. 
Now ask yourself: how long do you think this will take? 10 weeks? That's about 400 hours, or $250/hour, which is pretty dang good. The problem is that no one is going to pay $250/hour for a contract developer. Unless you work for a prestigious, well-established consulting firm, you are going to have a hard (if not impossible) time finding clients that will hire independent consultants at that rate. **Here's the key:** if you tell your client you charge $100/hour, now they're listening. It just means that to get to that $100,000, you would theoretically need to do 1,000 hours worth of work. **But your client doesn't care if it takes you 1,000 hours or 400 hours if they agree to pay you to complete the project in the time you've told them it would take.** So _if you end up only taking 400 hours to complete the project_, you've upped your hourly rate to the desired rate for your original project, _and_ the customer gets their project 60% faster than they thought they would. You think they are going to complain if their critical project arrives early? Isn't that [inflating your hours](https://en.wikipedia.org/wiki/Parkinson%27s_law)? On the contrary, there is an [entire law](https://en.wikipedia.org/wiki/Hofstadter%27s_law) devoted to a programmer's hubris in estimating software. Your ideal hourly rate is probably too optimistic. A heavily-inflated hourly estimate compensates, in the spirit of Hofstadter's Law, for the things you don't know. In the end, your actual hours billed in this example will probably fall somewhere in the middle, around 700-750 hours, because of things you failed to account for. Either way, you're still making more than the hourly rate you billed, and your client still gets their project delivered on time or earlier. Next time your client balks at your price, remember Hofstadter's Law and buffer in quite a bit of extra time so you can land the project early, make more per hour, and use the rest of that time to find more clients and bill more hours. 
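The arithmetic above can be sketched in a few lines (the dollar figures are just the examples from this post, not a formula you must follow):

```python
def effective_rate(project_value, hours_worked):
    """Effective hourly rate when you bill a fixed project price."""
    return project_value / hours_worked

project_value = 100_000  # what you believe the project is worth
quoted_rate = 100        # the hourly rate you tell the client
quoted_hours = project_value / quoted_rate  # 1,000 hours quoted

# Finish in 400 hours and your effective rate is the $250/hour you wanted:
print(effective_rate(project_value, 400))  # 250.0
# Even at a realistic ~750 hours, you still beat your quoted $100/hour:
print(round(effective_rate(project_value, 750), 2))  # 133.33
```

Either outcome beats the quoted rate, which is the whole point of buffering the estimate.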
]]> 2019-06-06T21:50:40+00:00 A plan for finding clients quickly https://www.adamconrad.dev/blog/a-plan-for-finding-clients-quickly/ Thu, 06 Jun 2019 21:50:40 +0000 https://www.adamconrad.dev/blog/a-plan-for-finding-clients-quickly/ 2019-06-06T21:50:40+00:00 Build your network now https://www.adamconrad.dev/blog/build-your-network-now/ Tue, 02 Apr 2019 22:49:10 +0000 https://www.adamconrad.dev/blog/build-your-network-now/ Dig the well before you get thirsty Most people think of networking as a _reactionary_ activity. You network with people at meetups because you need something from someone. No wonder people hate to go to meetups! **Networking should be a _proactive_ activity.** You _offer_ help and connections. You build people up and bring them together _just because you can_. What does this get you? If you've asked that question you've already lost. Stop only thinking about yourself. **No one wants to network with selfish people.** The question you should be asking is: > How can I make the world a better place for my fellow humans? When you give back, you get more in return. And I don't mean this in the touchy-feely validating kind of way. I mean real benefits when your contract runs out and you need more money NOW kind of benefit. Imagine your friend is looking to start a company. Imagine your other friend is looking to start a company. You see two people and one opportunity to actualize. In that moment it probably has no material impact on your life. But in the future when you are looking for a job and those two are now hiring, who do you think they are going to ask to fill their top spot? Who do you think will have their own network of folks looking for freelancers to help build the MVPs for their startups? If you're thinking this imaginative scenario actually happened to me, you'd be right. I actually helped connect my friends together to start a successful startup. It did nothing for my wallet for years. But eventually I ended up working for their company. 
And years after that, they helped me land some key clients when I went into business for myself. And we're all still friends! This stuff works. But it only works if you live with an abundance mentality that **it's not all about you**. If you only network when you're desperate and need something, you're going to have a bad time. No wonder everyone fails at it and thinks it's scummy and no fun. If, instead, you **make networking about genuine, authentic connection**, you're apt to get everything you could ever want out of networking and way more than you bargained for. I may be a sample size of one, but my hope is that you'll add to that list of anecdotes as well. ]]> 2019-04-02T22:49:10+00:00 When to use Scrum or Kanban https://www.adamconrad.dev/blog/when-to-use-scrum-or-kanban/ Wed, 06 Mar 2019 12:05:49 +0000 https://www.adamconrad.dev/blog/when-to-use-scrum-or-kanban/ 2019-03-06T12:05:49+00:00 How to demonstrate impact https://www.adamconrad.dev/blog/how-to-demonstrate-impact/ Sun, 10 Feb 2019 23:54:49 +0000 https://www.adamconrad.dev/blog/how-to-demonstrate-impact/ 2019-02-10T23:54:49+00:00 Teamwork makes the dream work https://www.adamconrad.dev/blog/teamwork-makes-the-dream-work/ Wed, 09 Jan 2019 22:44:58 +0000 https://www.adamconrad.dev/blog/teamwork-makes-the-dream-work/ 2019-01-09T22:44:58+00:00 Ending Our Algorithmic Journey by Proving NP-Hardness https://www.adamconrad.dev/blog/np-completeness/ Thu, 22 Nov 2018 20:17:00 +0000 https://www.adamconrad.dev/blog/np-completeness/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/hard-reductions/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202). 
## Answers to the previous Daily Problem > Show that the following problem is NP-complete: for a graph `G` with integers `k` and `y`, does `G` contain a dense subgraph with exactly `k` vertices and at least `y` edges? When I look at this problem I instantly think of the tree of reductions that we went over in the last few articles. So right away I'm thinking of the graph-specific configurations: independent set, vertex cover, and clique. The key in choosing these three is in the word _dense_...which of the graph types in the reductions list deals with dense configurations? My first instinct is clique because of how we want every vertex to touch every other vertex (think back to the example of the pentagram in the pentagon). So if we can reduce a dense subgraph into a clique, and we know that a clique is part of the NP-complete hardness tree, then we have our solution. A clique is a fully-connected graph. So if the dense subgraph is a subset of the connections in a clique, your dense subgraph generalizes to clique and therefore the problem is also NP-complete. This is called a [K-clique densest subgraph](https://en.wikipedia.org/wiki/Dense_subgraph) if you want to know more about the formal proof. ## Nearing the end: tips for proving hardness We've come a _long_ way in our journey of understanding data structures and algorithms. At this point we're entering territory where programming languages no longer matter; these theoretical algorithms have no real solutions so we can't map them to code samples. Whether you're studying this for fun or for interviews you may reach this final stage where you need to understand how to solve for uncharted territory. Here are a few tips on proving algorithmic hardness: ### Choose a good, simple NP-hard graph The first and most important thing is to identify the right kind of NP-complete problem. Knowing the chart of problem translations is extremely handy in these situations.
Here are a few, based on how much you think you can fit in your head: * [A simple one](https://upload.wikimedia.org/wikipedia/commons/8/89/Relative_NPC_chart.svg) * [A bigger graph](https://www.edwardtufte.com/bboard/images/0003Nw-8838.png) * [The motherlode of current hard problems](https://adriann.github.io/npc/dot.png) If you look at the Daily Problem, much of the effort came down to choosing the right source problem. Clique was clearly the right choice for K-clique densest subgraph. But even then, you could still find a reduction using vertex cover or 3-SAT; it would just be harder to make the translation from the source to the destination problem. For most of these problems, you pretty much only ever need to **start with 3-SAT, vertex cover, Hamiltonian path, or integer partition**: * Worried about order or scheduling like a calendar? Start with _Hamiltonian path_. * Dealing with selection and filtering? Start with _vertex cover_, though _clique_ and _independent set_ are obvious offshoots if you can see the distinction. * How about combinatorics or large numbers? _Integer partition_ is what you want. * For everything else? _3-SAT_ is at the top of the NP-complete food chain and is extremely versatile. ### Constrain the problem as much as possible Constraints are good. An undirected graph becomes a directed graph. An unweighted graph adds weights. We did this when reducing Hamiltonian cycle into TSP, turning unweighted edges into weighted ones and equalizing the cost of each edge. Your constraints should inject harsh penalties for deviating from your reduction. When we reduced 3-SAT into integer programming, the maximization function and target value all became inconsequential. To ensure we didn't let that get in our way, we had to neutralize the maximization function to the identity function and set the target to `0`. Any other set of values would make it more difficult to prove.
Once you have the constraints, _force a decision_. Make it so that the choice is between A or B and never A _and_ B. ### Reductions are easier than algorithms but keep your options open When your head is in the NP-complete or NP-hard space you automatically assume that the problem is unsolvable with an algorithm. However, some problems with exponential search spaces can still be attacked with techniques like dynamic programming to bring the running time back down to something efficient. In other cases, you can use approximation algorithms with heuristics to turn NP-hard problems into polynomial-time solutions. Maximum clique, minimum vertex cover, and the Knapsack problem all benefit from approximation algorithms to arrive at an approximately correct but efficient result. ## Final thoughts This project of auditing the class for _The Algorithm Design Manual_ was truly a labor of love. Learning by teaching is a great way to really cement your understanding of the materials in this series. If you used this to help you understand data structures and algorithms for an upcoming interview, I want to leave you with a selection of problems over the whole book as sort of a final exam to test your knowledge. These problems are specifically designed for those interviewing at companies, so if you can tackle these, you'll be in a great spot. Good luck and thanks for sticking with me! 1. Problem 1-29. 2. Problem 1-31. 3. Problem 1-33. 4. Problem 2-43. 5. Problem 2-45. 6. Problem 2-47. 7. Problem 2-49. 8. Problem 2-51. 9. Problem 3-19. 10. Problem 3-21. 11. Problem 3-23. 12. Problem 3-25. 13. Problem 3-27. 14. Problem 3-29. 15. Problem 4-41. 16. Problem 4-43. 17. Problem 4-45. 18. Problem 5-31. 19. Problem 7-15. 20. Problem 7-17. 21. Problem 7-19. 22. Problem 8-25.
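As one last code sample before we close: the approximation route mentioned above can be made concrete. Here is a minimal JavaScript sketch (the function name is my own) of the classic greedy 2-approximation for minimum vertex cover: scan the edges, and whenever an edge is uncovered, take both of its endpoints.

```js
// Greedy 2-approximation for minimum vertex cover.
// edges: array of [u, v] pairs. Returns a set of vertices that
// covers every edge and is at most twice the optimal size.
const approxVertexCover = (edges) => {
  const cover = new Set();
  for (const [u, v] of edges) {
    // If neither endpoint is already in the cover, this edge is
    // uncovered: take BOTH endpoints (hence the factor of 2).
    if (!cover.has(u) && !cover.has(v)) {
      cover.add(u);
      cover.add(v);
    }
  }
  return cover;
};

// Path graph 1-2-3-4: the optimal cover is {2, 3} (size 2), and
// the approximation is guaranteed to return at most 4 vertices.
const cover = approxVertexCover([[1, 2], [2, 3], [3, 4]]);
```

Taking both endpoints is exactly what guarantees the factor of two: any optimal cover must contain at least one endpoint of each edge we picked.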
]]> 2018-11-22T20:17:00+00:00 Harder Intractable Reductions https://www.adamconrad.dev/blog/hard-reductions/ Thu, 22 Nov 2018 20:17:00 +0000 https://www.adamconrad.dev/blog/hard-reductions/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/easy-reductions/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202). ## Answers to the previous Daily Problem > Suppose we are given a subroutine which can solve the traveling salesman decision problem in, say, linear time. Give an efficient algorithm to find the actual TSP tour by making a polynomial number of calls to this subroutine. First we need to find the length `l` of a shortest TSP tour (we can binary search for `l` by asking the decision subroutine whether a tour of at most a given length exists). Then we just go over every edge on the graph and see if, when we remove that edge, our shortest TSP tour is still `l`. If it is, that edge wasn't part of the path so we remove the edge completely. The end result is a graph containing only the edges that make up a shortest tour of length `l`. We only made a linear number of calls (in the number of edges) to our TSP decider since we just went over each edge in the graph once. ## Two examples of harder reductions Once again, a reduction is simply a way of turning one concept into another; _reducing_ it into a simpler/different problem. For more difficult reductions, we don't have the straightforward algorithms that we had from our previous post, so we'll need a bit more creativity. ### 1. Integer Programming & 3-SAT I have to be honest, the rules for integer programming seem oddly arbitrary and kind of contrived. Integer programming in the context of combinatorial optimization has to obey 4 rules: 1. Variables are integers 2. You've got some inequality constraints over those variables 3.
A maximization function over some (or all) of those variables 4. An integer target you need to hit Again, it's best to use JavaScript to provide a concrete example:

```js
const sampleIntegerProgramming = (a, b) => {
  // constraint 1
  if (a < 1) return false;
  // constraint 2
  if (b < 0) return false;
  // constraint 3
  if (a + b > 3) return false;
  const maximizationFn = b * 2;
  const target = 3;
  return maximizationFn >= target;
}

sampleIntegerProgramming(0, 0); // false - invalid constraint 1
sampleIntegerProgramming(2, 2); // false - invalid constraint 3
sampleIntegerProgramming(1, 0); // false - maximization misses target
sampleIntegerProgramming(1, 2); // true - all rules satisfied
```

As you might expect, you can reach impossible situations. If you just change your `target` to a value of `5`, there is no possible set of integer arguments that would return a value of `true`. We can reduce 3-SAT into integer programming; given we already know 3-SAT is hard, and we want to prove integer programming is hard, our translation will start with the 3-SAT problem. There aren't as many silly rules for 3-SAT, but we can translate it pretty clearly: 1. Variables are integers, restricted to `0` (false) and `1` (true). 2. Inequality constraints, based on rule 1, keep all variables (and their negations) between `0` and `1`, and force each variable summed with its negation to be at least `1` and at most `1` (in other words, exactly `1`), so exactly one of the pair is true. Further, to ensure truthiness for every expression, the literals in each expression must sum to at least `1`. 3. The maximization function is inconsequential as we've already mapped out all of the rules for 3-SAT, so we can keep it simple and just map this function to the identity function of one of our variables. 4. The target value also doesn't matter so to make sure it never matters we set it to something universally achievable like `0`. With these four rules for IP described in the veil of 3-SAT, we've shown our creative translation.
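To see the translation in miniature, here is a hedged JavaScript sketch (names are mine) of a single 3-SAT clause like `(v1 || !v2 || v3)` rendered as integer constraints: every variable is an integer restricted to `0` or `1`, a negated literal becomes `1 - v`, and the clause holds when its three literals sum to at least `1`.

```js
// One 3-SAT clause (v1 || !v2 || v3) expressed as integer
// programming constraints over 0/1 variables.
const clauseAsIP = (v1, v2, v3) => {
  // Constraint: every variable is an integer restricted to 0 or 1.
  for (const v of [v1, v2, v3]) {
    if (!Number.isInteger(v) || v < 0 || v > 1) return false;
  }
  // The negated literal !v2 becomes (1 - v2), so "at least one
  // literal is true" becomes a sum that must reach 1.
  return v1 + (1 - v2) + v3 >= 1;
};

clauseAsIP(0, 1, 0); // false - the one falsifying assignment (v1=F, v2=T, v3=F)
clauseAsIP(1, 1, 0); // true - satisfied via v1
```

Any satisfying 0/1 assignment makes the sum at least `1`, and the single falsifying assignment drives it to `0`, which is exactly the behavior the clause has in boolean form.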
We know every satisfiable 3-SAT instance produces a satisfiable IP because, under these 4 rules, every clause in 3-SAT must have an answer of `1` (true). The inequality constraints ensure each expression's sum is greater than or equal to `1`. With a target of `0` we always satisfy our target, since a satisfied 3-SAT problem can never offer a result of false (`0`). Flipping it around to have any IP problem provide a solution to a 3-SAT problem, we can say that we know all integer variables can only be `1` and `0`. Any `1` is a `true` in the 3-SAT problem. Any `0` is a `false` in the 3-SAT problem. These assignments simply need to satisfy all expressions in a 3-SAT problem to ensure its validity. And just like that, we prove the reduction works both ways and show that integer programming is an NP-hard problem just like 3-SAT. ### 2. Vertex Cover & 3-SAT We saw the definition of vertex cover in the [previous article](/blog/easy-reductions/) and it, too, can be proven hard via a reduction from 3-SAT. 1. Each boolean variable in 3-SAT is now two vertices (one for your variable's value, one for the negated value) in your vertex cover graph connected by an edge. Covering these `n` variable edges takes `n` vertices, since no two of them share a vertex. 2. Each expression in 3-SAT now needs three vertices (one for each variable in a 3-SAT expression). Each of those vertex triples forms a triangle since those variables are linked in a given expression. That gives us a number of triangles equal to the number of expressions in our 3-SAT problem. Two of the three vertices in a triangle are needed to make a vertex cover, so you're looking at `2e` vertices where `e` is the number of expressions in your 3-SAT problem. 3. Construct a graph with `2n + 3e` vertices, joining the graph made in step 1 and the graph made in step 2. What this graph looks like is a wall of variables connected to their negations at their top, and an edge connecting them to their equivalent placement in their 3-SAT triangles.
And with that, we translated 3-SAT into vertex cover. Think of this like a pie chart with a legend. Each triangle makes up the segments of the pie with a different color. Each color points to a value on the legend, saying what that color represents (either the variable or the negated variable). To show this translation is correct, we show, just like the previous example, that one problem solves the other (and vice versa). Every 3-SAT gives a vertex cover because when we have a set of truthful assignments for our expressions, we can find those variables in the "legend" within our vertex graph. Since those assignments are true, we know we are _covering_ one of the triangles. All of the other cross edges are covered by virtue of the fact that our "pie graph" is a triangle. The triangle configuration ensures coverability in one main portion, and the truthiness of our variables ensures we connect the triangles to the legend portion from step 1. Conversely, every vertex cover gives a 3-SAT because in a graph of size `n + 2e` (the necessary size of the vertex cover), we just need to account for all vertices. Per our earlier definition, we need `n` vertices to represent our legend of possible variables. These vertices can define our 3-SAT truth assignment. The remaining `2e` vertices have to be spread across each variable _and_ its negation. In other words, we need to ensure that for a given 3-SAT problem, either you're connecting your cover to the variable used in the expression (e.g. `v1`) or its negation (e.g. `!v2`). If you have a vertex cover, you've got at least one cross edge covered per triangle (representing the 3-SAT expression). In other words, you're satisfying all expressions with a connection, or an evaluation of `true` in 3-SAT for that particular expression. That's a mouthful, but we can go either way, which means we have a reduction from 3-SAT to vertex cover!
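As a sketch of the construction just described (the vertex names are my own: literals as strings, plus three fresh corner vertices per clause), building the `2n + 3e`-vertex gadget graph might look like:

```js
// Sketch of the 3-SAT -> vertex cover gadget construction.
// variables: e.g. ['v1', 'v2']; clauses: arrays of 3 literals,
// e.g. [['v1', '!v2', 'v3']]. Returns the gadget graph as edges.
const satToVertexCover = (variables, clauses) => {
  const edges = [];
  // Step 1: a vertex per variable and per negation, joined by an edge.
  for (const v of variables) edges.push([v, '!' + v]);
  // Step 2: a triangle of three fresh vertices per clause...
  clauses.forEach((clause, c) => {
    const corners = clause.map((_, i) => `c${c}_${i}`);
    edges.push([corners[0], corners[1]], [corners[1], corners[2]], [corners[0], corners[2]]);
    // ...with a cross edge from each corner to the literal it represents.
    clause.forEach((lit, i) => edges.push([corners[i], lit]));
  });
  return edges;
};

// n = 3 variables and e = 1 clause: 3 variable edges + 3 triangle
// edges + 3 cross edges = 9 edges over 2*3 + 3*1 = 9 vertices.
const gadget = satToVertexCover(['v1', 'v2', 'v3'], [['v1', '!v2', 'v3']]);
```

The counting matches the writeup: `2n` literal vertices from step 1 plus `3e` triangle corners from step 2.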
And with that, our *final* daily problem: ## Onto the next Daily Problem > Show that the following problem is NP-complete: for a graph `G` with integers `k` and `y`, does `G` contain a dense subgraph with exactly `k` vertices and at least `y` edges? ## More problems to practice on For harder, more creative reduction problems, check out Problem 9-13 from the homework set. ]]> 2018-11-22T20:17:00+00:00 Easy-ish Intractable Reductions https://www.adamconrad.dev/blog/easy-reductions/ Thu, 22 Nov 2018 20:17:00 +0000 https://www.adamconrad.dev/blog/easy-reductions/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/reductions/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202). ## Answers to the previous Daily Problem > We wish to compute the laziest way to dial given `n`-digit number on a standard push-button telephone using two fingers. We assume that the two fingers start out on the `*` and `#` keys, and that the effort required to move a finger from one button to another is proportional to the Euclidean distance between them. Design and analyze an algorithm that computes in time `O(n)` the method of dialing that involves moving your fingers the smallest amount of total distance. And you thought you could get rid of dynamic programming, didn't you? So we've got 10 numbers on the keypad plus the special characters. In other words, you're asking yourself "how many phone numbers can I construct with two fingers?" This breaks down into a bunch of subproblems like "how many 10-digit numbers can I construct with my fingers on `*` and `#`?"
which is a superset of the configuration where you slide your left finger up by 1 unit and press `7`, or "how many 9-digit numbers can I construct with my fingers on `7` and `#`". That is a superset of the configuration where you slide your left finger up by 1 again and press `4`, or "how many 8-digit numbers can I construct with my fingers on `4` and `#`." I think you see where this is going. To drill this into your brains: the key here is that we see this repetition and _use the previous results via a cache to speed up our program_. Thus, for all of those 10-digit phone numbers that start with `7`, we store "move up 1" as the way to construct a phone number starting with `7`. We never have to worry about figuring that out again for all other numbers starting with `7`. Because we're saving time by increasing the space, we reduce this algorithm down from something exponential to something linear. The algorithm, at a high level, looks like this: 1. Set a default value of `#` and `*` for your starting points. For a 1-digit or 2-digit number, the cost is simply the distance from that key to the new number for one finger (or two fingers). 2. For the remaining digits with a number range from `0` to `9`, store the cost in a 3D matrix `mtx[digit][fromLeft][fromRight]` where the first axis `digit` is the number of digits into the sequence you're in and `fromLeft` (or `fromRight`) is the successive score coming from a certain key on the keyboard given where your finger was previously placed. 3. Return the score of the matrix `mtx` for the `n` digits specified starting with the `*` and `#` keys. Since we're filling one cell at a time in the matrix, this is just a simple run of our `n` digits across a matrix of space `100n`, since each finger can rest on one of `10` keys, giving `100` cells per digit. We can reduce `O(100n)` down to `O(n)`, which is a linear-time algorithm.
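Here's one way that DP could look in JavaScript. This is a sketch under my own naming: instead of the full 3D matrix, it keeps a map from finger placements to the best cost so far, which is the same recurrence with the digit axis rolled into the loop. At each digit we either move the left finger or the right finger.

```js
// Two-finger dialing DP: minimum total Euclidean finger travel.
// Keypad laid out on a 3x4 grid; fingers start on '*' and '#'.
const POS = {
  '1': [0, 0], '2': [1, 0], '3': [2, 0],
  '4': [0, 1], '5': [1, 1], '6': [2, 1],
  '7': [0, 2], '8': [1, 2], '9': [2, 2],
  '*': [0, 3], '0': [1, 3], '#': [2, 3],
};
const dist = (a, b) =>
  Math.hypot(POS[a][0] - POS[b][0], POS[a][1] - POS[b][1]);

const minDialCost = (number) => {
  // states: "left,right" finger placement -> cheapest cost so far
  let states = new Map([['*,#', 0]]);
  for (const digit of number) {
    const next = new Map();
    for (const [key, cost] of states) {
      const [l, r] = key.split(',');
      // either the left finger or the right finger moves to the digit
      for (const [nl, nr, c] of [[digit, r, cost + dist(l, digit)],
                                 [l, digit, cost + dist(r, digit)]]) {
        const k = `${nl},${nr}`;
        if (!next.has(k) || next.get(k) > c) next.set(k, c);
      }
    }
    states = next;
  }
  return Math.min(...states.values());
};

minDialCost('0'); // one finger slides a single key over: cost 1
```

Because one finger always sits on the digit just dialed, only a handful of states are live per step, which is what keeps the whole run linear in `n`.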
## Three examples of easier reductions To reiterate, a reduction is simply a way of turning one concept into another; _reducing_ it into a simpler/different problem. Here are a few more examples that we can extend upon from last time. ### 1. Hamiltonian Cycle & TSP We all know the Traveling Salesman Problem: touring a set of points as cheaply as possible. The Hamiltonian cycle is a similar problem with two major distinctions: 1. The graph is unweighted (i.e. in TSP every edge has a cost, but in HC those costs are all the same) 2. The goal is simply to tour all vertices exactly once. TSP also visits every vertex, but its goal is the cheapest tour, not just finding one Due to their similarities, we can express a Hamiltonian cycle as a reduction into a Traveling Salesman Problem: 1. First, make a copy of your Hamiltonian graph's vertices 2. For every vertex against every vertex (an `n^2` loop), if the pair of vertices forms an edge in your original graph, set the weight of that edge to be `1`. Otherwise, add the edge with a weight of `2`. 3. Use this newly-weighted graph to decide if there is a TSP tour of cost `n` across all `n` of these vertices. We know this works because a TSP tour across all `n` vertices with a cost of `n` can only use edges of weight `1`: since we have to visit all `n` vertices, a tour of cost `n` must take a weight-`1` edge at every step. Reducing that back into the Hamiltonian cycle graph, which has no weights, the tour only uses the edges of weight `1`, which is the cycle we're aiming to find. ### 2. Vertex Cover & Independent Set Still looking at graphs, we see two more similar problems revolving around subsets of vertices in a graph. A _vertex cover_ is a subset of vertices in a graph that touch all edges. Conversely, an _independent set_ is a subset of vertices in a graph that don't touch each other via an edge.
For any graph, coloring in the vertices of a vertex cover looks like the inverse of what an independent set would be: the vertices outside a cover can't have edges between them, so they form an independent set. So, to handle the reduction, thus proving that both problems are just as hard to tackle, we've got a pretty straightforward algorithm: 1. Make a copy of the vertex cover graph 2. Take `n - k` vertices, where `k` is the number of vertices in your vertex cover subset 3. Find an answer to the independent set problem using your copied graph and that vertex count The important thing here is we transform the inputs, not the solution. In fact, we don't even need to know what the solution is. Simply by translating the inputs into the independent set problem we are able to solve for the hardness of vertex cover. Oh, and to make things more fun: we reduced Hamiltonian cycle into TSP, and Hamiltonian cycle and vertex cover are tied together by reductions too, just like vertex cover and independent set. ### 3. Clique & Independent Set For our last example we look at cliques in graphs. Yes, like social cliques of friends, graphs can contain a clique, which is to say a subset of vertices where each vertex has an edge to every other vertex in that subset. An example of a five-vertex clique would be the [pentagram](https://en.wikipedia.org/wiki/Pentagram#/media/File:Regular_star_polygon_5-2.svg) formed within a pentagon. Each corner of the pentagon connects to its adjacent vertices to form the outer shape. To solidify the clique, you now create edges to all of the other vertices that aren't part of the pentagon's outer shape. And what do you know, a clique is just a reduction of an independent set. 1. Copy your independent set graph, except the edges in this copy are the inverse of the edges in your independent set. In other words, if the edge `(i,j)` is not in your original graph, add that edge to your copy. 2. Find an answer to the clique problem with the same vertex subset on your new graph As we can see here, we can imply the hardness of clique by the hardness of independent set, and thus the hardness of vertex cover.
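The complement trick in step 1 is easy to sketch in JavaScript (function names are mine): invert the edges, then ask the clique question of the new graph.

```js
// Reduction sketch: a set S is independent in G exactly when
// S is a clique in the complement of G (the inverted edges).
const complement = (vertices, edges) => {
  const has = new Set(edges.map(([u, v]) => `${u},${v}`));
  const out = [];
  for (const u of vertices)
    for (const v of vertices)
      // add every pair that is NOT an edge in the original graph
      if (u < v && !has.has(`${u},${v}`) && !has.has(`${v},${u}`))
        out.push([u, v]);
  return out;
};

const isClique = (subset, edges) => {
  const has = new Set(edges.flatMap(([u, v]) => [`${u},${v}`, `${v},${u}`]));
  return subset.every((u) => subset.every((v) => u === v || has.has(`${u},${v}`)));
};

// In the path 1-2-3, {1, 3} is an independent set...
const comp = complement([1, 2, 3], [[1, 2], [2, 3]]);
// ...so it is a clique in the complement, whose only edge is [1, 3].
```

The inputs are transformed, never the solution, which is exactly the point made above.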
Where does this chain of hardness reductions end? That's where satisfiability comes in. ### Satisfiability: the one reduction to rule them all NP-completeness derives all the way down to satisfiability. Any hard problem can be reduced down into a series of graph constraints like the three we saw above. Those problems all eventually reduce down to satisfiability. We _satisfy_ a statement when at least one truth exists in every expression of the program (each expression ORs its variables together, and the statement ANDs the expressions). This is super easy to describe with JavaScript:

```js
const isSatisfied = (v1, v2) => {
  const expr1 = v1 || !v2;
  const expr2 = !v1 || v2;
  return expr1 && expr2; // if this returns true, we're satisfied
};

isSatisfied(true, true);   // satisfied
isSatisfied(false, false); // satisfied
isSatisfied(true, false);  // not satisfied
```

In the above example we can find configurations for `v1` and `v2` that satisfy the final boolean expression. Certain problems, no matter how hard you try, cannot be satisfied:

```js
const neverSatisfied = (v1, v2) => {
  const expr1 = v1 || v2;
  const expr2 = v1 || !v2;
  const expr3 = !v1;
  return expr1 && expr2 && expr3; // never satisfied
};
```

Go ahead and try the above code in your console. No matter how you arrange `v1` and `v2`, you will never get `neverSatisfied` to return a value of `true`. One of the elementary reductions from satisfiability (the parent to vertex cover) is 3-Satisfiability, or 3-SAT. This is just a specialized form of the above but every expression must contain _exactly 3 variables_. And since we know satisfiability is hard, we know 3-SAT is hard, too. This can extend for all numbers larger than 3, so a statement made up of four-variable expressions is called 4-SAT, a statement with all five-variable expressions is called 5-SAT, and so on. Why does this only go up and not down in numbers? With one variable in 1-SAT, if you set that variable to `true` you have a trivial problem.
For 2-SAT, you can solve it in [linear time with depth-first search](https://en.wikipedia.org/wiki/2-satisfiability), a polynomial-time algorithm and thus not in the realm of non-polynomial hard problems. From the _n_-SAT problems derive vertex cover and all of the other previous elementary reductions, so we'll stop there. ## Onto the next Daily Problem > Suppose we are given a subroutine which can solve the traveling salesman decision problem in, say, linear time. Give an efficient algorithm to find the actual TSP tour by making a polynomial number of calls to this subroutine. ## More problems to practice on For easier reduction problems, check out these from the homework set: 1. Problem 9-8. 2. Problem 9-10. 3. Problem 9-11. 4. Problem 9-12. ]]> 2018-11-22T20:17:00+00:00 Introduction to NP-Completeness and Intractable Problems https://www.adamconrad.dev/blog/reductions/ Thu, 22 Nov 2018 20:17:00 +0000 https://www.adamconrad.dev/blog/reductions/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/dynamic-programming-part-3/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202). ## Answers to the previous Daily Problem > Eggs break when dropped from great enough height. Specifically, there must be a floor `f` in any sufficiently tall building such that an egg dropped from the `f`th floor breaks, but one dropped from the `(f − 1)`st floor will not. If the egg always breaks, then `f = 1`. If the egg never breaks, then `f = n + 1`. > You seek to find the critical floor `f` using an `n`-story building. The only operation you can perform is to drop an egg off some floor and see what happens. You start out with `k` eggs, and seek to drop eggs as few times as possible. Broken eggs cannot be reused.
Let `E(k,n)` be the minimum number of egg droppings that will always suffice. > (a) Show that `E(1,n) = n`. > (b) Show that `E(k,n) = Θ(n^(1/k))`. > (c) Find a recurrence for `E(k,n)`. What is the running time of the dynamic program to find `E(k,n)`? Before we completely leave dynamic programming, let's keep it fresh in our minds and think about how we can break this problem down. ### Answer to part A To solve part A, we can intuit very simply that one egg requires starting at the bottom floor and, at worst, having to travel all of the way up each and every floor to prove what floor the egg breaks (if at all). Thus, 1 egg requires at most `n` floors. ### Answer to part B To solve part B, we generalize the sectioning idea below: with `k` eggs, drop the first egg at intervals of `n^((k-1)/k)` floors. That takes at most `n^(1/k)` drops to narrow the critical floor down to a block of `n^((k-1)/k)` floors, which we then search the same way with `k-1` eggs. Each egg contributes `O(n^(1/k))` drops, giving `E(k,n) = Θ(n^(1/k))`. ### Answer to part C Finally, to solve part C, let's think of how we would solve this instinctually. If we have 100 floors and 5 eggs, it might initially make sense to just break the building up into 20 equal sections of floors. You may not get an exact match with such limited eggs but you know you'll be able to cover the building with all of your eggs at your disposal. Diving in, let's say that section 1 (floors 1 through 20 in our example) doesn't break the egg, but section 2 (floors 21 through 40) does. We now have `k-1` eggs to explore a subproblem of this greater problem, because now we know our bounds have been reduced from 1 to 100 down to 21 to 40. You repeat this process, continuing to subdivide sections with your remaining eggs. Generalizing this problem, we can say that, in the worst case, it will take us `n/s + s` attempts, where `s` is the size of each section across the total number of floors `n`: `n/s` drops of the first egg to find the right section, plus up to `s` drops within that section. What is the minimum section size we can use to reduce the number of drops?
Because we said min (or max) in that last question, this should harken back to [differential calculus](https://math.stackexchange.com/questions/321819/differential-problem-find-the-maximum-and-minimum-value). So if `f(s) = s + n/s`, then our derivative function is `f'(s) = 1 - n/s^2`, set to zero is `0 = 1 - n/s^2`. Rearranging this, we can say that `1 = n/s^2`, and therefore our section size is `s = sqrt(n)`. To check whether this equation gives the minimum or the maximum, we now need to take the second derivative of that equation, which is `f''(s) = 2n/s^3`. The rule states that if `s` is positive, we have `f''(s) > 0` and we have a minimum. Since you can't have a negative number of sections for these floors, we now know we've reached our minimum floor section size. If that doesn't make sense, check out that Math Stackexchange link for a deeper explanation of how that differential equation works. Going back to our original equation, since we know that a section is expressed as a fraction of `n`, we can then turn our equation into using only 1 variable, which means that `n/s + s` is now `n/sqrt(n) + sqrt(n)`. `n/sqrt(n)` is just `sqrt(n)`, so now it's `sqrt(n) + sqrt(n)`, or `2sqrt(n)`, the maximum number of tries we need to find our solution. All of that said, this means `E(k,n) = min over all x of ( max(E(k,n-x), E(k-1,x-1)) + 1 )`. What this is saying is that if you drop an egg and it doesn't break, we know the critical floor must be above where you're at and you still have your egg intact. The first function in the `max` equation is saying you would run this function with `n-x` remaining floors and the same number of `k` eggs. Conversely, if the egg breaks, we know the critical floor has to be below your current floor `x`, so that second `max` equation is saying you would run this function `E` with one less egg and the max floor being `x-1`. We need to take the worst-case scenario here which is why we are calling `max` over these two possibilities.
We add in `+1` because you need to consider the cost of throwing the egg, which is a constant cost. Finally, for all possible `x` values to try against, we want to minimize the cost of these tries, which is why we wrap the _whole thing_ in a `min` function. To compute the runtime of this dynamic programming algorithm, we know that `E(k, n)` runs across all `k` eggs and all floors `n` on a minimization function summing across all values of `n`, which reduces down to `O(kn^2)`. ## Reductions: turning one problem into another If I had to summarize this section, it would be that *reductions are ways of expressing that two problems are identical.* If you can find a fast solution for problem A, it means, through transformation, you can find a fast solution for problem B. ### Examples #### Closest pair of integers One simple example is finding the closest pair of numbers in an array. The naive implementation says compare the difference across all pairs of numbers, which is `O(n^2)`. The translation is to _sort the array first_ since the closest pairs will be adjacent to each other and therefore we can run the difference comparison in a linear fashion, bringing the complexity down to `O(n log n)`. #### Longest common subsequence Another example we can utilize from our [dynamic programming section](/blog/dynamic-programming-part-2/) involves the Longest Common Subsequence. We started that section off by talking about Edit Distance (i.e. spell checking resolves to either adding/removing/replacing/keeping a letter in a word). In many ways, LCS is just a _translation_ of Edit Distance: first we sort the numbers (again, by sorting we ensure the sequence is ever increasing), set `insertion` or `deletion` of numbers (to string together the longest sequence) to a cost of 1, set the `substitution` cost to infinity (since we don't allow substitutions in this problem), then simply subtract the edit distance, divided by two, from the sequence length to get the length of the LCS.
Why two? Every insertion or deletion has to happen as a pair; so the actual distance is about a "move" operation which doesn't affect the length of the array. #### Least common multiple & greatest common divisor Finally, a really common pairing of problems is in dealing with pairs of integers: finding their least common multiple (e.g. 24 and 36 have an LCM of 72) and greatest common divisor (e.g. for those same numbers the GCD is 12). In fact, we can use one to derive the other: if we take two numbers, say `x` and `y`, multiply them together, and divide by their greatest common divisor, you get the least common multiple! Just like our previous two examples where sorting _translates_ a hard problem into an easier one, finding the GCD _translates_ the LCM problem from an intractable problem to a very easy one. Normally you'd need the prime factorization which cannot be done efficiently. But by deriving the GCD using the [Euclidean algorithm](https://en.wikipedia.org/wiki/Euclidean_algorithm) we can turn this into an extremely efficient algorithm, logarithmic in the smaller of the two numbers! We'll pursue harder examples in the following posts, but for now let's move on to the next Daily Problem: ## Onto the next Daily Problem > We wish to compute the laziest way to dial given `n`-digit number on a standard push-button telephone using two fingers. We assume that the two fingers start out on the `*` and `#` keys, and that the effort required to move a finger from one button to another is proportional to the Euclidean distance between them. Design and analyze an algorithm that computes in time `O(n)` the method of dialing that involves moving your fingers the smallest amount of total distance. ## More problems to practice on The [last homework set for the class](http://www3.cs.stonybrook.edu/~skiena/373/hw/hw5.pdf) also covers chapter 9 on reductions, so here are your introductory problems into this space: 1. Problem 9-1. 2. Problem 9-2.
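As a quick code note, the GCD-to-LCM translation from above fits in a few lines of JavaScript (function names are mine):

```js
// Euclidean algorithm: repeatedly replace the pair with the
// smaller number and the remainder until the remainder is zero.
const gcd = (x, y) => (y === 0 ? x : gcd(y, x % y));

// The reduction: lcm(x, y) = (x * y) / gcd(x, y), so a hard-looking
// problem (LCM via prime factorization) becomes an easy one.
const lcm = (x, y) => (x * y) / gcd(x, y);

gcd(24, 36); // 12
lcm(24, 36); // 72
```

Just like the sorting examples, all the work lives in the translation step: once `gcd` is cheap, `lcm` comes along for free.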
]]> 2018-11-22T20:17:00+00:00 Dynamic Programming in JavaScript Part 3 - Limitations https://www.adamconrad.dev/blog/dynamic-programming-part-3/ Thu, 08 Nov 2018 23:30:00 +0000 https://www.adamconrad.dev/blog/dynamic-programming-part-3/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/dynamic-programming-part-2/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202). ## Answers to the previous Daily Problem > A certain string processing language allows the programmer to break a string into two pieces. It costs `n` units of time to break a string of `n` characters into two pieces, since this involves copying the old string. A programmer wants to break a string into many pieces, and the order in which the breaks are made can affect the total amount of time used. For example, suppose we wish to break a 20-character string after characters 3, 8, and 10. If the breaks are made in left-right order, then the first break costs 20 units of time, the second break costs 17 units of time, and the third break costs 12 units of time, for a total of 49 steps. If the breaks are made in right-left order, the first break costs 20 units of time, the second break costs 10 units of time, and the third break costs 8 units of time, for a total of only 38 steps. Give a dynamic programming algorithm that takes a list of character positions after which to break and determines the cheapest break cost in `O(n^3)` time. As always, we're going to approach this with the three step process outlined before for solving dynamic programming problems: 1. Describe the problem recursively 2. Ensure parameters are polynomial and not exponential 3. 
If so, we can store partial results along the way to be efficient. Can we describe this recursively? We know we want to compute something along the lines of `breakCost(str, start, end)` which says "find the minimum break cost of string `str` starting at position `start` and ending at position `end`". Taking a step back, we know each split costs `n`. With `n` characters we could possibly break after, exhaustively searching the costs takes `O(n^2)` work, because we need to create a partial solution matrix that fills in the cost for all possible breaks at all possible positions. Just like partitioning books or comparing words for edit distance, the cubic time comes in after we've created our matrix: we calculate the cost of our cached previous partition combined with the next break, and find the minimum of the two _for all positions from our start position to the end position_ in our `breakCost` function, which is also some fraction of length `n`. So given we've had so many examples of dynamic programming working, when doesn't it work?

## Algorithmic limitations of dynamic programming

In general we must use algorithms that are both correct and efficient. What conditions cause dynamic programming to fail on either count?

### In correctness

**Dynamic programming fails correctness if the recurrence relation cannot be expressed.** Imagine we wanted to find the most _expensive_ path from point A to point B without visiting a location more than once. Can we do it with dynamic programming? One problem is we can never enforce the _without visiting a location more than once_ part. When you create a recurrence relation for this problem you're trying to compute a future cost for all remaining locations against your previously-stored solution up to some intermediate point. But in dynamic programming, computing that max cost could include points you've already visited. You could very easily create a cycle and an infinite loop here.
The other problem is ordering. Sure, you know you start at A and end at B, but nothing dictates the order in which we transition from point to point between A and B - is it biggest to smallest? Left to right? Without an order we also have the opportunity to loop infinitely if the cost pushes us in a direction that is cyclical. As we saw in step 3 of our dynamic programming checklist, if we can't utilize a partial solution to help compute the state after the partial solution, the caching doesn't get us anywhere. In all of our previous examples we have an implicit ordering to evaluate things like strings or arrays. But when you have something like an undirected graph such as a network of cities on a map, you lose that order and you can't create a reliable recurrence relation.

### In efficiency

**Dynamic programming fails efficiency when the input has no ordering.** Again, without order, we don't have a reliable recurrence relation. That means we can't express the solution in polynomial time, only in full exponential time. Since the partial results do nothing for us, our recursion is unbounded, and recursion blows up to exponential time in the worst case (see our naive Fibonacci algorithm in [part 1](/blog/dynamic-programming-part-1/) for proof). So yeah...if you have maps, networks, graphs without any sort of ordering...don't use dynamic programming.

## Concluding on Dynamic Programming

Dynamic programming is hard. You need to look at this stuff a bunch of times before it clicks. If you still need more examples, check out [this page](https://people.cs.clemson.edu/~bcdean/dp_practice/) for 10 more problems to practice. If you follow the three-step process of defining the recurrence relation, expressing the parameters in polynomial time, and storing the partial results in an array/matrix, you can pretty much solve any dynamic programming problem.
With enough examples and banging your head against the wall, these will start to become second nature (but you've got to practice!). And with that, onto the Daily Problem...

## Onto the next Daily Problem

> Eggs break when dropped from great enough height. Specifically, there must be a floor `f` in any sufficiently tall building such that an egg dropped from the `f`th floor breaks, but one dropped from the `(f − 1)`st floor will not. If the egg always breaks, then `f = 1`. If the egg never breaks, then `f = n + 1`.
> You seek to find the critical floor `f` using an `n`-story building. The only operation you can perform is to drop an egg off some floor and see what happens. You start out with `k` eggs, and seek to drop eggs as few times as possible. Broken eggs cannot be reused. Let `E(k,n)` be the minimum number of egg droppings that will always suffice.
> (a) Show that `E(1, n) = n`.
> (b) Show that `E(k,n) = Θ(n^(1/k))`.
> (c) Find a recurrence for `E(k,n)`. What is the running time of the dynamic program to find `E(k, n)`?

## More problems to practice on

To wrap up dynamic programming, the last problems in the chapter worth checking out from the homework set include:

1. Problem 8-18.
2. Problem 8-19.

]]> 2018-11-08T23:30:00+00:00 Dynamic Programming in JavaScript Part 2 - Examples https://www.adamconrad.dev/blog/dynamic-programming-part-2/ Tue, 06 Nov 2018 23:30:00 +0000 https://www.adamconrad.dev/blog/dynamic-programming-part-2/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/dynamic-programming-part-1/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202).
## Answers to the previous Daily Problem

> Suppose you are given three strings of characters: `X`, `Y`, and `Z`, where `|X| = n`, `|Y| = m`, and `|Z| = n + m`. `Z` is said to be a _shuffle_ of `X` and `Y` iff `Z` can be formed by interleaving the characters from `X` and `Y` in a way that maintains the left-to-right ordering of the characters from each string.
> (a) Show that `cchocohilaptes` is a shuffle of `chocolate` and `chips`, but `chocochilatspe` is not.
> (b) Give an efficient dynamic-programming algorithm that determines whether `Z` is a shuffle of `X` and `Y`. Hint: the values of the dynamic programming matrix you construct should be `Boolean`, not numeric.

The way I'm approaching this is thinking about how my brain would solve it before finding the recurrence relation. So to answer A: Two passes over the string array `Z`. First pass you are picking off letters from `X` until `X` is empty (i.e. so find `c`, now `X` is `hocolate`). If `X` is empty (meaning all letters were found _in order_), start over and run through `Z` again with the same heuristic but look for `Y` instead. If both `X` and `Y` are empty, you have found the shuffle. This will run in `O(Z)` time since you pass over `Z` twice. You won't find this shuffle with `chocochilatspe` because when `chips` is down to `ps` it will find `p`, then `s` will be left over, and the two variables `X` and `Y` will not be empty. Answering B is, of course, the real exercise Skiena is looking for us to undergo. Remember, determining a dynamic programming problem involves 3 steps:

1. Create a recurrence relation or recursive algorithm to describe the problem.
2. The parameters for your relation should be polynomial (i.e. recursion can take exponential time; if the number of distinct solutions is less than exponential, there are polynomially many possible solutions and it's cheap enough to pursue)
3.
Organize your algorithm so that you can leverage your cached partial results when you need them, trading space to save time. So answer A is not recursive...or is it? I now realize that the "picking off" of the variables can be done recursively by simply providing the function `X`, `Y`, and `Z` with the first letters chopped off. It might make more sense if I just write down the recursive algorithm now:

```js
const weave = (str, z, newIdx = 0) => {
  if (!str.length) return z;
  for (let i = newIdx; i < z.length; i++) {
    if (str[0] === z[i]) {
      // storing the partial results into the arguments of the recursive function = DP!
      return weave(str.slice(1), z.slice(0, i).concat(z.slice(i + 1)), newIdx);
    } else {
      newIdx++;
    }
  }
  return false;
};

const isShuffle = (x, y, z) => {
  return y === weave(x, z);
};
```

Before I explain the algorithm, the fact that I created a recursive solution with a defined number of arguments proves parts 1 and 2 of the dynamic programming requirements, so the real trick is to explain part 3. The first trick I realized in writing this is that when you pick off all of the letters in `X`, if you have a shuffle, all that is left is `Y`, so we only ever need to "weave" in letters from `X` into `Z`! That means if the first word is picked off of the joint word, you can just compare the two strings for equality. Now for the recursion. To recurse, we store the smaller versions of `X` and `Z` in the arguments of the function. To maintain order, we have to always check the first letter of `X` or else we don't maintain the rules of the shuffle. The annoying part comes with the larger word. To store the partial results, this came down to removing the matched letter from the larger `Z` string _and_ ensuring that when we start our `for` loop again within our recursed `weave` we move along from where we left off. This ensures that we only actually have to run this algorithm in `O(Z)` in one pass rather than the two passes in answer A.
In fact, if your `X` is solved before `Z` is empty, you already have `Y` so in the best case scenario `Z` is just `X.concat(Y)` in order. Therefore, you can actually solve this in the best-case sublinear time of `Ω(Z-Y)`. Having fun yet? Let's keep exploring with more examples! ## Example 1: binomial coefficients Everyone uses Fibonacci to talk about Dynamic Programming. The problem is if you only ever hear Fibonacci, your brain will train itself to remember the answer to calculating Fibonacci, and not necessarily how to apply Dynamic Programming. Let's take another example from the book: binomial coefficients. This is just another way of saying: > How many different ways can I count something? So if I have the numbers `1,2,3,4` and I want to pick 2 numbers from them, how many pairs will that generate? The answer is 6, because I can grab `(1,2)`, `(1,3)`, `(1,4)`, `(2,3)`, `(2,4)`, and `(3,4)`. A technique for quickly generating these coefficients is [Pascal's triangle](https://en.wikipedia.org/wiki/Pascal%27s_triangle) which you can read on your own for fun. Suffice it to say, it looks an awful lot like how you would generate a Fibonacci sequence. In this case, it's the sum of two numbers directly above it in the triangle. We can map this idea out as a function for our binomial coefficient pretty quickly in code. Remember that for Fibonacci we iterated on our code three times: once with recursion, once with recursion and memoization, and finally a third with in-place memoization over a loop. 
To illustrate that we don't need recursion to make Dynamic Programming work, let's see binomial coefficients mapped out as a loop:

```js
const binomialCoefficient = (n, k) => {
  const bc = [];
  // fill along the edges of the triangle
  for (let i = 0; i <= n; i++) {
    bc[i] = [];
    bc[i][0] = 1; // only one way to choose zero items
    bc[i][i] = 1; // only one way to choose every item
  }
  // fill in the middle
  for (let i = 1; i <= n; i++)
    for (let j = 1; j < i; j++)
      bc[i][j] = bc[i - 1][j - 1] + bc[i - 1][j];
  return bc[n][k];
}
```

You'll see that the bigger `for` loop is just the definition of the [recurrence relation](https://en.wikipedia.org/wiki/Recurrence_relation#Binomial_coefficients) for binomial coefficients. You can see there's a pretty similar pattern here to the last Fibonacci algorithm we drew up. First, we fill in the default values and edge cases. Next, we iterate through our numbers to precompute values in an array. Finally, we draw values from that array to grab our coefficient value. The only real difference is that we have to supply two values now instead of one, and the math of computation is slightly varied given that the recurrence relation isn't exactly the same as that of Fibonacci. So these two examples are pretty similar. For my last trick, let's go in a completely different direction to really cement this idea.

## Example 2: Checking for spelling errors

Did you know that you can create a spellchecker with Dynamic Programming? Yeah, you can! It's the same formula: try recursion, then memoize, then optimize. So what happens when you spell something? You're dealing with 4 scenarios for the spellchecker to validate:

1. **It matched.** Each letter of `soil` is evaluated and compared to be correct.
2. **It swapped.** For `soel`, you have to swap the `e` for an `i` when you get there.
3. **It added.** For `sol`, you have to add in the `i` when you get to `l` too early.
4. **It removed.** For `soile`, you have to remove the `e` when you expected the string to end.
And all but the first scenario carries a cost, since the correct string requires no changes but each of the other three requires a change. To map this out, we have a function that takes our correct word (from something like a dictionary) and compares it to the word we typed out. To do this we're going to create a matrix of all possibilities for how to get from a letter in our target word to a letter in our compared word. I'll write it all out now, then we can explain it a bit more afterwards:

```js
const compareWords = (targetWord, comparedWord) => {
  const costs = [0, 0, 0]; // each slot is associated with a MATCH, INSERT, or DELETE cost
  const matrix = [];
  let i, j, k; // counters

  // initialize the matrix 2D array
  for (i = 0; i <= targetWord.length; i++) {
    matrix[i] = [];
    for (j = 0; j <= comparedWord.length; j++) matrix[i][j] = {};
  }

  // our first row: building the compared word from nothing costs one insertion per letter
  for (j = 0; j <= comparedWord.length; j++) {
    matrix[0][j].cost = j;
    matrix[0][j].parent = j > 0 ? 'I' : -1; // 'I' for insert; no parent in the top corner
  }

  // our first column: erasing the target word costs one deletion per letter
  for (i = 0; i <= targetWord.length; i++) {
    matrix[i][0].cost = i;
    matrix[i][0].parent = i > 0 ? 'D' : -1; // 'D' for deletion
  }

  // determine costs
  for (i = 1; i <= targetWord.length; i++) {
    for (j = 1; j <= comparedWord.length; j++) {
      // match cost: 0 for an exact match, 1 for a substitution
      costs[0] = matrix[i - 1][j - 1].cost + (targetWord[i - 1] === comparedWord[j - 1] ? 0 : 1);
      // insert cost: always costs 1 to insert a letter
      costs[1] = matrix[i][j - 1].cost + 1;
      // delete cost: always costs 1 to delete a letter
      costs[2] = matrix[i - 1][j].cost + 1;

      // assume the match cost is cheapest at first
      matrix[i][j].cost = costs[0];
      matrix[i][j].parent = 'M'; // for an exact match being made
      for (k = 1; k < 3; k++) {
        if (costs[k] < matrix[i][j].cost) {
          matrix[i][j].cost = costs[k];
          matrix[i][j].parent = k === 1 ? 'I' : 'D';
        }
      }
    }
  }

  // find the final cost at the end of both words
  return matrix[targetWord.length][comparedWord.length].cost;
}
```

So what is this really doing? There are really 4 parts to this solution:

### 1. Matrix initialization

We know that the edges along the rows and columns have a fixed cost at the extremes. When you compare a word to nothing, the cost scales linearly by letter as you add one more. And we know that a blank word carries no cost.

### 2. Cost calculation

Next, we compare the current position in the matrix to some previous adjacent cell. That leaves us with 3 calculations to compute:

1. **Moving up and to the left.** We assume we matched the previous cell (i.e. if we matched, both characters can advance) so all that is left to calculate is if the cell matches or not. It costs 0 for an exact match and 1 if we have to substitute one letter for the other.
2. **Moving up.** Our target character had to advance once without advancing our comparison word. In other words, we had to insert a character, which carries a cost of 1 no matter what.
3. **Moving left.** Conversely, if our target character cannot advance but our comparison word did, we would need to delete a character to reduce matching costs which, again, has a price of 1 no matter what.

### 3. Optimize our path forward

Now that our costs are calculated in comparison to previous adjacent cells, we can find the minimum cost thus far. We just iterate through all of the cost enumerations and pick the cheapest one of the three. From here we can begin moving onto the next cell in the matrix.

### 4. Determine the total cost

Once we've enumerated through every cell in the cost matrix, we simply find the final cost at the very bottom of our matrix, which has compared every letter in our `targetWord` with every letter in our `comparedWord`. That final cost is the cheapest cost to navigate from the target word to our comparison word.
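As a sanity check on the recurrence, here's a compact, cost-only version of the same edit-distance idea (a standalone sketch of my own, without the parent pointers used above):

```js
// Compact cost-only edit distance: same match/insert/delete recurrence,
// but tracking only the numbers, not the path back through the matrix.
const editDistance = (a, b) => {
  const m = [];
  for (let i = 0; i <= a.length; i++) {
    m[i] = [i]; // deleting i characters from `a` costs i
    for (let j = 1; j <= b.length; j++) {
      m[i][j] = i === 0
        ? j // inserting j characters to build `b` costs j
        : Math.min(
            m[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1), // match or substitute
            m[i][j - 1] + 1, // insert
            m[i - 1][j] + 1  // delete
          );
    }
  }
  return m[a.length][b.length];
};

console.log(editDistance('soil', 'soel')); // → 1 (one substitution)
console.log(editDistance('soil', 'sol'));  // → 1 (one deletion)
```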
## String comparisons solve a few problems

The beauty of this solution is that it can really work in a variety of other problems:

### Substring matching

Need to spellcheck just a word within a whole text? Initializing the first row removes the conditional to assign an insertion cost since we're not trying to build one string into the other. Then in step 4, rather than count the cost of the letter in the big text, we just find the cost at the last point in our substring rather than the whole text.

### Longest Common Subsequence (LCS)

This is one of those classic problems in Dynamic Programming: you want to find all of the shared characters between two words if you only advance character-by-character. The example in the book uses `democrat` and `republican` with an LCS of `eca` (i.e. `e` is followed by a `c` is followed by an `a` in each of the two words, and it's of length 3 which is longer than any single letter in both words and is longer than, say, `ra` which is also a subsequence in the two words). Leveraging our existing code, if we just make the cost of a substitution in our match calculation `2` instead of `1`, the cost is disproportionately greater than any other cost and we'd never want to swap out letters. Why is this the solution? Think about it this way: all we ever want are matches. We want all of the other letters out of our way, which either means insertion or deletion just to get the exact sequence of something like `LETTER-e-LETTER-LETTER-c-LETTER-LETTER-a`. The least number of insertions/deletions to form an equal pattern between our target and comparison words gets us our common subsequence.

## Example 3: Partitions

Imagine you have a bunch of books in a library. The head librarian has already given you all of the books organized by number of pages in the book. He then gives you a few book dividers to segment the books into sections, ideally as uniformly as possible. How would you do it?
### A naive approach

Let's say the example is super basic: 2 dividers (3 sections) with 9 books: 100 pages, 200 pages...all the way up to 900 pages. The naive way to handle this is to divide the sum of all pages of all books by the number of sections and add in your divider when your ideal division is exceeded by the next book. So with our example we have 4,500 total pages over 9 books to divide in 3 sections, which makes each section ideally 1,500 pages. The first section reaches exactly 1,500 pages at book 5, so we put a divider there; it helps that we hit the ideal exactly in this section. The next section would creep over the ideal with book 8, so we put the next divider after book 7, even though we only have 1,300 pages in this section. We've used up all dividers, so we're left with our last section having books 8 and 9, which has a division of 1,700 pages. This works because while 1,500 is the ideal, the last two sections have a delta of only 200 pages in either direction, so no section is worse than the other and there's no other possible configuration that would further minimize the maximum distance from the ideal. Of course, this is a pretty ideal example and we haven't applied any dynamic programming principles to achieve this result, which means we know there is more on how to approach this. Can we leverage any of this previous knowledge to improve our algorithm?

### Using dynamic programming

The truth is we need an exhaustive search involving recursion to solve this problem. Describing this problem with recursion and storing partial results along the way gives us our dynamic programming solution. How will we manage this? Remember that to solve the full problem, we should be able to describe a subset of the problem. In other words, if we solved the above example with only 8 books, that should give us information to help solve the total solution with 9 books. And in the other direction, that should be based on a solution when we only have the first 7 books.
What is the cost of adding this last divider? #### Maximum sectional cost That cost is the sum of the remaining elements. The cost we would choose is the maximum of either this current section or the largest section thus far. Why? Either we are breaking new ground and we've forced an increase in our total worst cost per section, or we have remained within the cost cap we have defined in all previous sections. This is like if our last book in our example is 1,000 pages instead of 900 - the cost is 1,800 pages which is 300 pages from our ideal instead of the previous delta which is 200 pages (i.e. 600 + 700 = 1,300 which is 200 pages off of the first section at the ideal 1,500 pages). #### Minimum sectional distance Now we mentioned in the previous example that while we're totalling some maximum, we want to minimize that delta from the ideal, so how do we minimize the total cost per section? Since we have `k` sections (and `k-1` dividers), we need to minimize the cost with the remainder of books if we have `k-1` sections (or `k-2` dividers). Which is to say, we need to know if the previous remainder was trending in the right direction to set us up for a reasonable maximum with our current last divider available. And because we are trying to solve this as a subset of the previous problem, _this is our recurrence relation_. 
We've got enough words here and not enough code, so let's try and write out what we have above:

```js
const printBooks = (books, start, end) => {
  // books are 1-indexed in the algorithm, so shift by one when slicing
  console.log(books.slice(start - 1, end).join(' '));
};

const reconstructPartition = (books, dividers, length, divisions) => {
  if (divisions === 1) return printBooks(books, 1, length);
  reconstructPartition(books, dividers, dividers[length][divisions], divisions - 1);
  return printBooks(books, dividers[length][divisions] + 1, length);
};

const partitionize = (books, divisions) => {
  const length = books.length;
  const allSums = [0];
  const cachedValues = [], dividers = [];
  let cost;

  for (let i = 0; i <= length; i++) {
    cachedValues[i] = [];
    dividers[i] = [];
  }

  // collect the cumulative sums
  for (let i = 1; i <= length; i++) allSums[i] = allSums[i - 1] + books[i - 1];

  // boundary condition - one division is just the total sum
  for (let i = 1; i <= length; i++) cachedValues[i][1] = allSums[i];

  // boundary condition - the first book alone costs the same in any number of divisions
  for (let j = 1; j <= divisions; j++) cachedValues[1][j] = books[0];

  // store partial values as we walk the recurrence relation
  for (let i = 2; i <= length; i++) {
    for (let j = 2; j <= divisions; j++) {
      cachedValues[i][j] = Number.POSITIVE_INFINITY; // can't get worse than the max cost
      for (let x = 1; x <= i - 1; x++) {
        cost = Math.max(cachedValues[x][j - 1], allSums[i] - allSums[x]);
        if (cachedValues[i][j] > cost) {
          cachedValues[i][j] = cost;
          dividers[i][j] = x;
        }
      }
    }
  }
  return reconstructPartition(books, dividers, length, divisions);
};
```

The code here should be self-explanatory: it is meant to work out the algorithm above. The real key is iterating through all of the books and all of the divisions to get the minimum cost after finding the maximum between the previously cached value and the current sum into the partition. If we hit that, increment our dividing line and store the next cached value. Study it a few times and if you have any questions feel free to hit me up on Twitter.
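To sanity-check the recurrence itself, here is a compact, cost-only restatement with a hypothetical `partitionCost` helper (a fresh sketch of my own, separate from the full reconstruction solution above):

```js
// M[i][j] = min over x of max(M[x][j-1], sum of books x+1..i)
// i.e. the cheapest "largest section" using the first i books and j sections.
const partitionCost = (books, k) => {
  const n = books.length;
  const prefix = [0];
  for (let i = 0; i < n; i++) prefix.push(prefix[i] + books[i]);

  const M = Array.from({ length: n + 1 }, () => Array(k + 1).fill(Infinity));
  for (let j = 0; j <= k; j++) M[0][j] = 0; // zero books cost nothing

  for (let i = 1; i <= n; i++)
    for (let j = 1; j <= k; j++)
      for (let x = j - 1; x < i; x++)
        M[i][j] = Math.min(M[i][j], Math.max(M[x][j - 1], prefix[i] - prefix[x]));

  return M[n][k];
};

// the 9-book example from above: the best possible worst section is 1,700 pages
console.log(partitionCost([100, 200, 300, 400, 500, 600, 700, 800, 900], 3)); // → 1700
```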
We've covered a few examples here and there's a bunch to unpack so if you need to read over these a few times that's okay! ## Onto the next Daily Problem > A certain string processing language allows the programmer to break a string into two pieces. It costs `n` units of time to break a string of `n` characters into two pieces, since this involves copying the old string. A programmer wants to break a string into many pieces, and the order in which the breaks are made can affect the total amount of time used. For example, suppose we wish to break a 20-character string after characters 3, 8, and 10. If the breaks are made in left-right order, then the first break costs 20 units of time, the second break costs 17 units of time, and the third break costs 12 units of time, for a total of 49 steps. If the breaks are made in right-left order, the first break costs 20 units of time, the second break costs 10 units of time, and the third break costs 8 units of time, for a total of only 38 steps. Give a dynamic programming algorithm that takes a list of character positions after which to break and determines the cheapest break cost in `O(n^3)` time. ## More problems to practice on For even more dynamic programming practice problems, check out Problem 8-14 in the book. ]]> 2018-11-06T23:30:00+00:00 Star athlete or engineer? The Big Five in tech are paying like the Big Four in sports https://www.adamconrad.dev/blog/the-big-four-sports-or-tech-companies/ Sat, 03 Nov 2018 23:30:00 +0000 https://www.adamconrad.dev/blog/the-big-four-sports-or-tech-companies/ 2018-11-03T23:30:00+00:00 Dynamic Programming in JavaScript Part 1 - The Basics https://www.adamconrad.dev/blog/dynamic-programming-part-1/ Thu, 01 Nov 2018 23:30:00 +0000 https://www.adamconrad.dev/blog/dynamic-programming-part-1/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). 
If you missed the [previous article](/blog/shortest-paths/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202).

## Answers to the previous Daily Problem

> _Multisets_ are allowed to have repeated elements. A multiset of `n` items may thus have fewer than `n!` distinct permutations. For example, `{1,1,2,2}` has only six different permutations: `{1, 1, 2, 2}, {1, 2, 1, 2}, {1, 2, 2, 1}, {2, 1, 1, 2}, {2, 1, 2, 1}, and {2, 2, 1, 1}`. Design and implement an efficient algorithm for constructing all permutations of a multiset.

This question is basically asking "give me all of the shapes of these items". Each permutation is a shape, or arrangement, of the 4 components, 2 of which are similar. One approach would be to leverage backtracking, working backwards from those end results to figure out how you get to the result with `n-1` values left. This is a callback to our previous articles to re-assert your knowledge of backtracking algorithms. I will offer an alternative approach below: Another naive approach is to make an `n^2` loop between the number of possibilities and the number of slots and just insert each item into each slot. So 1 goes in the first slot, then the second slot, then the third...then the second item goes into the first slot, then the second slot...and you build out the possibilities by filling in the remaining options. The problem is that doing the example above would arrive at duplicates (inserting the first 1 into the 4 slots yields the same shape of numbers as inserting the second 1 into the 4 slots). So how do we eliminate these possibilities? **Caching.** As we'll see in this article, when we cache previous results, we can throw out unnecessary moves like constructing an unnecessary permutation. Once we've inserted the four options for the first 1, we don't ever have to calculate them again for any other 1s because they yield the same output.
Same with the 2s. So in the multiset example above, there are 24 different arrangements, duplicated twice (once for the 1 and once for the 2). In other words, we have `6x2x2 = 24`, or 6 permutations that are duplicated once for `1` (x2) and once again for `2` (x2). If that doesn't make sense to you, check out how we handle generalized recurrence relations below.

## Dynamic Programming = Recursion + Memoization

Functions that recurse call themselves. _Memoization_ is just storing previously-computed results. Honestly, the best way to think about this is with a classic example, the Fibonacci Sequence:

```js
const fibonacci = (n) => {
  if (n === 1) return 0;
  if (n === 2) return 1;
  return fibonacci(n - 1) + fibonacci(n - 2);
}
```

All this algorithm says is

> To get the number you're looking for, add up the two previous numbers in the sequence

The first few numbers of Fibonacci are `0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55`. Simple, right? What's the 24th number of Fibonacci? Oof! Yeah, this one gets rough pretty quick (in fact, it is _exponentially_ bad!) So we've got the recursive part in this function, but we're missing the memoization. If we add in memoization, this problem becomes virtually linear in time, with the tradeoff of an array to store previous results:

```js
const fibonacciCached = (n, cache = []) => {
  if (n === 1) return 0;
  if (n === 2) return 1;
  // pass the cache through so every recursive call shares it
  if (!cache[n]) cache[n] = fibonacciCached(n - 1, cache) + fibonacciCached(n - 2, cache);
  return cache[n];
}
```

Literally one extra line to cache the result. The first solution will heat up your laptop if you find the 45th Fibonacci number from how hard it will be working. The latter will run extremely fast. All we did was make sure that if we've found this number before, we look it up in an array (constant-time lookup) as opposed to computing it all over again. Simple, right? Well, not exactly. For starters, this only works if you truly refer to the things you've computed.
It works great here because we always refer to all of the numbers computed (since it's always referring to lower numbers). But what about backtracking a path or with depth-first search? You may not always revisit everything there, in which case you've stored things unnecessarily. Recursion also makes things painful. Recursion really eats up stack space. If we can cache with primitives and not recurse, that's even better. With Fibonacci, it's pretty easy:

```js
const fibonacciArr = (n, cache = [0, 0, 1]) => {
  // cache[1] = 0 and cache[2] = 1 match our recursive base cases
  for (let i = 3; i <= n; i++) cache[i] = cache[i - 1] + cache[i - 2];
  return cache[n];
}
```

Brilliant! The shortest and the fastest solution yet with `O(n)` time and space! Now, to get really weird, we have to remember one thing that you probably noticed as you were counting Fibonacci: you only ever looked back two numbers when finding the next. Could our computer do that and only have to store 2 numbers as opposed to `n`? You bet!

```js
const fibonacci2 = (n) => {
  if (n === 1) return 0;
  if (n === 2) return 1;
  let first = 0, second = 1, next;
  for (let i = 3; i < n; i++) {
    next = first + second;
    first = second;
    second = next;
  }
  return first + second;
}
```

A couple more lines, but all we're doing is computing the next number by adding the two previous numbers (like we first specified in our definition of Fibonacci). To save space, we overwrite the variables and keep chugging along. The solution still runs in `O(n)` time but the space is now constant (the two recorded numbers plus the pointer to our upcoming to-be-computed number). One final thing I'll say about these last two examples: you may have noticed we traded in the full caching solution for less space. If constant space and linear time is fine with you (which it probably is) then stick with this.
However, if you plan to call a function over, and over, and over again, permanently storing a singleton of the cached values, it might be better to call something like `fibonacciCached` since you can always check if someone else has done this work before. The storage space is linear while the lookup time is constant for all but the uncharted territory. If you expect to visit previous values often, consider this alternative.

## The basics are just scratching the surface

We've now introduced the fundamental aspects of Dynamic Programming with a few examples and variations. Read over the code a few times to let it sink in (trust me, it took a _really_ long time for me to see this). We've got 2 more parts of this chapter to cover, so in the meantime, take a look at the Daily Problem and we'll see you soon!

## Onto the next Daily Problem

> Suppose you are given three strings of characters: `X`, `Y`, and `Z`, where `|X| = n`, `|Y| = m`, and `|Z| = n + m`. `Z` is said to be a _shuffle_ of `X` and `Y` iff `Z` can be formed by interleaving the characters from `X` and `Y` in a way that maintains the left-to-right ordering of the characters from each string.
>
> (a) Show that `cchocohilaptes` is a shuffle of `chocolate` and `chips`, but `chocochilatspe` is not.
>
> (b) Give an efficient dynamic-programming algorithm that determines whether `Z` is a shuffle of `X` and `Y`. Hint: the values of the dynamic programming matrix you construct should be `Boolean`, not numeric.

## More problems to practice on

Now that [homework 5](http://www3.cs.stonybrook.edu/~skiena/373/hw/hw5.pdf) is out, here are a few problems that are relevant to the dynamic programming we've discussed so far:

1. Problem 8-3.
2. Problem 8-7.

Think on this and we'll check it out in the next few days as we explore more examples of Dynamic Programming.
]]> 2018-11-01T23:30:00+00:00 Combinatorial Optimization in JavaScript https://www.adamconrad.dev/blog/combinatorial-optimization-in-js/ Tue, 30 Oct 2018 23:30:00 +0000 https://www.adamconrad.dev/blog/combinatorial-optimization-in-js/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/maximum-flow-algorithms-networks-js/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202).

By working backwards and using heuristics to _approximate_ a correct solution, you can get close enough to a correct solution that is acceptable without having to run your computers for all of eternity. Before we get into that, let's answer the previous article's Daily Problem:

## Answers to the previous Daily Problem

> Let `G = (V,E)` be a directed weighted graph such that all the weights are positive. Let `v` and `w` be two vertices in `G` and `k ≤ |V|` be an integer. Design an algorithm to find the shortest path from `v` to `w` that contains exactly `k` edges. Note that the path need not be simple.

Keep track of all `i` hops to all `i` vertices. The shortest path for `k` edges is just the shortest path for `k-1` edges plus some edge in our `i`. We can _backtrack_ recursively from the end towards the beginning, applying the lightest-weight edge as we head backwards towards our starting vertex. This is the basis for how we will approach problems going forward.

## Backtracking as a means of defining heuristics

Have you ever played Sudoku? Then you can understand backtracking! Backtracking, as we alluded to earlier, is the systematic way of going through all possible configurations of a search space, whether it is how they are arranged (_permutations_) or how they are divided (_subsets_).
At each step in backtracking, we are trying to solve a portion of the total problem. Let's say we want to get from one end of the graph to the other; point A to point Z. And further we know that there are 5 different ways to get there. What backtracking will do is figure out all of the ways to get from A to Z by adding 1 more point to the solution. It will then ask "did we get to Z?" If it did, we can print out the solution; if not, we have to see whether we hit a dead end or whether we can extend the path further. Since we're going _deeper_ into a list of possibilities leading from A to Z, it makes sense that we could implement backtracking with DFS. BFS also works, since we are trying to find all possible solutions (i.e. going wide will accomplish the goal just as going deep will), but it takes up much more space since we're guaranteed to hit all of the dead ends rather than only going deep on the ones that get us closer from A to Z. I was going to post some code that offers a way of explaining backtracking, but to be honest, after looking over the algorithm, it's much easier to understand as a recipe and then apply that to a specific problem. Therefore, let's generalize backtracking into this recipe:

1. For a given solution `A`, if it's a complete solution that gets us from A to Z, print it! If we haven't found a solution yet, move on to step 2.
2. Go one level deeper - given `A`, add one more step to your partial solution. This is like going to the next fork in the road.
3. Now go down each fork in the road and proceed to step 1 with each fork as the new `A`. So if we hit a fork in the road with 3 more streets to travel down, you're re-running backtrack on `A'1`, `A'2`, and `A'3`. If we've done all of the exploring possible or hit a dead end, proceed to step 4.
4. Now that we've explored, we need to move out of where we've traveled - either to explore another road because our previous one has terminated, or to end the algorithm because we've explored all roads.
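The recipe above is straightforward to translate into code. Here is a minimal sketch of my own (the toy graph and helper names are invented for illustration, not from the course material) that enumerates every path from `A` to `Z`:

```js
// A toy adjacency list: which points you can reach from each point.
const graph = {
  A: ['B', 'C'],
  B: ['Z'],
  C: ['B', 'Z'],
  Z: [],
};

const backtrack = (path, solutions) => {
  const current = path[path.length - 1];

  // step 1 - a complete solution that reaches Z gets recorded
  if (current === 'Z') {
    solutions.push(path.join('->'));
    return;
  }

  // steps 2 & 3 - go one level deeper down each fork we haven't visited yet
  for (const next of graph[current]) {
    if (!path.includes(next)) backtrack([...path, next], solutions);
  }

  // step 4 - nothing left to try at this fork; returning *is* the backtrack
};

const allPaths = [];
backtrack(['A'], allPaths);
console.log(allPaths); // [ 'A->B->Z', 'A->C->B->Z', 'A->C->Z' ]
```

Each recursive call extends the partial solution by one fork, records any complete solution, and falls back to the previous fork when it runs out of options.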
And that's the basic gist of backtracking with DFS. From here we'll look at a specific example that I mentioned earlier: Sudoku.

## Backtracking by example: Sudoku

You've heard of [Sudoku](https://en.wikipedia.org/wiki/Sudoku), right? You wouldn't guess it from the name, but this _French_ puzzle game is inherently combinatorial in nature. The object of the game is to take a 9×9 grid of blanks and fill it with the digits 1 to 9. The puzzle is completed when every row, column, and sector (the 3×3 subgrids corresponding to the nine sectors of a tic-tac-toe board) contains the digits 1 through 9 with no omissions or repetitions. Backtracking is an excellent way to solve this problem because the game starts partially filled in. That is, Step 1 is already filled out for us (we have a given partial solution `A`) and we need to take incremental steps to get to a completed board. To begin, we need to answer 3 questions:

1. **How do we set up this problem?** Since we have a 2D board matrix, a 2D array will suit us well in JavaScript. And what do we put in these array values? The state of the board, which we hope to fill out completely by the end of the exercise.
2. **What are the possibilities at each step to evaluate?** The set of values that haven't been taken yet in a given row, column, or sector.
3. **How do we know when to backtrack?** As soon as we are out of possibilities from Question 2 (i.e. we either have filled out the board or we have an invalid board).

With all of that in mind, we can now create a custom backtracking algorithm to solve a Sudoku board!
```js
const UNASSIGNED = 0;
const DIMENSION = 9;
const SECTOR = Math.sqrt(DIMENSION);

class SudokuSolver {
  constructor(board) {
    this.solve(board, 0, 0);
  }

  solve(board, row, col) {
    [row, col] = this.findUnassignedLocation(board, row, col);

    // step 1 - if nothing is left to assign, print the solution
    if (row === -1) {
      this.processSolution(board);
      return true;
    }

    // step 2 - DFS; go one level deeper
    for (let number = 1; number <= DIMENSION; number++) {
      if (this.isValid(board, row, col, number)) {
        board[row][col] = number;

        // step 3 - if we've hit a valid spot to explore, go in!
        if (this.solve(board, row, col)) return true;

        board[row][col] = UNASSIGNED;
      }
    }

    // step 4 - we've tried everything; time to backtrack!
    return false;
  }

  findUnassignedLocation(board, row, col) {
    while (row < DIMENSION) {
      if (board[row][col] === UNASSIGNED) return [row, col];
      if (col < DIMENSION - 1) {
        col++;
      } else {
        row++;
        col = 0;
      }
    }
    return [-1, -1];
  }

  isValid(board, row, col, number) {
    return (
      this.isValidRow(board, row, number) &&
      this.isValidColumn(board, col, number) &&
      this.isValidSector(
        board,
        Math.floor(row / SECTOR) * SECTOR,
        Math.floor(col / SECTOR) * SECTOR,
        number
      )
    );
  }

  isValidRow(board, row, number) {
    for (let col = 0; col < DIMENSION; col++) {
      if (board[row][col] === number) return false;
    }
    return true;
  }

  isValidColumn(board, col, number) {
    for (let row = 0; row < DIMENSION; row++) {
      if (board[row][col] === number) return false;
    }
    return true;
  }

  isValidSector(board, row, col, number) {
    for (let currentRow = 0; currentRow < SECTOR; currentRow++) {
      for (let currentCol = 0; currentCol < SECTOR; currentCol++) {
        if (board[row + currentRow][col + currentCol] === number) return false;
      }
    }
    return true;
  }

  processSolution(board) {
    let boardStr = '';
    for (let i = 0; i < DIMENSION; i++) {
      for (let j = 0; j < DIMENSION; j++) {
        boardStr += board[i][j];
      }
      boardStr += '\n';
    }
    console.log(boardStr);
  }
}
```
Alright, lots to unpack here! Let's go through the four steps one-by-one and see how we're achieving backtracking for our Sudoku solver:

### 1. Determine if the Sudoku board is solved

It may seem counterintuitive to check if you've solved the puzzle before you've actually gone and solved it, but we do this first in the algorithm because we don't want to go down any unnecessary roads. Remember, algorithms are as much about correctness as they are efficiency. So before anything else, we need to see if we have any more unassigned locations. That just means asking our program "are there any spots on the Sudoku board left to be filled?" If not, we process a solution (which just means printing out the items in our 2D array); otherwise we add our next step and dive in.

### 2. Loop through the numbers

Now that we have a cell, we need to see which number(s) fit into that cell. Since there are 9 numbers on a Sudoku board to work with, we iterate through all of those; and that's where the DFS part comes in.

### 3. Dive into valid configurations

Let's say we know that 3, 5, and 8 are all valid numbers that could go in our current cell. The DFS recursion basically says "let's assume we fill out our board with the 3 here; will we be able to fill out the rest of our board?" And we try to fill out that board in the universe where 3 is in this cell.

### 4. Assign or backtrack

If we reach the end state, we're done. If we reach an invalid board, then we know we have to backtrack through this whole recursive dive until we get back to where we set the 3. We then try setting the 5 and go through the whole thing again, filling out all of the remaining slots until we either reach the end state or an invalid one, repeating with all of the valid candidates until there are no more.

And that's it! Combinatorial optimization sounds like a mouthful when taken at face value.
But in reality, it's just doing a depth-first search on all possibilities until we find the right one (or ones) to solve our problem!

## Onto the next Daily Problem

Now we can apply our skills with backtracking into the Daily Problem:

> _Multisets_ are allowed to have repeated elements. A multiset of _n_ items may thus have fewer than _n!_ distinct permutations. For example, `{1, 1, 2, 2}` has only six different permutations: `{1, 1, 2, 2}`, `{1, 2, 1, 2}`, `{1, 2, 2, 1}`, `{2, 1, 1, 2}`, `{2, 1, 2, 1}`, and `{2, 2, 1, 1}`. Design and implement an efficient algorithm for constructing all permutations of a multiset.

## More problems to practice on

Implementing a heuristic for the [graph bandwidth problem](https://en.wikipedia.org/wiki/Graph_bandwidth) is the recommended way from the class to start diving into combinatorial search problems. The [Eight-Queens Problem](https://medium.freecodecamp.org/lets-backtrack-and-save-some-queens-1f9ef6af5415) is another classic combinatorial backtracking algorithm to implement if you're looking for something slightly different. Finally, the class-sponsored [Homework 4](https://www3.cs.stonybrook.edu/~skiena/373/hw/hw4.pdf) focuses purely on combinatorial search. If you have an implementation to any of these, be sure to [share it with me on Twitter](https://twitter.com/theadamconrad)!

]]> 2018-10-30T23:30:00+00:00 Maximum Flow Algorithms for Networks in JavaScript https://www.adamconrad.dev/blog/maximum-flow-algorithms-networks-js/ Tue, 23 Oct 2018 23:30:00 +0000 https://www.adamconrad.dev/blog/maximum-flow-algorithms-networks-js/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/shortest-paths/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202).
## Answers to the previous Daily Problem

> Let `G = (V, E)` be an undirected weighted graph, and let `T` be the shortest-path spanning tree rooted at a vertex `v`. Suppose now that all the edge weights in `G` are increased by a constant number `k`. Is `T` still the shortest-path spanning tree from `v`?

The answer is no, and it will make sense with a simple example. Suppose you have a triangle with two edges equal to 3 each and one edge equal to 7 (for all of you mathematicians out there, this is a contrived example, so don't message me that these edge weights couldn't possibly construct a real triangle). The shortest path between the endpoints of the 7-weighted edge is along the two 3-weighted edges (with a total path weight of 6). But if you add 10 to all 3 edges, you now have 13, 13, and 17, respectively. Now the 17-weighted edge is the shortest path. What was once the longest edge is now the shortest path, because if you went along the original shortest path, you'd travel 13 + 13 for a total path weight of 26.

## Maximum flow and bipartite matching

We mentioned in a previous article what a _bipartite graph_ is, but at a high level, it's just a way to divide a graph into two distinct sections. For example, a set of job vertices and a set of people to do those jobs (also represented as vertices). How can we connect these two sides in a way that the maximum amount of work gets done? This is called the _network flow problem_ because we want to be able to maximize the capacities from one side to the other so that it flows with maximum efficiency and capacity.

### Why is this important to you?

Maximum flow algorithms are often solved with something called _augmented paths_. Think of it like pouring coffee through a filter or a clogged drain. If one pipe is running slow or only has so much capacity, wouldn't you divert more water to the bigger pipe or the one that isn't clogged?
Or imagine you're a programmer in the city utilities division and you need to figure out how much water you can pump out to the town from a desalination plant. How do you figure out where the water can go without wasting it or fully utilizing all of the available pipes? **As a front-end developer, you might have to map this relationship on a visualization.**

### Residual graphs

We can use something called a **residual graph** to figure out this network. A residual graph is like a regular graph, but it also includes _flow_ and _capacity_. So if we look at a graph `G` and we layer on a residual graph `R`, we can now see, for every edge, how much flow can be pushed along that edge and, based on the weight, how much capacity can be applied as well. With all of this extra data, we can find the inefficiencies as well as our limits. Additionally, we can see _what direction_ we can make use of the weights of an edge. For example, if to travel from `a` to `b` along edge `(a,b)` the weight is 12, we might find out from our residual graph that from `a->b` our flow is 10 and from `b->a` our flow is only 2. Whereas another edge may reveal that regardless of direction, the price is 5. These are key points when trying to figure out how to best utilize our network. In fact, what this tells us is that the flow on our second edge is maximized, and that our primary edge `(a,b)` has some extra capacity to utilize. How much capacity? The _minimum cut_ of that edge. Let's look at that again. The weight of `(a,b)` is 12, with one cut at 10 and one cut at 2, so **the amount we can optimize by is the minimum cut** (in this case, 2).
In fact, we can generalize this augmented paths problem into the algorithm represented below:

```js
// Search state shared across methods, following the conventions
// from previous articles in this series
let PROCESSED = [];
let VISITED = [];
let PARENTS = [];

class Vertex {
  constructor(
    capacity = 0,
    flow = 0,
    neighbor = null,
    nextVertex = null,
    residualCapacity = 0
  ) {
    this.capacity = capacity;
    this.flow = flow;
    this.neighbor = neighbor;
    this.nextVertex = nextVertex;
    this.residualCapacity = residualCapacity;
  }
}

class Graph {
  // `bfs` and `addResidualEdges` carry over from previous articles

  resetSearch() {
    for (let i = 0; i < this.vertices.length; i++) {
      PROCESSED[i] = false;
      VISITED[i] = false;
      PARENTS[i] = null;
    }
  }

  findEdge(start, end) {
    let path = this.connections[start];
    while (path) {
      if (path.neighbor === end) return path;
      path = path.nextVertex;
    }
    return path;
  }

  augmentPath(start, end, parents, volume) {
    if (start === end) return;
    let edge = this.findEdge(parents[end], end);
    edge.flow += volume;
    edge.residualCapacity -= volume;
    edge = this.findEdge(end, parents[end]);
    edge.residualCapacity += volume;
    this.augmentPath(start, parents[end], parents, volume);
  }

  pathVolume(source, sink, parents) {
    if (parents[sink] === null) return 0;
    const edge = this.findEdge(parents[sink], sink);
    if (source === parents[sink]) {
      return edge.residualCapacity;
    }
    return Math.min(
      this.pathVolume(source, parents[sink], parents),
      edge.residualCapacity
    );
  }

  edmondsKarp(source, sink) {
    this.addResidualEdges();
    this.resetSearch();
    this.bfs(source);
    let volume = this.pathVolume(source, sink, PARENTS);
    while (volume > 0) {
      this.augmentPath(source, sink, PARENTS, volume);
      this.resetSearch();
      this.bfs(source);
      volume = this.pathVolume(source, sink, PARENTS);
    }
  }
}
```

In this case, the `source` is a vertex added to one section of the bipartite graph (let's say one section is `L` and the other section is `R`) and the `sink` is a vertex added to the other section of the graph. These vertices are special because they are connected to all other vertices on their side by edges of weight 1. Their job is to provide easy access to all vertices in the graph, map out all of the main paths, and find the efficiencies.
We do this via breadth-first search, looking for any path from the `source` to the `sink` that increases the total flow. The algorithm, known as Edmonds-Karp, is done when we have no more extra volume left to optimize.

> Note: for the very astute, you might recognize this as the [Ford-Fulkerson method](https://en.wikipedia.org/wiki/Ford%E2%80%93Fulkerson_algorithm). Ford-Fulkerson is not fully specified enough to be called an algorithm, so Edmonds-Karp is an _implementation_ of the Ford-Fulkerson method for maximum flow problems on networks.

Now that we've seen a bunch of algorithms for moving around a graph, here are a few things to keep in mind:

1. **Map the problem. Solve with an algorithm.** Think of these algorithms as your ace-in-the-hole. They're your ammunition for firing at problems. But you need to know what to fire at.
2. **Create the framework.** All problems can be fit into some sort of framework. Once you know the problems, the solutions are easy since they've already been given to you.
3. **Practice.** Problems are easier to recognize and slot into that framework if you see a lot of them. Make sure you hit the books and give some coding problems a try!

And with that last tip, let's get on to the Daily Problem and apply some of these recent algorithmic concepts!

## Onto the next Daily Problem

> Let `G = (V,E)` be a directed weighted graph such that all the weights are positive. Let `v` and `w` be two vertices in `G` and `k ≤ |V|` be an integer. Design an algorithm to find the shortest path from `v` to `w` that contains exactly `k` edges. Note that the path need not be simple.

## More problems to practice on

1. Problem 6-24.
2. Problem 6-25.
]]> 2018-10-23T23:30:00+00:00 How the New York Times uses React https://www.adamconrad.dev/blog/how-the-new-york-times-uses-react/ Mon, 22 Oct 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/how-the-new-york-times-uses-react/ 2018-10-22T10:28:00+00:00 Why you should start a newsletter even if you never grow a big audience https://www.adamconrad.dev/blog/why-you-should-start-a-newsletter-even-if-you-never-grow-a-big-audience/ Sat, 20 Oct 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/why-you-should-start-a-newsletter-even-if-you-never-grow-a-big-audience/ 2018-10-20T10:28:00+00:00 Shortest Path Problems in the Real World https://www.adamconrad.dev/blog/shortest-paths/ Thu, 18 Oct 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/shortest-paths/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/minimum-spanning-trees/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202).

## Answers to the previous Daily Problem

> Suppose we are given the minimum spanning tree `T` of a given graph `G` (with `n` vertices and `m` edges) and a new edge `e = (u, v)` of weight `w` that we will add to `G`. Give an efficient algorithm to find the minimum spanning tree of the graph `G + e`. Your algorithm should run in `O(n)` time to receive full credit.

Prim's or Kruskal's will suffice in solving this, but to run this in linear time we'd probably prefer Kruskal's. A graph `G + e` is no different to solve than `G`, since `G` is just a subtree of `G + e`. So why should the algorithm change? Kruskal's can be used as is, but here are the distinguishing factors to look out for:

* If `e` is a backward edge, we ignore it because it's not helping us explore more of the tree.
_But_ this is important because if we detect a cycle, we've now found the edge we can throw out (if it's not our current edge `e`) in our DFS exploration.

* If `e` is a forward edge, we also ignore it for the same reason, since forward edges have already been explored.
* If `e` is a tree edge, we have to compare it against all other edges leading to `v` to determine if we have to include it in our MST. If `e`'s weight `w` is less than that of the other incident weights, we will add `e` to our MST for `G + e`. Otherwise our MST hasn't changed from graph `G`.
* If `e` is a cross edge, it doesn't factor into the MST and we can safely ignore it.

This runs in `O(n)` time because our DFS to find the new edge only costs `O(n)` in a sparse graph, and once we're there it's just some constant-time operations to do comparisons to see if the new edge will be swapped into the MST or not.

## Applications for shortest paths

As we saw above, transportation problems (with solutions like Google Maps, Waze, and countless others) are a prime example of real-world applications for shortest path problems. There are a few others to consider as well if you aren't convinced yet.

### Motion planning

Have you ever used a [flip book](https://www.youtube.com/watch?v=5A0Ro4vj3KM) to animate a scene? How would you animate someone walking in that book? We all have an idea in our head as to what a human looks like when they walk. At each page of the flip book, you're using the path of the limbs to anticipate the next frame. **Flip book animation is like a shortest path problem.** When we flip between frames in a flip book, to get to the next one, we're having our character move in the most natural (i.e. shortest) path from one point in space to the next. If you swing your leg up, it's not going to move erratically. Instead, it will move in a smooth motion, traveling along an arc that is provided in part by the contraction of your muscles and the anatomy of your bones to allow for that motion.
### Communication networks

What's the fastest way to send an email? Where's the best place to serve a CDN of your images and JavaScript? These are both shortest path problems. The answers lie in distributed networks such as Amazon Web Services and relay networks of mail servers. This is why, for example, you are asked to choose where you want your servers to live on AWS. If you are starting a blog that caters to local businesses in Boston, it's going to be faster to serve them images and content from the `us-east` region instead of `ap-southeast`. Information travels pretty fast, but even basic spatial reasoning can convince us it will take less time to travel to our Boston customers from servers in Ohio or Virginia than from servers in Singapore or Sydney.

## How do we solve shortest path problems?

So given all of these kinds of applications, how would we go about beginning to solve them? **For unweighted graphs, BFS is sufficient.** Since all edges have equal weights, it doesn't matter how we get from A to B, just that we get there in the fewest hops, and BFS will be able to document that for us in `O(n + m)`, which is linear and very efficient. But as we saw with MSTs, unweighted graphs aren't very interesting problems. Weighted graphs are much more challenging to solve. BFS is insufficient for solving weighted graphs for shortest paths because **BFS can find _a_ short path but not the optimal shortest path.** This is because BFS finds the path with the fewest edges, but the path with the least total weight may require you to traverse many more edges. This is like when you ask Google Maps for the fastest route: it will have you taking all of these weird shortcuts even though you know there are more direct routes with fewer turns and stop signs (but probably more traffic). So how _do_ we solve the shortest path problem for weighted graphs? We're going to explore two solutions: **Dijkstra's Algorithm** and the **Floyd-Warshall Algorithm**.
### Dijkstra's Algorithm

Dijkstra's is the premier algorithm for solving shortest path problems with weighted graphs. It's also an example of **dynamic programming**, a concept that seems to freak out many a developer. Dynamic programming is another divide-and-conquer technique where we use the results of a subproblem in order to help answer the general problem we are trying to solve. That's it! Dijkstra's is a dynamic programming application because if we have a path from `s->v->e`, where `s` is the starting vertex and `e` is the ending one, we know that there is a middle vertex `v` such that there is a shortest path between `s->v`. In other words, we can step back from `e` all the way to `s` with subproblems, saying "so if we want to know the shortest path from `s->e`, can we compute the shortest path from `s->(e-1)`? If we can find that, can we compute the shortest path from `s->(e-2)`?" and so on, until we've reached the shortest path from `s` to its closest neighbor. _This_ is applying dynamic programming in the form of Dijkstra's Algorithm.
```js
// search state, shared in the style of previous articles in this series
let ADDED, DISTANCES, PARENTS;

class Graph {
  // rest of structure from previous articles

  dijkstra(startVertex) {
    ADDED = new Array(this.vertices.length).fill(false);
    DISTANCES = new Array(this.vertices.length).fill(Number.POSITIVE_INFINITY);
    PARENTS = new Array(this.vertices.length).fill(-1);
    DISTANCES[startVertex] = 0;

    let currentVertex = startVertex, currentEdge;

    while (!ADDED[currentVertex]) {
      ADDED[currentVertex] = true;
      currentEdge = this.connections[currentVertex];

      while (currentEdge) {
        let nextVertex = currentEdge.adjacencyInfo;
        let weight = currentEdge.weight;

        if (DISTANCES[nextVertex] > (DISTANCES[currentVertex] + weight)) {
          DISTANCES[nextVertex] = DISTANCES[currentVertex] + weight;
          PARENTS[nextVertex] = currentVertex;
        }

        currentEdge = currentEdge.nextVertex;
      }

      // pick the closest vertex we haven't added yet
      currentVertex = 0;
      let bestCurrentDistance = Number.POSITIVE_INFINITY;

      for (let i = 0; i < this.vertices.length; i++) {
        if (!ADDED[i] && bestCurrentDistance > DISTANCES[i]) {
          bestCurrentDistance = DISTANCES[i];
          currentVertex = i;
        }
      }
    }
  }
}
```

Does this algorithm look familiar? It should. Dijkstra's Algorithm is, in fact, _extremely_ similar to [Prim's Algorithm](/blog/minimum-spanning-trees-in-js/) from the last article. In fact, it is so similar, I only had to change 3 lines (1 of which was the name of the function). The only real difference between Prim's and Dijkstra's is in how they compare distances. In Prim's, we check to see if the next vertex's distance is greater than the current edge weight and if it has been added yet. If it hasn't, then we set the distance to the next vertex equal to that current edge weight and make the current vertex the parent of the next. In Dijkstra's, all we do differently is check to see if the next vertex's distance is greater than _the current edge weight PLUS the distance of the current vertex_. That summed value is what gets added to the distance array for the next vertex, and we add the current vertex as the parent of the next vertex as normal.
**In sum, all we are doing extra in Dijkstra's is factoring in the new edge weight and the distance from the starting vertex to the tree vertex it is adjacent to.** The implication here is that Dijkstra's not only finds the shortest path from `s` to `e`, it also finds the shortest paths from `s` to _all other vertices in the graph_. By having to inspect all neighbors at every given step, **Dijkstra's can map the shortest routes from the starting vertex to every other vertex**. Powerful stuff, but at a cost of `O(n^2)`, since every vertex is compared to every other vertex.

### Floyd-Warshall Algorithm

With most of these graph problems so far, our examples lead us to pick vertices on the outer ends of the graph, much like how we start with the root node of a tree. But what if you wanted to start in the middle? What if you wanted to know the most centrally-located vertex in a graph? In fact, the first example I could think of is Sim City. In Sim City, the "goal" (which I put in quotes because the game is open-ended and has no real objective ending) is to create a vibrant, happy city of people, or "sims." It's essentially one condensed simulation in urban planning. You have to provide people power for their homes, roads for them to travel to work (and places to work), and all of the amenities a local municipality needs like schools, police stations, and parks. But where do you place all of this stuff to make people happy? Many of the buildings, like police stations, can only operate in a certain radius to effectively stop crime before it's too late. Logically, if you put a police station on the edge of town and someone commits a crime on the other end, it's going to take more time for the police cars to arrive on the scene than if it were centrally located. And since cities in Sim City can be quite large, it's not sufficient to just place one police station in the very middle of the map and hope for the best. You'll need several stations to cover the entire map.
And your map, like the real world, is not simply a square grid of flat grass and plains. Natural features like rivers, oceans, and mountains can complicate how a station can effectively police an area. Lucky for you, there is an algorithm called **Floyd-Warshall** that can objectively find the best spot to place your buildings by finding the _all-pairs shortest path_. In other words, for every vertex we can start from, we find the shortest path across the graph and see how long it takes to get to every other vertex. Each time we start over, we keep score of the total moves required for each vertex. The shortest average distance will come from that central vertex, which we can calculate with an adjacency matrix. And since we are now adding another layer of checking every vertex on top of what is essentially Dijkstra's, this algorithm runs in `O(n^3)` time. Floyd-Warshall doesn't actually produce a singular return value of the optimal location. Instead, it returns the distance matrix with all of the optimal paths mapped out, which is often sufficient for most problems of this scope. Even though cubic time may seem slow, the fact is this algorithm runs fast in practice, in part because it utilizes an adjacency matrix to handle the mapping of all of its distance values (one of the rare instances that we [originally mentioned](/blog/data-structures-for-graphs/) where an adjacency matrix is a better data structure than an adjacency list).
It also helps that the algorithm is simple to implement, too:

```js
class AdjacencyMatrix {
  constructor(size) {
    // Array.from gives each row its own inner array; a single
    // new Array(size).fill(innerArray) would alias every row
    // to the same object
    this.weights = Array.from({ length: size }, () =>
      new Array(size).fill(Number.POSITIVE_INFINITY)
    );
    this.vertices = size;
  }

  floydWarshall() {
    let distance;
    for (let k = 0; k < this.vertices; k++) {
      for (let i = 0; i < this.vertices; i++) {
        for (let j = 0; j < this.vertices; j++) {
          distance = this.weights[i][k] + this.weights[k][j];
          if (distance < this.weights[i][j]) {
            this.weights[i][j] = distance;
          }
        }
      }
    }
    return this;
  }
}
```

You can see from the triple-nested `for` loops very clearly that this is indeed an `O(n^3)` worst-case algorithm. This cost is acceptable for finding the all-pairs shortest path, and the algorithm is also a good candidate for solving what are called _transitive closure_ problems. This is best explained with an example.

Who has the most power in a friend group? If mapped on a graph, you might think it's the center of the friend group, because he/she has the most immediate friends (i.e. the most direct connections to other people, or the vertex with the highest _degree_). In fact, it is the person _who has the farthest reach into the entire graph_. The President of the United States is the most powerful person in the world not because he has the most friends, but because he has _the largest, most powerful network at his disposal_. He may not have everyone in his phone, but the people in his phone can eventually connect him to virtually anyone.

In fact, there's a popular phenomenon around this very concept of transitive closures called [Six Degrees of Kevin Bacon](https://en.wikipedia.org/wiki/Six_Degrees_of_Kevin_Bacon). It asserts that Kevin Bacon is the most powerful celebrity because "he had worked with everybody in Hollywood or someone who’s worked with them."
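Coming back to the police-station question, once `floydWarshall()` has filled in the distance matrix, picking the most central vertex is a single scan over its rows. A minimal sketch (the helper name `mostCentralVertex` and the standalone-matrix shape are mine, not from the original post):

```js
// Given an all-pairs distance matrix (as produced by Floyd-Warshall),
// the most centrally-located vertex is the one with the smallest total
// distance to every other vertex.
function mostCentralVertex(dist) {
  let best = -1;
  let bestTotal = Number.POSITIVE_INFINITY;
  for (let i = 0; i < dist.length; i++) {
    // sum up row i: the cost of reaching everyone from vertex i
    const total = dist[i].reduce((sum, d) => sum + d, 0);
    if (total < bestTotal) {
      bestTotal = total;
      best = i;
    }
  }
  return best;
}

// A tiny 3-vertex line graph: vertex 1 sits between 0 and 2
const dist = [
  [0, 1, 2],
  [1, 0, 1],
  [2, 1, 0],
];
console.log(mostCentralVertex(dist)); // 1
```

The same scan works for the police-station placement: run it, place a station at the winning vertex, and repeat on the uncovered region.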
To prove this statement true once and for all, you could plot every Hollywood celebrity on an adjacency matrix and map their relationships with each other as edges with weights for the strength of the relationship. If Kevin Bacon has the all-pairs shortest path to every other celebrity in Hollywood then this Wikipedia entry is not just a parlor game, but a true account! ## Onto the next Daily Problem > Let `G = (V, E)` be an undirected weighted graph, and let `T` be the shortest-path spanning tree rooted at a vertex `v`. Suppose now that all the edge weights in `G` are increased by a constant number `k`. Is `T` still the shortest-path spanning tree from `v`? ## More problems to practice on 1. Problem 6-15. 2. Problem 6-17. ]]> 2018-10-18T10:28:00+00:00 Minimum Spanning Trees in JavaScript https://www.adamconrad.dev/blog/minimum-spanning-trees-in-js/ Tue, 16 Oct 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/minimum-spanning-trees-in-js/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/dfs-and-topological-sort/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202). ## Answers to the previous Daily Problem > Your job is to arrange `n` ill-behaved children in a straight line, facing front. You are given a list of `m` statements of the form _“`i` hates `j`”_. If `i` hates `j`, then you do not want put `i` somewhere behind `j`, because then `i` is capable of throwing something at `j`. We need to order kids in a line such that they don't rip each other to shreds. Seems reasonable. > (a) Give an algorithm that orders the line, (or says that it is not possible) in `O(m + n)` time. 
So what we want is a valid list out of a graph relationship where all of the students are facing forward (hint, hint: a directed acyclic graph). Solving a DAG to produce a valid list? This sounds exactly like what a topological sort would provide. Recall that topological sort on top of DFS just requires an extra stack, and that its runtime is `O(V+E)`. Here the `n` children are the vertices and the `m` "hates" statements are the edges between any two children, so we can indeed say that our DFS topological sort will run in `O(m + n)` time.

> (b) Suppose instead you want to arrange the children in rows such that if `i` hates `j`, then `i` must be in a lower numbered row than `j`. Give an efficient algorithm to find the minimum number of rows needed, if it is possible.

Since we're dealing with rows (or going across), this lends itself to being solved with BFS instead of DFS. Track the depth as you traverse the tree; once you've visited the last node, you know the maximum height of the tree, which is the minimum number of rows needed.

## Definition of a minimum spanning tree

A _spanning tree_ for a graph is the set of edges that connect to all vertices in the graph. In other words, **it's the edges that make the graph fully connected**. So you might think the minimum spanning tree is the minimum set of edges that connect a graph completely. In actuality, **the minimum spanning tree (MST) is the set of edges that connect all vertices in a graph with the smallest weight possible**.

When are MSTs useful? **Any time you need to find the minimum-weighted path on a route**, such as the shortest path on a road trip (we'll learn more about shortest path problems in the next article).

### Variants

There are a few variations of spanning trees that are similar to MSTs that are also worth noting:

1.
**Maximum spanning trees.** As you can imagine, this looks for the path with the heaviest edges to connect the graph instead of the lightest (I'd imagine this would be great for something like Settlers of Catan). You'll see Prim's algorithm ahead for MSTs; to handle maximum spanning trees, just negate the weights to find the heaviest/longest path.
2. **Minimum product spanning trees.** Instead of finding the edge weights with the lowest sum, you want to find the lowest product. To do this, you just add up the logarithms of the weights instead of the weights themselves. Why would we do this? Likely for alternative routes that have lower absolute values, sort of like how you can choose the shortest route in time as opposed to the shortest route in distance.
3. **Minimum bottleneck spanning trees.** This is just an extra condition on MSTs ensuring that the maximum edge weight is minimized. We can take care of this with Kruskal's algorithm a bit later.
4. **Steiner trees.** A minimum spanning tree with [intermediate midpoints](https://en.wikipedia.org/wiki/Steiner_tree_problem). Wire routing and circuit design deal with this, so if you do any hardware engineering, you might encounter a Steiner tree.
5. **Low-degree spanning trees.** If you're visiting all vertices using a [Hamiltonian path](https://en.wikipedia.org/wiki/Hamiltonian_path) you might construct a low-degree spanning tree, but you can't solve it with the algorithms we're about to show you. The low-degree part just ensures that we aren't visiting hub nodes with a lot of outbound routes, like when you reach a rotary with a lot of exits.

### Prim's Algorithm for MSTs

Prim's algorithm is one way to find a minimum spanning tree by starting with a vertex and progressively choosing the best neighboring vertex without regard to the entire structure of the graph.
When you do something like choosing the minimum edge weight for a local set of edges without regard to finding the absolute minimum for the whole structure, that is called a _greedy algorithm_.

```js
class Graph {
  // rest of structure from previous articles
  let ADDED, DISTANCES, PARENTS;

  prim(startVertex) {
    ADDED = new Array(this.vertices.length).fill(false);
    DISTANCES = new Array(this.vertices.length).fill(Number.POSITIVE_INFINITY);
    PARENTS = new Array(this.vertices.length).fill(-1);
    DISTANCES[startVertex] = 0;
    let currentVertex = startVertex, currentEdge;

    while (!ADDED[currentVertex]) {
      ADDED[currentVertex] = true;
      currentEdge = this.connections[currentVertex];

      while (currentEdge) {
        let nextVertex = currentEdge.adjacencyInfo;
        let weight = currentEdge.weight;
        if (DISTANCES[nextVertex] > weight && !ADDED[nextVertex]) {
          DISTANCES[nextVertex] = weight;
          PARENTS[nextVertex] = currentVertex;
        }
        currentEdge = currentEdge.nextVertex;
      }

      // scan for the closest vertex not yet added to the tree
      currentVertex = 0;
      let bestCurrentDistance = Number.POSITIVE_INFINITY;
      for (let i = 0; i < this.vertices.length; i++) {
        if (!ADDED[i] && bestCurrentDistance > DISTANCES[i]) {
          bestCurrentDistance = DISTANCES[i];
          currentVertex = i;
        }
      }
    }
  }
}
```

Using our Big O notation tricks for calculating the runtime, we can see that the outer loop runs `n` times (once per vertex), and each pass scans all `n` vertices to find the closest one not yet in the tree, which leaves our runtime for Prim's at `O(n^2)`. Using a priority queue for that scan can drop this time down even further to `O(m + n log n)` (note: can you figure out why?).

### Kruskal's Algorithm for MSTs

An alternative greedy algorithm to Prim's is called Kruskal's Algorithm, which simply doesn't have a starting vertex. It instead builds up connections via components of vertices and, for sparse graphs, can run faster than Prim's, provided it uses the right data structure.
```js
class Graph {
  let EDGE_PAIRS = new Array(MAX_VERTICES);

  kruskal() {
    let set = new SetUnion(this.vertices);
    this.toEdgeArray(); // populates EDGE_PAIRS with {startVertex, endVertex, weight} records

    // sort the edges by weight rather than a standard array value
    quicksort(EDGE_PAIRS, 0, EDGE_PAIRS.length - 1);

    for (let i = 0; i < EDGE_PAIRS.length; i++) {
      if (!set.areSame(EDGE_PAIRS[i].startVertex, EDGE_PAIRS[i].endVertex)) {
        console.log(`edge ${EDGE_PAIRS[i].startVertex},${EDGE_PAIRS[i].endVertex} in MST`);
        set.merge(EDGE_PAIRS[i].startVertex, EDGE_PAIRS[i].endVertex);
      }
    }
  }
}
```

To summarize the two:

1. _Are you trying to find an MST in a sparse graph?_ **Use Kruskal's Algorithm.**
2. _Are you trying to find an MST in a dense graph?_ **Use Prim's Algorithm.**

As you can see, fast searching in a sparse graph requires something called the union-find data structure, which we will discuss next.

### Union Finding

When we're dealing with MSTs, we're thinking about navigating the graph in the shortest/quickest path possible. But to find the best way around our graph, we can't just look at our immediate neighbors to determine the best way forward. It's often helpful to cluster whole "neighborhoods" of nodes together to form a better picture of where we should move next.

As we saw with Kruskal's Algorithm, one way to do that is by partitioning the graph into distinct sets and sorting against that. To make that happen efficiently, we utilize a `SetUnion` data structure, which we will outline below. Set unions have two primary functions:

1. `areSame()` to determine if two vertices are in the same partition
2. `merge()` to merge two partitions together

Whereas traditional nodes in a tree focus on moving down the tree and keeping track of children, set union nodes focus on moving _up_ by keeping a close watch on their parents.
So if you have a graph of 3 groups like so:

```
0    3    6
|   / \
1  2   4
        \
         5
```

You could represent the array of parents as `[0, 0, 3, 3, 3, 4, 6]`. A more complete example in JavaScript follows:

```js
class SetUnion {
  constructor(length) {
    this.length = length;
    this.parents = [...Array(length).keys()];
    this.elements = Array(length).fill(1);
  }

  find(element) {
    // a root is its own parent
    if (this.parents[element] === element) return element;
    return this.find(this.parents[element]);
  }

  areSame(set1, set2) {
    return this.find(set1) === this.find(set2);
  }

  merge(set1, set2) {
    const root1 = this.find(set1);
    const root2 = this.find(set2);
    if (root1 === root2) return; // already in the same set

    // union by size: hang the smaller tree under the larger one
    if (this.elements[root1] >= this.elements[root2]) {
      this.elements[root1] += this.elements[root2];
      this.parents[root2] = root1;
    } else {
      this.elements[root2] += this.elements[root1];
      this.parents[root1] = root2;
    }
  }
}
```

As you can see from above, `areSame()` and `merge()` rely on `find()` to recursively climb the relevant subtree to the matching root. Because we always merge the smaller tree into the larger one, the trees stay shallow: we know from [earlier](/blog/heapsort-priority-queues-in-js/) that this kind of repeated halving keeps depths to `O(log n)`, so we can conclude that in the worst case any set union operation takes logarithmic time, which is a great guarantee for Kruskal's Algorithm.

Now that we've explored a few algorithms for route finding, let's put some of this knowledge into practice with the Daily Problem.

## Onto the next Daily Problem

> Suppose we are given the minimum spanning tree `T` of a given graph `G` (with `n` vertices and `m` edges) and a new edge `e = (u, v)` of weight `w` that we will add to `G`. Give an efficient algorithm to find the minimum spanning tree of the graph `G + e`. Your algorithm should run in `O(n)` time to receive full credit.

## More problems to practice on

To get even more practice with graph searching and path finding, here are the other homework problems to go along with this article:

1.
Implement an algorithm to print out the connected components in an undirected graph.
2. Problem 6-4.
3. Problem 6-5.
4. Problem 6-9.

]]> 2018-10-16T10:28:00+00:00 Depth-first Search and Topological Sort in JavaScript https://www.adamconrad.dev/blog/dfs-and-topological-sort/ Thu, 11 Oct 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/dfs-and-topological-sort/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/bfs/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202).

## Answers to the previous Daily Problem

> Prove that in a breadth-first search on an undirected graph `G`, every edge is either a tree edge or a cross edge, where `x` is neither an ancestor nor descendant of `y`, in cross edge `(x,y)`.

It helps to provide definitions for the terms _tree edge_ and _cross edge_. If you are traveling from `(x,y)` and it's your first time visiting `y`, we say that this is a **tree edge**. As the problem mentions above, a **cross edge** is an edge `(x,y)` where `x` is neither the ancestor nor the descendant of `y`. It also helps to define what a _forward edge_ and a _backward edge_ are to give a complete picture. All edges adhere to one of these four types. **Tree edges** only apply to newly-visited vertices; the other three apply to previously-visited vertices. If the vertex has already been visited, we run through this simple ruleset:

1. If the `y` in this `(x,y)` relationship is an ancestor, it is a **backward edge**.
2. If the `y` is a descendant, it is a **forward edge**.
3. If `y` is neither, that's the **cross edge**.

So let's go through all four and eliminate by contradiction.

Assume `G` has a backward edge. If a backward edge exists, there is a cycle.
If we have a cycle, we have to terminate the search once each edge has been processed at most once (otherwise we would loop infinitely on our cycle, bouncing back and forth between the two vertices involved). That means that in this case, the only other possible edges are tree edges.

Assume `G` has a forward edge. If a forward edge exists, then that edge would already have been traversed as `(y,x)` before we reach `x` and try to visit `y`. Since you cannot visit the descendant before you visit the ancestor, no forward edges can exist in an undirected graph.

Assume `G` has a cross edge. If a cross edge exists, we're really saying that there are connections between siblings in the tree. BFS operates by going across each layer in the height of the tree. Even though those cross edges aren't formally defined in terms of real connections, they manifest in how our output array of BFS exploration looks.

By the process of elimination, the only edge types left are tree edges, which works just fine. Since BFS is a graph-walking algorithm, the whole point is to visit every node, and the edges used to reach each node for the first time are tree edges. Since we don't need to visit anything a second time via our queue, our proof is satisfied.

Now that we understand a bit more about ancestors and descendants, this should make our implementation of Depth-First Search a bit clearer (Hint: DFS only has tree edges and back edges).

## Depth-First Search: Like BFS but with a stack

As I mentioned earlier, the real difference between how BFS and DFS walk a graph is in the data structure they use to make that walk. **BFS uses a queue and DFS uses a stack.** This means that there is some backtracking involved; the algorithm will go as far down the children as it can before it turns back to dive down the next stack of successive child nodes.
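That queue-versus-stack swap really is the entire difference in code. A minimal sketch (using plain arrays of neighbor indices rather than the series' `Graph` class; the `walk` helper and its layout are my own, not from the original post):

```js
// A simplified traversal over an adjacency list (arrays of neighbor indices).
// `useStack = true` gives depth-first behavior; `false` gives breadth-first.
function walk(adjacency, start, useStack) {
  const visited = new Array(adjacency.length).fill(false);
  const pending = [start];
  const order = [];
  visited[start] = true;
  while (pending.length > 0) {
    // pop() treats `pending` as a stack (DFS); shift() as a queue (BFS)
    const current = useStack ? pending.pop() : pending.shift();
    order.push(current);
    for (const next of adjacency[current]) {
      if (!visited[next]) {
        visited[next] = true;
        pending.push(next);
      }
    }
  }
  return order;
}

// The article's example tree encoded by index: 0=8, 1=6, 2=10, 3=4, 4=5, 5=12
const tree = [[1, 2], [3, 4], [5], [], [], []];
console.log(walk(tree, 0, false)); // BFS: [0, 1, 2, 3, 4, 5]
console.log(walk(tree, 0, true));  // DFS: [0, 2, 5, 1, 4, 3]
```

Note that the stack version dives down the last-pushed child first, so siblings come out right-to-left; the backtracking happens implicitly each time a branch runs out of unvisited neighbors and the next `pop()` jumps back up the tree.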
**This backtracking is what makes DFS so powerful.** When remembering depth-first search, just remember that **we're going to visit nodes in a top-to-bottom fashion.** So a tree that looks like this:

```
    8
   / \
  6   10
 / \    \
4   5    12
```

Will read the tree with DFS in this order: `[8, 6, 4, 5, 10, 12]`. The generic algorithm for all graphs (not just trees) using DFS looks like this:

```js
class Graph {
  // ... see last article for our implementation
  let count = 0;
  let finished = false;
  const ENTRY_TIMES = new Array(MAX_VERTICES);
  const EXIT_TIMES = new Array(MAX_VERTICES);

  dfs(currentVertex) {
    if (finished) return;
    let nextVertex, tempVertex;

    VISITED[currentVertex] = true;
    count += 1;
    ENTRY_TIMES[currentVertex] = count;
    console.log("PRE-PROCESSED!");

    tempVertex = this.connections[currentVertex];
    while (tempVertex) {
      nextVertex = tempVertex.adjacencyInfo;
      if (!VISITED[nextVertex]) {
        PARENTS[nextVertex] = currentVertex;
        console.log(`PROCESSED EDGE ${currentVertex}=>${nextVertex}`);
        this.dfs(nextVertex);
      } else if (
        (!PROCESSED[nextVertex] && PARENTS[currentVertex] !== nextVertex) ||
        this.isDirected
      ) {
        console.log(`PROCESSED EDGE ${currentVertex}=>${nextVertex}`);
        if (finished) return;
      }
      // advance to the next edge no matter which branch ran,
      // or we would loop forever on the same edge
      tempVertex = tempVertex.nextVertex;
    }

    console.log("POST-PROCESSED");
    count += 1;
    EXIT_TIMES[currentVertex] = count;
    PROCESSED[currentVertex] = true;
  }
}
```

### Why use depth-first search?

So now we know the _what_ and the _how_ of DFS, but _why_ should we care to use it? Here are a couple of reasons:

* **Cycle detection.** As we mentioned in the previous Daily Problem, cycles can occur with back edges. Back edges are very easy to detect in DFS because backtracking is built into the algorithm. How long will this take? Only `O(n)`: a tree on `n` vertices has only `n - 1` tree edges, so after exploring at most `n - 1` edges we must hit the back edge that heads up to an ancestor and thus closes the cycle. This reduces down to the worst-case `O(n)` to find the cycle in the graph.
* **Dead ends and cutoffs.** Any vertex whose removal disconnects the graph is called an **articulation vertex**. DFS can find these in linear time (because of the ability to look back on a parent node to see if connectivity still exists) while BFS can only do this in quadratic time.

### DFS for directed graphs: Topological sort

When graphs are directed, we now have the possibility of all four edge types to consider. Each of these four cases helps us learn more about what our graph may be doing. Recall that if no back edges exist, we have an acyclic graph. Also recall that directed acyclic graphs (DAGs) possess some interesting properties. The most important of them is that for a certain configuration, you can represent a DAG as a list of nodes in a linear order (like you would a linked list). Such a configuration (of which more than one can exist for certain DAGs) is called a **topological sort**. Topological sorting is important because it proves that you can process any vertex before its successors. Put another way, **we can find efficient shortest paths because only relevant vertices are involved in the topological sorting**.

```js
class Graph {
  topologicalSort() {
    for (let i = 0; i < this.vertices; i++) {
      if (!VISITED[i]) {
        // reading the POST-PROCESSED logs from dfs() in reverse
        // (i.e. by decreasing exit time) yields a topological order
        this.dfs(i);
      }
    }
  }
}
```

## Onto the next Daily Problem

Now that we've covered the basics of graph searching, be sure to study this and the previous article to compare and contrast why these implementations work and how they're useful in different situations. Given your knowledge of both now, this next Daily Problem should give you a good idea of how to solve each part:

> Your job is to arrange `n` ill-behaved children in a straight line, facing front. You are given a list of `m` statements of the form _“`i` hates `j`”_.
If `i` hates `j`, then you do not want put `i` somewhere behind `j`, because then `i` is capable of throwing something at `j`.
>
> (a) Give an algorithm that orders the line, (or says that it is not possible) in `O(m + n)` time.
>
> (b) Suppose instead you want to arrange the children in rows such that if `i` hates `j`, then `i` must be in a lower numbered row than `j`. Give an efficient algorithm to find the minimum number of rows needed, if it is possible.

## More practice problems

To wrap up this chapter, here are the other homework problems to go along with these articles:

1. Problem 5-12.
2. Problem 5-13.
3. Problem 5-14.
4. Problem 5-19.
5. Problem 5-25.

Think you've got the answers? Let's see how you do in the next article! ]]> 2018-10-11T10:28:00+00:00 An implementation of Breadth-First Search in JavaScript https://www.adamconrad.dev/blog/bfs/ Thu, 04 Oct 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/bfs/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/data-structures-for-graphs/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202).

## Answers to the previous Daily Problem

> Present correct and efficient algorithms to convert an undirected graph `G` between the following graph data structures. You must give the time complexity of each algorithm, assuming `n` vertices and `m` edges.

Basically we're being asked to take a graph and morph it between the data structures we learned about in the last article.

> (a) Convert from an adjacency matrix to adjacency lists.

In this case we're going from the space-heavy to the space-efficient, and mapping the relationships. To do so, we iterate through every row in our 2D array, which represents each node (read: vertex) in our list.
Then all we have to do is, for each node, add the `next` nodes for each pointer that is activated (read: the `1`s in the matrix). That's an `O(n^2)` algorithm, since we're doing a constant amount of work for each assignment and we have to compare the `n` columns against the `n` rows.

> (b) Convert from an adjacency list to an incidence matrix. An incidence matrix `M` has a row for each vertex and a column for each edge, such that `M[i,j] = 1` if vertex `i` is part of edge `j`, otherwise `M[i,j] = 0`.

This one kind of gives it away by saying "for each" twice: we walk down the vertices and, for each one, mark the column of every edge it touches, giving us an `O(n+m)` algorithm.

> (c) Convert from an incidence matrix to adjacency lists.

This one is actually interesting. We know we need to iterate across all `m` edges, so that's `O(m)` immediately. The difference is that for each column (edge) in an incidence matrix, we need to scan the `n` vertices to find the two `1`s that mark the two ends of that edge, so we make 2 operations across the vertices for each edge. So this conversion is actually `O(2nm)`, which reduces down to `O(nm)` because, as we learned back with our article on [Big O notation](/blog/big-o-notation/), the constants on any variable are dropped.

## Graph traversal definitions

Before we dive into the various traversal algorithms, we need to introduce how we track our traversal. This is the same across all algorithms, but we want to define these states since we'll be using them quite a bit:

* **Unvisited:** You haven't visited this vertex in the graph yet.
* **Visited:** You _have_ visited this vertex.
* **Processed:** You not only visited this vertex, but you've travelled down all of its edges, too.

## Breadth-First Search

So now that we've described some definitions we'll use, let's look at our first graph traversal algorithm: **breadth-first search** (BFS for short).
Later we'll look at **depth-first search**, so to head off the confusion now, I want you to think about how you describe something by its _breadth_ versus its _depth_. How do you physically describe it in space?

_Breadth_ you probably think of in a **left-right relationship**, like extending your arms out to show something broad and wide. The data structure that makes us think of shuffling left-to-right? **A queue.**

_Depth_ you probably think of in an **up-down relationship**, like the _depths_ of the sea. The data structure that makes us think of diving down and coming back up? **A stack.**

So when remembering breadth-first search, just remember that **we're going to visit nodes in a left-to-right fashion.** So a tree that looks like this:

```
    8
   / \
  6   10
 / \    \
4   5    12
```

Will read the tree with BFS in this order: `[8, 6, 10, 4, 5, 12]`. The generic algorithm for all graphs (not just trees) using BFS looks like this:

```js
class Graph {
  // ... see last article for our implementation
  const PROCESSED = new Array(MAX_VERTICES);
  const VISITED = new Array(MAX_VERTICES);
  const PARENTS = new Array(MAX_VERTICES);

  bfs(startVertex) {
    let visitQueue = new Queue();
    let currentVertex, nextVertex, tempVertex;

    visitQueue.enqueue(startVertex);
    VISITED[startVertex] = true;

    while (visitQueue.length > 0) {
      currentVertex = visitQueue.dequeue();
      console.log("PRE-PROCESSED!");
      PROCESSED[currentVertex] = true;
      tempVertex = this.connections[currentVertex];

      while (tempVertex) {
        nextVertex = tempVertex.adjacencyInfo;
        if (!PROCESSED[nextVertex] || this.isDirected) {
          console.log(`PROCESSED EDGE ${currentVertex}=>${nextVertex}`);
        }
        if (!VISITED[nextVertex]) {
          visitQueue.enqueue(nextVertex);
          VISITED[nextVertex] = true;
          PARENTS[nextVertex] = currentVertex;
        }
        tempVertex = tempVertex.nextVertex;
      }

      console.log("POST-PROCESSED");
    }
  }
}
```

The runtime here is `O(n + m)`. Why is that? When using an adjacency list, we're going to explore all of the edges for each vertex. Every vertex will get processed once in the queue.
Every edge is going to get walked on both coming and going. That makes the total number of operations `n + 2m`, so when we reduce that down for worst-case runtime, that leaves us with `O(n + m)`.

### Why use breadth-first search?

So now we know the _what_ and the _how_ of BFS, but _why_ should we care to use it? Here are a couple of reasons:

* **Visualize clustered relationships.** A _cluster_ is just a subset of the graph that is connected. Remember, not all vertices need to be connected to each other. Since BFS doesn't care for an absolute root, it will pick any vertex, find all of its connections, and then keep discovering all of the explicitly-defined vertices that haven't already been discovered via connection. BFS is great for mapping people and neighbors/friends across the world. There's going to be a cluster in Australia. There's going to be a cluster in Japan. Each of these islands creates a physical cluster of relationships. They're all part of the total graph of people in the world, but they aren't all necessarily connected. The best part is, the extra bookkeeping to label connected components only adds constant work per vertex, so a connected clustering algorithm still runs in `O(n + m)`.
* **The Two-Color problem.** This is one of those specific graph theory math problems that's great for BFS. Each vertex gets a label (in this case, a color). The edges of said vertex cannot lead to another vertex of the same color. So `R->G->R` passes, but `R->R->B` fails. A valid graph is known as a _bipartite_ graph; _bi-_ meaning two and _partite_ meaning to divide. There are lots of real-world examples in [biology and medicine](https://academic.oup.com/gigascience/article/7/4/giy014/4875933). Have you ever played Sudoku? A Sudoku [solver](http://www.cs.kent.edu/~dragan/ST-Spring2016/SudokuGC.pdf) is just an application of graph coloring!
In fact, **any sort of tree is a bipartite graph**: a tree has no cycles at all (and in particular no odd-length cycles), so you can always two-color it by alternating colors level by level.

## The next Daily Problem

So there are a few clear use cases for BFS. We'll find quite a few more applications for DFS in the next article, but for now, let's think on what we've learned from BFS with a Daily Problem:

> Prove that in a breadth-first search on an undirected graph `G`, every edge is either a tree edge or a cross edge, where `x` is neither an ancestor nor descendant of `y`, in cross edge `(x,y)`.

## More problems to practice on

Now that [homework 3](http://www3.cs.stonybrook.edu/~skiena/373/hw/hw3.pdf) is out, here are a few problems that are relevant to the traversals we've discussed so far:

1. Problem 5-4.
2. Problem 5-7.

That's all for today, stay tuned for depth-first search in the next article! ]]> 2018-10-04T10:28:00+00:00 How JavaScript Engines Work https://www.adamconrad.dev/blog/how-javascript-engines-work/ Wed, 03 Oct 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/how-javascript-engines-work/ 2018-10-03T10:28:00+00:00 Data structures for graphs in JavaScript https://www.adamconrad.dev/blog/data-structures-for-graphs/ Tue, 02 Oct 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/data-structures-for-graphs/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/algorithm-behind-js-array-sort/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202).

Graphs are the grand unifier of computers, both literally and figuratively. Graphs are the most interesting problems in computer science because they reflect the real world: networks, communication, and organization all operate in the abstract as a graph.
In fact, your computer is reading this article as part of the grand graph known as the internet. Graphs are the analogy for connectivity, so when we study graphs, **we are studying how the people of the world interact with each other.** As we mentioned in a [previous article about designing algorithms](/blog/how-to-design-an-algorithm/), when we describe graphing problems, we often use words like _network_, _circuit_, _web_, and _relationship_. When describing problems with properties of those words, the solution is often an algorithm of navigating a graph. More important than knowing solutions to graph problems is **recognizing the kinds of graphs, their data structures, and the kinds of problems they run into**. ## Kinds of graphs There are a _lot_ of different kinds of graphs. **A graph is just a collection of nodes (or vertices) and the edges that connect them.** Mathematics succinctly defines a graph as `G = (V,E)`, so just like we use [Big O notation](/blog/big-o-notation/) to describe runtimes, so too will we use this graphing notation. But what kinds of graphs will we be dealing with? * **By connectivity.** _Connected graphs_ have all vertices connected to each other. _Disconnected graphs_ have some vertices that have no edges connecting them. Bittorrent networks are connected graphs since each vertex offers a portion of the files they share. Bittorrent networks can also be disconnected graphs if someone has a file that isn't being shared (or desired) by any other computer on the network. * **By direction.** _Directed graphs_ will have edges that force you to move from one vertex to the other in a certain direction. _Undirected graphs_ let you freely travel on an edge in either direction. Two-way streets are undirected. Arteries (vessels bringing blood away from the heart) and veins (vessels bringing blood back to the heart) are directed. * **By weight.** _Weighted graphs_ put a price on the edges you travel between vertices. 
_Unweighted graphs_ have every edge cost the same. It costs more time (and money) to travel from Boston to Seattle than it does to travel from San Francisco to Los Angeles. It never costs anything to walk from your home to the post office and back, and that distance is always the same.
* **By simplicity.** _Simple graphs_ have at most one edge between any pair of nodes. _Non-simple graphs_ can connect nodes with multiple edges, _or even to themselves_. Tunnels, like subway systems, are simple graphs because each tunnel edge is meant to get from one place to the other. Roads, on the other hand, are non-simple because there are so many ways to get from one building to the next.
* **By density.** _Dense graphs_ have a lot of edges connecting vertices to each other. _Sparse graphs_ have only a few edges connecting vertices. Unlike the other examples, this distinction is kind of vague, but if you can describe the connections in a _quadratic_ order (i.e. every node connected to every other) it's dense. If you can describe the connections in a _linear_ order (i.e. every node connects to just a handful of neighbors) it's sparse. _Most problems deal with sparse graphs_. Roads are sparse graphs since each intersection is usually only the crossing of two streets. Broadcast networks are dense since a broadcast has to reach all of its subscribers.
* **By cycles.** _Cyclic graphs_ will repeat back to one origin vertex at some point. Conversely, _acyclic graphs_ do not. Tree data structures can be described as _connected, undirected, acyclic graphs_. Adding direction to those graphs (making them _directed acyclic graphs_, or _DAGs_ for short) gives them a _topology_ (i.e. relation or arrangement) between each other. Scheduling problems where _A_ must come before _B_ are represented as DAGs and can be reasoned about with topological sorting algorithms, which we'll learn about a bit later.
* **By topology.** _Topological graphs_ have vertices that are related by how their edges are portrayed.
For example, if points on a map are shaded and arranged by their distance and height like on a real topography map, that creates a topological graph. Otherwise, _embedded graphs_ represent projections of how real graphs are connected. Any drawing of a graph is an example of an embedded graph. * **By implication.** _Implicit graphs_ have some murky details. They don't fully describe all of the connections in a graph. In fact the traversal of such a graph is how we may uncover a graph to make it explicit. _Explicit graphs_ on the other hand clearly outline all of the vertices and all of the edges from the beginning. If you've ever played Super Metroid, the [map of planet Zebes](http://www.snesmaps.com/maps/SuperMetroid/SuperMetroidMapZebes.html) starts as an implicit graph. The connection between zones is not known until you explore them. Only when you've played the whole game can you create an explicit graph of the game map. * **By labeling.** _Labeled graphs_ give names or keys to each vertex, while _unlabeled graphs_ do not make that distinction. The fact that we can name a path between Boston and Seattle means that the graph of roads between cities (the cities being the label on each vertex) is a labeled graph. ## Data structures for graphs Now that we know the kinds of graphs, how do we represent them in the memory of a computer? A few such data structures arise: * **Adjacency matrix**. If there are `n` vertices in a graph, we can create an `n`x`n` matrix (i.e. an array of arrays) which represents how each vertex on a row is connected to each vertex on a column. A cell with a value of `1` means the row vertex is connected to the column vertex. Otherwise, the cell has a value of `0`. * **Adjacency list**. A more space-efficient way to represent a sparse graph is with a linked list, where the pointers represent the edges that connect neighboring vertices to each other. So we have two major data structures for storing the data of graph problems. 
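Before comparing the two, it may help to see the same tiny graph in both forms. Here's a sketch (the variable names are mine) of a hypothetical 3-vertex path with edges 0-1 and 1-2:

```js
// Adjacency matrix: a 1 at [row][col] means the row vertex shares an edge
// with the column vertex.
const adjacencyMatrix = [
  [0, 1, 0],
  [1, 0, 1],
  [0, 1, 0],
];

// Adjacency list: each index stores that vertex's neighbors.
const adjacencyList = [
  [1],    // vertex 0 connects to vertex 1
  [0, 2], // vertex 1 connects to vertices 0 and 2
  [1],    // vertex 2 connects to vertex 1
];
```

Note the matrix spends a cell on every possible pair of vertices, while the list only spends space on the edges that actually exist.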
How do you know when to use each?

### When adjacency matrices are a better fit

* **Testing if an edge is in a graph.** We said indexing into arrays is faster than traversing lists. This is just a multi-dimensional application of that concept.
* **When you have really big graphs.** When space isn't a concern, indexing directly into a graph's relationships is going to be faster than having to traverse a list to find vertices deep within a graph.
* **When you're focused on writing connections (e.g. insertions and deletions of edges) to the graph.** Since we've represented every connection, it's as simple as toggling a `0` to a `1`, which occurs in constant time.

### When adjacency lists are a better fit

* **Finding the degree of a vertex.** The _degree_ just means the number of connections to another vertex. Since a linked list inherently stores the pointers to other nodes, it's trivial to find the degree by summing the pointers at a given vertex, which takes time proportional to the degree (and constant time if you keep a running count per vertex). A matrix, however, requires summing the number of `1`s in the entire row for that vertex, which takes linear time in the number of vertices no matter how few connections there are.
* **When you have small graphs.** Conversely to a matrix, with small graphs the relationships are simple to map as pointers in a list, which are going to be very fast to access, and much more space efficient.
* **When you're focused on reads and traveling around (i.e. traversing) a graph.** Since the data structure of a linked list bakes in relationships between nodes, this is naturally a good data structure for navigating from vertex to vertex. Matrices, on the other hand, verify relationships but offer no hint of where to go next, so a traversal can take up to `O(n^2)` operations while a list will do it in linear time.
Since most problems focus on solving these kinds of issues, **this is usually the right data structure for solving most graphing problems.** Let's see how these might look in JavaScript:

```js
class Vertex {
  constructor(adjacencyInfo = null, nextVertex = null, weight = null) {
    this.adjacencyInfo = adjacencyInfo;
    this.nextVertex = nextVertex;
    this.weight = weight;
  }
}

const MAX_VERTICES = 1000; // can be any number

class Graph {
  constructor(hasDirection) {
    this.connections = new Array(MAX_VERTICES).fill(null);
    this.degrees = new Array(MAX_VERTICES).fill(0);
    this.edges = 0;
    this.isDirected = hasDirection;
  }

  // `reversed` guards the undirected case so the mirror insert doesn't recurse forever
  insert(vertexStart, vertexEnd, reversed = false) {
    const vertex = new Vertex(vertexEnd, this.connections[vertexStart]);
    this.connections[vertexStart] = vertex;
    this.degrees[vertexStart] += 1;
    if (this.isDirected || reversed) {
      this.edges += 1;
    } else {
      this.insert(vertexEnd, vertexStart, true);
    }
  }

  print() {
    for (let i = 0; i < this.connections.length; i++) {
      if (!this.connections[i]) continue;
      let line = `${i}:`;
      let connection = this.connections[i];
      while (connection) {
        line += ` ${connection.adjacencyInfo}`;
        connection = connection.nextVertex;
      }
      console.log(line);
    }
  }
}
```

## The next Daily Problem

To get us thinking a bit about graphs, our next Daily Problem is Problem 5-8 from _The Algorithm Design Manual_:

> Present correct and efficient algorithms to convert an undirected graph `G` between the following graph data structures. You must give the time complexity of each algorithm, assuming `n` vertices and `m` edges.
>
> (a) Convert from an adjacency matrix to adjacency lists.
>
> (b) Convert from an adjacency list to an incidence matrix. An incidence matrix `M` has a row for each vertex and a column for each edge, such that `M[i,j] = 1` if vertex `i` is part of edge `j`, otherwise `M[i,j] = 0`.
>
> (c) Convert from an incidence matrix to adjacency lists.

Think you have it? Have questions? Send it over [on Twitter](https://twitter.com/theadamconrad) and I'd be happy to help, and good luck!
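If you'd like to sanity-check your thinking on part (a), here's one possible sketch (the function name `matrixToLists` is mine, not from the book). It runs in `O(n^2)` time because it has to scan every cell of the matrix:

```js
// Convert an n×n adjacency matrix into adjacency lists (arrays of neighbors).
function matrixToLists(matrix) {
  const lists = matrix.map(() => []);
  for (let i = 0; i < matrix.length; i++) {
    for (let j = 0; j < matrix.length; j++) {
      if (matrix[i][j] === 1) lists[i].push(j); // edge i–j becomes a list entry
    }
  }
  return lists;
}
```

Parts (b) and (c) follow a similar scanning pattern, so try those yourself before peeking at any solutions.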
]]> 2018-10-02T10:28:00+00:00 Verifying request signatures in Elixir/Phoenix https://www.adamconrad.dev/blog/verifying-request-signatures-in-elixir-phoenix/ Sun, 30 Sep 2018 09:10:00 +0000 https://www.adamconrad.dev/blog/verifying-request-signatures-in-elixir-phoenix/ get_req_header("x-slack-request-timestamp")
|> Enum.at(0)
|> String.to_integer()
```

The local timestamp is kind of weird. The easiest way is to leverage the Erlang API, but the numbers start at the beginning of the Gregorian calendar (so roughly 2018 years ago) rather than the standard UNIX epoch (Jan 1, 1970). So you'll just have to subtract that difference, which is an oddly specific magic number:

```elixir
@unix_gregorian_offset 62_167_219_200

# Note: this assumes the server clock is set to UTC; otherwise use
# :calendar.universal_time() so the result lines up with UNIX timestamps.
gregorian_timestamp =
  :calendar.local_time()
  |> :calendar.datetime_to_gregorian_seconds()

local_timestamp = gregorian_timestamp - @unix_gregorian_offset
```

Now you just need to take the absolute value of the difference and make sure it's within some reasonable delta (in the case of the tutorial, it's 5 minutes or 300 seconds):

```elixir
if abs(local_timestamp - timestamp) > 300 do
  # nothing / return false
else
  # process request / return true
end
```

### 3. Concatenate the signature string

This is the easiest step because interpolated strings are just as easy as you think they are, and now that you have the raw request body, grabbing this will be trivial:

```elixir
sig_basestring = "v0:#{timestamp}:#{conn.assigns[:raw_body]}"
```

### 4. Hash the string with the signing secret into a hex signature

Now you need a signature to compare against the one Slack sends you along with the timestamp. Erlang provides a nice crypto library for computing [HMAC-SHA256 keyed hashes](https://en.wikipedia.org/wiki/HMAC); you'll just need to turn that into a hex digest (using `Base.encode16()`, downcased because Slack sends lowercase hex):

```elixir
my_signature =
  "v0=#{
    :crypto.hmac(
      :sha256,
      System.get_env("SLACK_SIGNING_SECRET"),
      sig_basestring
    )
    |> Base.encode16()
  }"
  |> String.downcase()
```

### 5. Compare the resulting signature to the header on the request
The light is at the end of the tunnel! There are a few more gotchas that aren't exactly intuitive:

1. **`get_req_header` returns an array, even though the name sounds like it returns a singular value**
2. **Leverage the `Plug.Crypto` library to do secure signature comparisons**

As you may have seen from earlier code, `get_req_header` takes in your `conn` and the string of the request header key to return the value...as an array. Not sure why, but it's easy to remedy with pattern matching. Finally, to achieve the `hmac.compare` pseudocode from the Slack tutorial, Elixir has an equivalent in `Plug.Crypto.secure_compare`:

```elixir
[slack_signature] = conn |> get_req_header("x-slack-signature")

Plug.Crypto.secure_compare(my_signature, slack_signature)
```

## Putting it all together

Now that we've solved all the tutorial discrepancies, it's time to put it all together in the context of a Phoenix application. We've already added the custom raw request body header library and modified our endpoint to accept this `Plug.Parser`. Now we just need to bring the verification into the API controller endpoint you specify within your Slack app dashboard:

```elixir
defmodule MYAPPWeb.SlackController do
  use MYAPPWeb, :controller

  @doc """
  Handle when a user clicks an interactive button in the Slack app
  """
  def actions(conn, params) do
    %{
      "actions" => actions,
      "team" => %{"id" => team_id},
      "type" => "interactive_message",
      "user" => %{"id" => user_id}
    } = Poison.decode!(params["payload"])

    if verified(conn) do
      # do your thing!
      conn |> send_resp(:ok, "")
    else
      conn |> send_resp(:unauthorized, "")
    end
  end

  defp verified(conn) do
    timestamp =
      conn
      |> get_req_header("x-slack-request-timestamp")
      |> Enum.at(0)
      |> String.to_integer()

    # Gregorian seconds for the current time; assumes a UTC server clock
    local_timestamp =
      :calendar.local_time()
      |> :calendar.datetime_to_gregorian_seconds()

    # 62_167_219_200 converts Gregorian seconds down to a UNIX timestamp
    if abs(local_timestamp - 62_167_219_200 - timestamp) > 60 * 5 do
      false
    else
      my_signature =
        "v0=#{
          :crypto.hmac(
            :sha256,
            System.get_env("SLACK_SIGNING_SECRET"),
            "v0:#{timestamp}:#{conn.assigns[:raw_body]}"
          )
          |> Base.encode16()
        }"
        |> String.downcase()

      [slack_signature] = conn |> get_req_header("x-slack-signature")
      Plug.Crypto.secure_compare(my_signature, slack_signature)
    end
  end
end
```

Congratulations! You can now securely talk with Slack by verifying its header signature against the one you've generated by hashing your signing secret with the current timestamp. The Slack API documentation is thorough and helpful, but translating it to work in Elixir/Phoenix is not as intuitive as you might imagine. ]]> 2018-09-30T09:10:00+00:00 The algorithms behind the Sort function in JavaScript https://www.adamconrad.dev/blog/algorithm-behind-js-array-sort/ Tue, 25 Sep 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/algorithm-behind-js-array-sort/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/heapsort-priority-queues-in-js/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202).

## Daily Problem

> You are given a collection of `n` bolts of different widths, and `n` corresponding nuts. You can test whether a given nut and bolt fit together, from which you learn whether the nut is too large, too small, or an exact match for the bolt.
The differences in size between pairs of nuts or bolts can be too small to see by eye, so you cannot rely on comparing the sizes of two nuts or two bolts directly. You are to match each bolt to each nut. Basically this question is asking "does every bolt have a matching nut?"

> 1. Give an `O(n^2)` algorithm to solve the above problem.

The obvious play here seems to be to test every bolt with every nut, which would look like this:

```js
function match(nuts, bolts) {
  const pairs = [];
  for (const nut of nuts) {
    for (const bolt of bolts) {
      if (nut.fits(bolt) === "exact match") pairs.push([nut, bolt]);
    }
  }
  return pairs;
}
```

There are `n` nuts in a `for` loop checking every `n` bolts in a nested `for` loop, so we can easily determine from our previous discussion on calculating Big O quickly that this algorithm will run in `O(n^2)` time.

> 2. Suppose that instead of matching all of the nuts and bolts, you wish to find the smallest bolt and its corresponding nut. Show that this can be done in only `2n − 2` comparisons.

We start out with `n` nuts and `n` bolts. Start comparing two at random. Likely they will not be the same size, and we will get feedback on that. If one piece is `too small` for the other, we keep the smaller piece and toss the other. So if the nut was `too small` for the bolt, throw away the bolt, grab another one, and then do another comparison. Essentially you will be ping-ponging between nuts and bolts, throwing away all but the last nut and bolt. Since there are `2n` total items, and we throw away all but the last nut (`1`) and the last bolt (`1`), that takes `2n − 2` comparisons.

What do we do if the pair is the same size? Keep one of either the nut or the bolt aside and do another comparison. If the new nut/bolt is smaller than both, we have our new minimum item and we can throw away the kept pair. Regardless, we will get a signal from the new item that will eliminate either the kept item or the newly drawn one.

> 3. Match the nuts and bolts in expected `O(n log n)` time.
This was one of those Daily Problems that would benefit from reading ahead a bit. In fact, the answer will become very clear once we get through a few of the algorithms from this article.

## Mergesort: The divide-and-conquer algorithm behind Firefox array sorting

As I mentioned earlier, there is an algorithm that is so good it is used to power the web browser you may be viewing this blog post on. This algorithm is called _mergesort_, and it [was the preferred algorithm](https://bugzilla.mozilla.org/show_bug.cgi?id=224128) for array sorting in Firefox's JavaScript engine over heapsort. Mergesort works by taking advantage of an essential problem-solving technique called **divide-and-conquer**. Practically speaking, divide-and-conquer involves **breaking a problem up into smaller problems to make it solvable.** The way we implement divide-and-conquer with mergesort is to _use the power of recursion to halve our problem space_ until all that's left is a simple two-item comparison. Once we compare and sort the subarray with two items, we build those lists back up, _merging_ these subarrays together until we have unified the whole array again, sorting our items as we build the list back up.

```js
function merge(leftArr, rightArr) {
  const mergedArr = [];
  while (leftArr.length && rightArr.length) {
    if (leftArr[0] <= rightArr[0]) {
      mergedArr.push(leftArr.shift());
    } else {
      mergedArr.push(rightArr.shift());
    }
  }
  // append whatever remains in the non-empty half
  while (leftArr.length) mergedArr.push(leftArr.shift());
  while (rightArr.length) mergedArr.push(rightArr.shift());
  return mergedArr;
}

function mergesort(arr) {
  if (arr.length <= 1) return arr; // base case: a single item is already sorted
  const middle = Math.floor(arr.length / 2);
  return merge(mergesort(arr.slice(0, middle)), mergesort(arr.slice(middle)));
}
```

### Why use mergesort over heapsort

Truth be told, **mergesort is a very good general sorting algorithm**. Like heapsort, mergesort runs in `O(n log n)` time.
If you have the space to copy the array into the sorted array (which you usually do if you aren't in constrained environments like graphics programming or IoT devices), mergesort is great. So if it has the same worst-case runtime as heapsort, why would you choose it over heapsort?

**Mergesort is also excellent for linked lists.** The `merge()` operation of mergesort can be implemented on lists without requiring extra space.

**Mergesort is stable.** The only swapping of items happens when you actually sort within mergesort. In heapsort, since we're using a priority queue, the structure itself is unstable because it's frequently being rearranged to ensure the maximum/minimum value remains at the root of the heap.

## Quicksort: The randomization algorithm powering Chrome's V8 and IE's Chakra

It turns out there are other excellent algorithms for sorting. One such algorithm uses the power of randomization to arbitrarily find a pivot point to sort against. We can use that pivot point recursively just like we do with mergesort to continuously sort smaller and smaller problems until the entire list of items has been sorted. This algorithm is called **quicksort**, and it's used by both [Google](https://github.com/v8/v8/blob/master/src/js/array.js#L328) and [Microsoft](https://github.com/Microsoft/ChakraCore/blob/master/lib/Runtime/Library/JavascriptArray.cpp#L6701) for sorting.

Quicksort and mergesort have quite a bit in common. They both build a nested recursion tree all the way down to every `n` items within a list, each level of which takes linear time to process. That means they both generally run in `O(nh)`, where `h` is the height of the recursion tree that is generated. Where they differ lies in the randomization of the pivot point. Whereas with mergesort we always subdivide our arrays evenly, in quicksort that point is never known until the function is run.
In the best case, we pivot on the median point, halving the array each time, leading us to a runtime of `O(n log n)` that is on par with heapsort and mergesort. In the worst case, we pivot on the end of the array with a completely unbalanced tree that only ever sorts 1 item at a time, giving us a runtime of `O(n^2)` that is on par with selection sort.

So if the worst case is so bad, why would we bother using quicksort? The answer lies in the average case of random sampling. With mergesort, the split point is always the exact middle of the list, so the shape of the recursion tree is fixed no matter what the data looks like. Similarly, with premeditated data, we know that we can find a configuration that gets us the best- and worst-case scenarios. But since unsorted data can be in any random order, how does quicksort help us?

With quicksort, a pivot anywhere in the middle _half_ of the list (e.g. for `[1,2,3,4]`, that's `[2,3]`) is good enough to avoid the worst-case runtime. The reason is a pivot somewhere in the middle still generates a recursion tree that is some logarithmic order of magnitude (specifically `2n ln n`), the kind of speed we're looking for. To solidify an `O(n log n)` runtime, we grab a _random_ permutation of our initial list, which takes `n` time. And since _random_ data can be sorted on average in `O(n log n)` time, adding in this randomization step guarantees an _expected_ `Θ(n log n)` runtime with _any_ input.
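That up-front random permutation is the classic linear-time Fisher-Yates shuffle; a minimal sketch might look like this:

```js
// Fisher-Yates shuffle: produces a uniform random permutation in O(n) time,
// which is the randomization step described above.
function shuffle(arr) {
  for (let i = arr.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1)); // pick from the unshuffled prefix
    [arr[i], arr[j]] = [arr[j], arr[i]];
  }
  return arr;
}
```

Running the shuffled array through quicksort then gives the expected `O(n log n)` behavior regardless of the input's initial order.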
```js
function partition(arr, pivotIndex, lower, higher) {
  const pivotPoint = arr[pivotIndex];
  let partitionIndex = lower;
  for (let i = lower; i < higher; i++) {
    if (arr[i] < pivotPoint) {
      [arr[i], arr[partitionIndex]] = [arr[partitionIndex], arr[i]];
      partitionIndex++;
    }
  }
  // move the pivot into its final sorted position
  [arr[pivotIndex], arr[partitionIndex]] = [arr[partitionIndex], arr[pivotIndex]];
  return partitionIndex;
}

function quicksort(arr, lower = 0, higher = arr.length - 1) {
  if (lower >= higher) return arr; // base case: one (or zero) items are sorted
  const pivotIndex = higher;
  const partitionIndex = partition(arr, pivotIndex, lower, higher);
  quicksort(arr, lower, partitionIndex - 1);
  quicksort(arr, partitionIndex + 1, higher);
  return arr;
}
```

### Why use quicksort

Now we have three algorithms with an expected runtime of `O(n log n)`. Why would we choose quicksort?

**Quicksort performs in-place sorting.** Like heapsort, quicksort requires no extra space. Mergesort requires an additional array to copy your merged items into. If you have space constraints, quicksort beats mergesort.

**Quicksort beats mergesort for arrays.** Since quicksort takes advantage of random access, we already know that the constant-time access of arrays by index makes it very easy to use a random number and retrieve an item quickly. With linked lists, the random access no longer pays off, since all items must be accessed from either the front or the back of the list.

**Quicksort is great for sorting arrays with hundreds of thousands of unique keys**. In fact, when benchmarked against 100,000 unique keys or more, [quicksort outperformed mergesort and heapsort by 1.5-2x](http://alejandroerickson.com/j/2016/08/02/benchmarking-int-sorting-algorithms.html).

### Revisiting the Daily Problem

With the available algorithms at our disposal, this should make the Daily Problem feasible to answer beyond the naive first section.

> Match the nuts and bolts in expected `O(n log n)` time.
With our randomized quicksort algorithm in place, we can now sort both nuts and bolts by size, and both sets will have a pairing because the first element of one matches the first element of the other, and so on all the way up to the `n`th element. If this still doesn't make sense, check out an [in-depth analysis of randomized quicksort](https://www.cs.cmu.edu/~avrim/451f11/lectures/lect0906.pdf).

## Bucketsort: Sorting via distribution

With three sorting algorithms in our optimal band of `O(n log n)` already, how could we possibly get any better? Consider the following example: Rather than a list of numbers, you now have a list of names, like in your iPhone. How do you find a name in your phone? If you scroll really quickly, you'll see a letter highlighting roughly which letter of the alphabet you're in. From there you'll pretty much go through the whole alphabet starting with the second letter of the name until you find the person you're looking for. _This is exactly how bucketsort works_ (also known as distribution sort or binsort).

**Bucketsort is great for evenly-distributed chunks of similar data.** If we used bucketsort in the above example, we would first partition the whole phone book into 26 subarrays (well, maybe a few less, because letters like _X_ and _Z_ have far fewer names than _M_ or _S_). I hope you're seeing a theme here. We once again are _subdividing the problem_; instead of dividing equally in halves or by a random pivot, we branch a list of `n` items into `k` logical partitions based on how the data is arranged. Now obviously we can't implement a generic algorithm for bucketsort in JavaScript because it highly depends on the dataset. Bucketsort for last names will look very different from bucketsort on tabular accounting data for a business.

## Binary search: The algorithm that powers BSTs

Now that we have a gaggle of sorting algorithms to choose from, what can we do with this sorted data to ensure we're still operating efficiently?
**Binary search is the `log n` application for searching sorted data.** All we have to do is start with the middle element. If the target isn't the middle element, we ask "is our target item less than or greater than the middle element?" We then choose a side and repeat the process until our target element is found:

```js
function binarySearch(arr, target, lower = 0, higher = arr.length - 1) {
  if (lower > higher) return -1; // target isn't in the array
  const middle = Math.floor((lower + higher) / 2);
  if (arr[middle] === target) return middle;
  if (arr[middle] < target) {
    return binarySearch(arr, target, middle + 1, higher); // search the upper half
  }
  return binarySearch(arr, target, lower, middle - 1); // search the lower half
}
```

### Meta binary search

What happens if you have an unbounded sorted array? For example, we can create an unbounded array using [ES6 generators](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/function*). The Fibonacci sequence is unbounded, and can be lazily evaluated step-by-step using generators. How would we find an item within such an array? **Meta binary search**, or one-sided binary search, to the rescue! This search is twice as slow because it requires two `log n` searches. The first search is to find our element within some bounds. So we start with a single-element array. Then a two-element array. Then a four-element array. And so on and so on, doubling our bounded subarray until we find our element within some boundary. Once we find the element, we can then proceed with our normal binary search.

### Bisection method / numerical analysis

Do you have to divide things in half? The **bisection method** (which you can read more about [here](https://en.wikipedia.org/wiki/Bisection_method)) is a way of finding the roots of values by repeatedly _bisecting_ an interval until the target is found. In other words, this is a more generalized form of binary search and can operate on the square root, cubed root, or any root of a number. [Numerical analysis](https://en.wikipedia.org/wiki/Numerical_analysis#Computing_values_of_functions) is deeply rooted in mathematics.
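As a toy illustration (the function name `bisectSqrt` and its tolerance parameter are my own), here's a sketch that approximates a square root by repeatedly halving an interval:

```js
// Approximate sqrt(target) by bisecting the interval [0, max(target, 1)]
// until the midpoint squared is within `tolerance` of the target.
function bisectSqrt(target, tolerance = 1e-7) {
  let low = 0;
  let high = Math.max(target, 1); // sqrt(x) always lies in this interval for x >= 0
  let middle = (low + high) / 2;
  while (Math.abs(middle * middle - target) > tolerance) {
    if (middle * middle < target) {
      low = middle; // the root is in the upper half
    } else {
      high = middle; // the root is in the lower half
    }
    middle = (low + high) / 2;
  }
  return middle;
}
```

The same shrinking-interval loop works for cube roots or any monotonic function, which is exactly why it reads like a generalized binary search.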
**If you're doing graphics programming or machine learning, you may find yourself utilizing the bisection method.**

## Bonus sorts: Shellsort & Radix Sort

I wanted to cover two other sorting algorithms really quickly because they're super interesting and are great advanced algorithms for specific situations. They aren't really covered in the _Algorithm Design Manual_ but they do have practical applications.

**Shellsort** is like an optimized version of either _bubble sort_ (where you continuously swap smaller elements with larger elements) or _insertion sort_ (where you insert unsorted elements into the sorted portion of an array), depending on how you look at it. Shellsort sorts in intervals known as _gaps_, and iteratively reduces the number of gaps it sorts by until the whole array is sorted. For example, for an array of a randomized ordering of natural numbers (integers greater than 0) up to 100, shellsort would first sort the numbers [5,10,15,20,...100]. Then it would sort [3,6,9,...99]. And that _gap_ interval would reduce down until it got to a gap of 1. But by the time it was sorting against every integer, all of the multiples of 5 were sorted, and all of the multiples of 3 were sorted, leaving only a fraction of the actual numbers left to sort.

Like bucketsort, shellsort requires a custom gap sequence depending on the application of your sorting algorithm. You can also choose either insertion sort or bubble sort as the inner function to perform on your gap subset. You can find an example [here](https://en.wikipedia.org/wiki/Shellsort#Pseudocode), but like I said, it's difficult to really nail in the general sense since it is highly dependent on the data you are sorting.

**Radix sort** (of which there are actually two) is arguably one of the fastest sorting algorithms known. If you really dove into the benchmarking link I posted above, it outperforms every other sorting algorithm we've mentioned here so far. Why is that the case? At the end of the day everything is binary.
Integers, strings, floating-point numbers, and everything else on a computer are all represented as binary numbers. Because of that, sorting objects by their binary code is a really convenient way to sort them, and radix sort is really two algorithms: you can sort starting on the least-significant digit or the most-significant digit of your object, now represented as a number. One great example of using radix sort (specifically the MSD version) would be to [find the occurrence of a word](https://en.wikipedia.org/wiki/Radix_sort#trie-based_radix_sort) using tries to construct a word dictionary and then depth-first search to find the word you are looking for, starting with the first (or most significant) letter.

There are lots of other sorting algorithms that you can explore. As we move on from simple data structures and sorting to graphs and advanced tree searching, we'll demystify what exactly things like tries and depth-first search are. But this section was exhausting, so let's end with some practice problems and call it a day for now.

## Even more practice

If you just can't get enough of sorting problems, here are some more problems from the book to flex your sorting muscles:

1. 4-20
2. 4-22
3. 4-24

]]> 2018-09-25T10:28:00+00:00 Why Airbnb has pivoted away from React Native https://www.adamconrad.dev/blog/why-airbnb-has-pivoted-away-from-react-native/ Thu, 20 Sep 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/why-airbnb-has-pivoted-away-from-react-native/ 2018-09-20T10:28:00+00:00 Heapsort using Priority Queues in JavaScript https://www.adamconrad.dev/blog/heapsort-priority-queues-in-js/ Thu, 20 Sep 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/heapsort-priority-queues-in-js/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/).
If you missed the [previous article](/blog/intro-to-sorting/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202).

## Daily Problem Explained

> Give an efficient algorithm to determine whether two sets (of size `m` and `n`) are disjoint. Analyze the complexity of your algorithm in terms of `m` and `n`. Be sure to consider the case where `m` is substantially smaller than `n`.

This is just another exercise in writing pseudocode; a great problem for a programming interview! The only real catch here is you have to know what a [disjoint set](https://en.wikipedia.org/wiki/Disjoint_sets) is. A **disjoint set** is really just _two collections that have no items in common_. An example for this problem would be `m = [1,2,3], n = [4,5,6]`.

Let's just start throwing stuff out there and see what sticks. My first thought is that the union of two disjoint sets has length `m + n`, since no duplicates exist. That's pretty important, because if your final set `S` is such that `S.length < m.length + n.length`, you fail the disjoint part of the problem. Can we exploit that? Here's what I'm thinking:

```js
function isDisjoint(m, n) {
  const finalSet = [...m]; // copy m so we don't mutate the input
  for (const item of n) {
    if (!finalSet.includes(item)) finalSet.push(item);
  }
  return finalSet.length === m.length + n.length;
}
```

At first glance you only see 1 `for` loop, so it looks like it's `O(n)`, which is fast! But then we quickly realize the absolute fastest we can go is `O(n+m)` since we have to evaluate all items to check for duplication, so something is up. The key here is that `includes` has to inspect the entire list for each of the `n` items. That leaves us with an algorithm of `O(mn)`. It's not bad, but it's not great either; remember, that's roughly equivalent to `O(n^2)` (assuming `m` and `n` were the same length), which is quadratic.
Quadratic functions are fine for relatively small data sets, but as we noted [in our article on Big O Notation](/blog/big-o-notation/), once we go above a million items in our dataset, the algorithm will be too slow. Can we improve on this? One thing I quickly realized is we can leverage what we've learned so far as a hint on how to solve this. What data structures have we been studying carefully? Which of those are praised for their performance? Then it hit me: **we can use a hash table to keep track of the number of times we've touched an item and if any already exist we know right away the set is not disjoint and we can exit early**. Adding to and searching from a hash table takes an expected `O(1)` time, `O(n)` at worst. That means we can hit that magical runtime of `O(m+n)` in the expected case using a hash table!

```js
function isDisjoint(m, n) {
  let finalSet = new HashTable();
  for (let item of m) {
    finalSet.insert(item);
  }
  for (let item of n) {
    if (finalSet.search(item)) {
      return false; // found a shared item: not disjoint, exit early
    } else {
      finalSet.insert(item);
    }
  }
  return true;
}
```

## Standard Selection Sort in JavaScript

The first sorting algorithm we're going to look at is **selection sort**. Selection sort's algorithm is very simple: _continue to take the smallest item from one array and place it into a new array_. By virtue of taking the smallest item from a recursively-shrinking array, you're left with a new array that is sorted.
A naive implementation using standard JavaScript library functions looks like this:

```js
function selectionSort(arr) {
  let sorted = [], min;
  for (let i = 0, n = arr.length; i < n; i++) {
    min = Math.min(...arr);
    sorted[i] = min;
    arr.splice(arr.indexOf(min), 1); // pop() takes no argument, so splice out the minimum
  }
  return sorted;
}
```

As we can see from the algorithm above, selection sort performs `n` iterations, but each iteration will take roughly `n/2` steps in order to find the minimum value (we know this because finding a minimum value takes at worst `O(n)` but our problem space is being reduced with each successive iteration), giving us a grand total **worst-case runtime of `O(n^2)`**. Is there a way we can improve on this runtime?

## Selection Sort + Priority Queues = Heapsort

Look back to our previous algorithm - what are the primary operations for selection sort? Each iteration consists of a `min()`, an `insert()`, and a `delete()`. There is, in fact, a special data structure that combines `delete()` with `min()` to _efficiently_ extract the minimum value from its collection. This data structure is called a **priority queue**.

### Priority queues

Priority queues are flexible data structures that _allow for elements to enter the queue at arbitrary intervals while exiting in a sorted fashion_. Priority queues support three primary operations: _insert_, _findMin_ (or _findMax_), and _extractMin_ (or _extractMax_). For the purposes of simplicity going forward, we're going to focus on priority queues that implement the _minimum_ rather than the _maximum_ value. A summary of their worst-case runtimes can be found below. **As an exercise: can you explain how these values are calculated?** Here's a hint: `findMin()` is always constant-time complexity because you can always use a variable to keep track of the minimum value as new values are inserted.
| Data structure / method | `findMin()` | `extractMin()` | `insert()` |
| --- | --- | --- | --- |
| Unsorted array | O(1) | O(n) | O(1) |
| Sorted array | O(1) | O(1) | O(n) |
| BST | O(1) | O(log n) | O(log n) |
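To make that hint concrete, here is a minimal sketch (my own illustration, not from the course materials) of the unsorted-array row: cache the minimum as values arrive, and `findMin()` never has to scan the array.

```js
// Unsorted-array priority queue that caches the minimum on every insert,
// which is what keeps findMin() at O(1).
class TrackedMinQueue {
  constructor() {
    this.items = [];
    this.min = Infinity;
  }
  insert(item) {
    // O(1): a push plus one comparison
    this.items.push(item);
    if (item < this.min) this.min = item;
  }
  findMin() {
    // O(1): just return the cached value
    return this.min;
  }
}
```

`extractMin()` is where this scheme pays its bill: removing the cached minimum forces an `O(n)` scan of the array to find the next one, which is exactly the `O(n)` in the table.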
### Implementing `extractMin()` and `insert()` with heaps

As I mentioned earlier, there is a special data structure that can efficiently handle the primary operations of selection sort. This data structure is known as a **heap**. Heaps are a variant of priority queues that keep a rough track of the order of their items. This **ordering is smart enough to keep track of the minimum/maximum value, but not a complete sort which would significantly slow down the data structure**. To implement a heap, we are effectively creating a _binary tree without worrying about pointers_, or in other words _an array of keys where the index of the array item _is_ the pointer_. The root of the tree is the first item in the array (and also the minimum value), with its left and right children corresponding to the 2nd and 3rd values in the array (we'll use 1-based indexing so the parent/child arithmetic stays clean). In general, it looks something like this:

```js
const PRIORITY_QUEUE_SIZE = 500; // arbitrarily picked

class PriorityQueue {
  constructor(arr, n) {
    this.queue = new Array(PRIORITY_QUEUE_SIZE);
    this.length = 0;
    for (let i = 0; i < n; i++) {
      this.insert(arr[i]);
    }
  }

  parent(n) {
    if (n === 1) return -1; // the root has no parent
    return Math.floor(n / 2);
  }

  child(n) {
    return 2 * n;
  }
}
```

Remember: we don't care about what's inside of each slot in our priority queue, we just care about finding elements at specific locations. That is why we can calculate locations with just some multiplication and division rather than having to search by value. Because we're not pointing to any specific values, the index of our elements is a reflection of the math we are performing, and when we perform basic operations like `*` and `/`, we know we're dealing with fast computational operations in constant time. The downside to a heap is that we can't really do an efficient search. Because we're only concerned about keeping the smallest item at the root, everything else is arbitrarily packed into the tree.
The tree items are packed to maintain space efficiency, but remember this isn't a binary _search_ tree, just a binary tree (i.e. each node has at most two children).

#### Insertion

As we mentioned earlier, save for the smallest item, the heap is just packed in from the top down, filling in from left to right before moving down to the next layer of the tree. This keeps the heap efficiently packed. To ensure the minimum-dominance relationship is maintained, we have to make sure we can swap up the tree as far as necessary to satisfy our minimum constraints.

```js
class PriorityQueue {
  // ...

  insert(item) {
    if (this.length >= PRIORITY_QUEUE_SIZE) {
      throw new Error("Priority queue overflow");
    }
    this.length += 1;
    this.queue[this.length] = item;
    this.bubbleUp(this.length);
  }

  bubbleUp(n) {
    const p = this.parent(n);
    if (p === -1) return; // we've reached the root
    if (this.queue[p] > this.queue[n]) {
      // swap the smaller child up into its parent's slot
      [this.queue[p], this.queue[n]] = [this.queue[n], this.queue[p]];
      this.bubbleUp(p);
    }
  }
}
```

Analyzing the above two functions, we see that `insert()` has no loops of its own and emulates a tightly-packed binary tree. We know the height of a binary tree is `lg n` (shorthand for `log n` of base 2), so our worst-case runtime is `O(log n)`. The primary job of `bubbleUp()` is swapping elements (remember from ES6 a swap appears as `[x,y] = [y,x]`), which takes constant time. Therefore, to create a heap with `n` items it will take `O(n log n)` to insert and construct.

#### Extracting the minimum value

Finding the minimum value is easy, since it is the root of the tree. Actually pulling it out of the array is the hard part, since this leaves the tree broken in two at the root. To handle this, we'll re-assign the last element in the array to be the new root. Why the right-most element?
Because _we pack from left to right, so we want to unfurl our array in a clean manner from the opposite direction_. If the value isn't correct in the minimum-dominance relationship, we can always bubble _down_ the array (or "heapify") to swap out for the correct minimum.

```js
class PriorityQueue {
  // ...

  extractMin() {
    if (this.length <= 0) throw new Error("No minimum: Empty priority queue");
    const min = this.queue[1];
    // move the last element up to the root, then repair the heap
    this.queue[1] = this.queue[this.length];
    this.length -= 1;
    this.bubbleDown(1);
    return min;
  }

  bubbleDown(n) {
    const childIndex = this.child(n);
    let minIndex = n;
    // compare against both children (childIndex and childIndex + 1)
    for (let i = 0; i <= 1; i++) {
      if (childIndex + i <= this.length && this.queue[minIndex] > this.queue[childIndex + i]) {
        minIndex = childIndex + i;
      }
    }
    if (minIndex !== n) {
      [this.queue[n], this.queue[minIndex]] = [this.queue[minIndex], this.queue[n]];
      this.bubbleDown(minIndex);
    }
  }
}
```

Similarly for extracting the minimum value, bubbling down only goes down a tree height of `lg n`, so at worst finding and removing the minimum value will take `O(log n)` time.

## Selection Sort using Priority Queues as Heaps

Putting it all together, we now have our optimized selection sort, known as _heapsort_, which combines everything we've learned here into an efficient extraction machine:

```js
function heapsort(arr, n) {
  let heap = new PriorityQueue(arr, n);
  for (let i = 0; i < n; i++) {
    arr[i] = heap.extractMin();
  }
  return arr;
}
```

How elegant! And the best part: since our priority queue's operations run in `O(log n)` time, over a set of `n` elements, **heapsort improves on selection sort from `O(n^2)` to `O(n log n)`**, the gold standard for sorting algorithm time efficiency! Even better, heapsort can be done _in-place_: the classic formulation stores the heap inside the input array itself and swaps elements to create the sorted list, so it never has to instantiate a second array (our sketch trades a bit of that space efficiency for clarity by keeping a separate backing array). Overall, this makes heapsort an excellent algorithm for both its space and time efficiency.
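As a sanity check, the whole pipeline condenses into one runnable file. This is my own compact restatement of the same 1-indexed min-heap design, not code from the book, with the fixed-size array swapped for a growable one:

```js
// A condensed, runnable heapsort: 1-indexed array, min-heap.
class MinHeap {
  constructor(arr) {
    this.queue = [null]; // index 0 unused so parent/child math stays simple
    this.length = 0;
    for (const item of arr) this.insert(item);
  }
  parent(n) { return n === 1 ? -1 : Math.floor(n / 2); }
  child(n) { return 2 * n; }
  insert(item) {
    this.length += 1;
    this.queue[this.length] = item;
    this.bubbleUp(this.length);
  }
  bubbleUp(n) {
    const p = this.parent(n);
    if (p === -1 || this.queue[p] <= this.queue[n]) return;
    [this.queue[p], this.queue[n]] = [this.queue[n], this.queue[p]];
    this.bubbleUp(p);
  }
  extractMin() {
    if (this.length === 0) throw new Error("Empty priority queue");
    const min = this.queue[1];
    this.queue[1] = this.queue[this.length]; // last element becomes the root
    this.length -= 1;
    this.bubbleDown(1);
    return min;
  }
  bubbleDown(n) {
    let smallest = n;
    for (const c of [this.child(n), this.child(n) + 1]) {
      if (c <= this.length && this.queue[c] < this.queue[smallest]) smallest = c;
    }
    if (smallest !== n) {
      [this.queue[n], this.queue[smallest]] = [this.queue[smallest], this.queue[n]];
      this.bubbleDown(smallest);
    }
  }
}

function heapsort(arr) {
  const heap = new MinHeap(arr);
  return arr.map(() => heap.extractMin()); // n extractions, O(log n) each
}
```

Running `heapsort([5, 3, 8, 1, 9, 2])` extracts the minimum `n` times to produce the sorted array.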
Now there is a _slightly_ more advanced way of designing the constructor such that we populate the first `n` positions in heap directly from our input array (meaning we've created a one-sided heap) and then use a bunch of `bubbleDown()` calls (`n/2` to be exact since half of the items are already on the correct side of their parents) to balance out the array and find our minimum. This speeds up construction from `O(n log n)` to `O(n)` on average, but since construction is not the main operation of a heap, and the worst-case operations still take `O(n log n)`, this isn't too big of a win, **but you're welcome to implement it for yourself if you're looking for an extra exercise in algorithm design**.

## Wrapping up with the next Daily Problem

We've covered a lot with selection sort, priority queues, and heapsort. We've seen we can pretty easily cover sorting in `O(n^2)` time, but with a few optimizations, we can improve that time to the optimal `O(n log n)`. This will be useful in tackling today's Daily Problem:

> You are given a collection of `n` bolts of different widths, and `n` corresponding nuts. You can test whether a given nut and bolt fit together, from which you learn whether the nut is too large, too small, or an exact match for the bolt. The differences in size between pairs of nuts or bolts can be too small to see by eye, so you cannot rely on comparing the sizes of two nuts or two bolts directly. You are to match each bolt to each nut.
>
> 1. Give an `O(n^2)` algorithm to solve the above problem.
>
> 2. Suppose that instead of matching all of the nuts and bolts, you wish to find the smallest bolt and its corresponding nut. Show that this can be done in only `2n − 2` comparisons.
>
> 3. Match the nuts and bolts in expected `O(n log n)` time.

Think you have an answer? Create a [gist](https://gist.github.com) and send it to me [on Twitter](https://twitter.com/theadamconrad) and let's compare notes!
## More problems to practice on Now that [homework 2](http://www3.cs.stonybrook.edu/~skiena/373/hw/hw2.pdf) is out, here are a few problems that are relevant to the sorting we've discussed so far: 1. 4-1 2. 4-2 3. 4-5 4. 4-6 5. 4-12 6. 4-13 7. 4-14 8. 4-15 ]]>
2018-09-20T10:28:00+00:00
Introduction to sorting algorithms in JavaScript https://www.adamconrad.dev/blog/intro-to-sorting/ Tue, 18 Sep 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/intro-to-sorting/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/hashing-in-js/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202). Over the next several articles in this series, we are going to implement many of these basic sorting algorithms in JavaScript so you can compare and contrast the approaches and even find which are used in the JavaScript language today.

## Answers to the Daily Problem

This round our Daily Problem is back to coming directly from the book, practice problem 4-3:

> Take a sequence of `2n` real numbers as input. Design an `O(n log n)` algorithm that partitions the numbers into `n` pairs, with the property that the partition minimizes the maximum sum of a pair. For example, say we are given the numbers `(1,3,5,9)`. The possible partitions are `((1,3),(5,9))`, `((1,5),(3,9))`, and `((1,9),(3,5))`. The pair sums for these partitions are `(4,14)`, `(6,12)`, and `(10,8)`. Thus the third partition has 10 as its maximum sum, which is the minimum over the three partitions.

There are a lot of words here but we can reason about this another way: the pattern that minimizes the largest sum is pairing the maximum number with the minimum number, and working inward. We first need to sort the numbers (which takes `n log n`), and then pair them off by taking the `maximum()` and the `minimum()` number from the set, plucking them out of the old set and placing them into a new set.
Finally, we take the largest number from those pairs and that's the maximum sum that is the minimum across all possible permutations of the problem. In pseudocode, it looks something like this:

```js
function minimumMaxSum(set) {
  let sortedSet = set.sort();
  let pairs = [];
  while (sortedSet.length > 0) {
    let maximum = sortedSet.maximum();
    let minimum = sortedSet.minimum();
    pairs.push(maximum + minimum);
    sortedSet.remove(maximum);
    sortedSet.remove(minimum);
  }
  return pairs.maximum();
}
```

## Applications of sorting

If you're wondering why sorting gets so much attention in interviews, it's because sorting is so important and so common. In fact, it is reported that [nearly a quarter of all computational time](https://www.johndcook.com/blog/2011/07/04/sorting/) is spent just sorting data. Because it's so common, we have developed dozens of algorithms to tackle this problem, and it is in fact one of the most studied problems in computer science. Here are just a few of the common use cases for sorting:

* **Searching:** If you want to find stuff, it's going to be way easier to find stuff if it's organized. Can you imagine searching for books in a library if everyone just stuffed their books in there arbitrarily?
* **Frequency / Uniqueness:** Remember things like _mean_, _median_, and _mode_ from math class? It's much easier to find the _mean_ and _median_ if you know they are roughly in the middle of your set. Similarly, the _mode_ will always be grouped together in the largest group of a set; all made easier by sorting.
* **Selection:** Like search, picking something out of a set is much easier when it's in a group of sorted items. Otherwise, the thing you want to pick is in a random group and may take longer to find.

## Approaches to sorting

When you think of sorting items, what comes to mind? Probably something like largest to smallest (or vice versa), and probably involving numbers.
But lots of data can be sorted in lots of different ways, so before we dive into sorting algorithms it's important to remember that this is not the only way to sort. Here are a few more examples to consider when using sorting algorithms:

* **By key/record:** Just because we have the largest/smallest piece of data doesn't mean it is special. Sometimes it has to do with when/where that data was entered. When we enter data into a database with a unique incremental ID, we can sort by created-at date using the key of that record.
* **By alphabetical ordering:** Numbers increase and decrease, but so do letters based on their ordering in an alphabet. Even though there is no mathematical basis for A to be "less than" Z, there is a semantic and contextual basis.
* **By frequency:** Even if something isn't the newest or the largest, we may want to have the most popular/frequent data at the top to show to customers. Grouping by frequency and sorting on that can also be very important in picking out outliers.

Just like applications, the approaches to sorting are also numerous. What's important is not that this list is exhaustive, but that it is a _starting point for thinking about sorting as a fundamental activity of computer programming_. Whether you're a web programmer or an AI programmer, you _will_ have to sort some data while doing your job. Knowing how that works under the hood is advantageous, particularly if you need to optimize your programs for performance. **One caveat: just because you should learn how this works doesn't mean you should implement your own sorting algorithms in a professional setting.** This is important because while programming interviews may ask you to implement algorithms to commonly-solved problems, that doesn't mean you should _actually_ use them in the real world. What is important is that you know _how_ it works, not that you need this in your arsenal of tools to implement.
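In day-to-day JavaScript, all three of those orderings fall out of a single built-in mechanism: the comparator you hand to `Array.prototype.sort`. A quick illustration, using some made-up example data:

```js
const posts = [
  { id: 3, title: "zebra", views: 10 },
  { id: 1, title: "apple", views: 99 },
  { id: 2, title: "mango", views: 50 },
];

// By key/record: the insertion order encoded in the incremental id
const byKey = [...posts].sort((a, b) => a.id - b.id);

// Alphabetical: localeCompare supplies the "A before Z" semantics
const byTitle = [...posts].sort((a, b) => a.title.localeCompare(b.title));

// By frequency/popularity: most-viewed first
const byViews = [...posts].sort((a, b) => b.views - a.views);
```

Spreading into a new array (`[...posts]`) matters because `sort` mutates the array it is called on.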
Next time, we'll look at one of these high performers that uses priority queues, a special variant on queues that we'll explore in-depth while we spell out our first sorting algorithm.

## Next Daily Problem

This one is a pretty straightforward question, sort of like something you'd see in a programming interview!

> Give an efficient algorithm to determine whether two sets (of size `m` and `n`) are disjoint. Analyze the complexity of your algorithm in terms of `m` and `n`. Be sure to consider the case where `m` is substantially smaller than `n`.

Think you have one in mind? Throw it up on a [gist](https://gist.github.com) and send it to me [on Twitter](https://twitter.com/theadamconrad)! ]]> 2018-09-18T10:28:00+00:00 Hashing in JavaScript https://www.adamconrad.dev/blog/hashing-in-js/ Thu, 13 Sep 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/hashing-in-js/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/dictionaries-in-js/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202).

## Answers to the Daily Problem

To solidify our understanding of dictionaries, the Daily Problem focuses on utilizing the primary functions of dictionary objects:

> You are given the task of reading in `n` numbers and then printing them out in sorted order. Suppose you have access to a balanced dictionary data structure, which supports each of the operations search, insert, delete, minimum, maximum, successor, and predecessor in `O(log n)` time.

So basically it's asking us to use some creative ju-jitsu with dictionaries in various situations.

> 1. Explain how you can use this dictionary to sort in `O(n log n)` time using only the following abstract operations: minimum, successor, insert, search.
Sorting in `n log n` is a pretty common worst-case for good sorting algorithms. It basically means we have to touch all `n` numbers and then divide and conquer the sorting part (which is `log n`). With these 4 functions available, we can sort with something like this: The book does this slightly differently - first it adds all of the numbers to the list, then it sorts it using the `successor()` function, but I'm assuming we already have a list that we can copy into a new one by starting with the `minimum()` object and inserting it, _presorted_, into the new dictionary.

```js
function sort(dict) {
  let newDict = new Dictionary();
  let item = dict.minimum();
  while (item !== null) {
    newDict.insert(item);
    item = dict.successor(item); // null once we step past the last element
  }
  return newDict;
}
```

> 2. Explain how you can use this dictionary to sort in `O(n log n)` time using only the following abstract operations: minimum, insert, delete, search.

In this case, we keep stripping off the smallest item from the old list of `n` numbers. Each time we grab the minimum, we add it into the new dictionary, and then delete it from the old, unsorted dictionary. This is actually a perfect example of what the book refers to as _recursive data structures_. When we remove one item from the dictionary, we are left with just a smaller dictionary. We can exploit this by recursively grabbing the smallest item and adding it to our new dictionary to create a sorted object.

```js
function sort(dict) {
  let min, newDict = new Dictionary();
  while ((min = dict.minimum()) !== null) {
    newDict.insert(min);
    dict.delete(min); // the remaining dictionary is a smaller dictionary
  }
  return newDict;
}
```

> 3.
Explain how you can use this dictionary to sort in `O(n log n)` time using only the following abstract operations: insert and in-order traversal

```js
function sort(n) {
  let newDict = new Dictionary();
  for (let i = 0; i < n.length; i++) {
    newDict.insert(n[i]); // insert the element itself, not its index
  }
  console.log(newDict.inorderTraverse());
  return newDict;
}
```

This is the best I could think of; I actually don't really understand why the book thought this was a good solution. The idea is to copy all of the items into the new list, and then do the sorting by printing out the traversal of all items in-order. This is more congruent with the rest of their sorting algorithms, whereby they insert stuff first, and then handle the sorting part. This isn't really sorting so much as it is taking advantage of using a dictionary like a Binary Search Tree and then reading it back out with an in-order traversal.

## Arrays + Linked Lists = Hash Tables

While dictionaries can be implemented by using arrays or linked lists, we can get significant efficiency improvements by implementing dictionaries using arrays _and_ linked lists. The key to connect those two is to use _hashing_. A _hashing function_ is just a way to translate something into a numeric representation. There is [no one singular good hash function](https://en.wikipedia.org/wiki/List_of_hash_functions) (e.g. [SHA-256](https://en.wikipedia.org/wiki/SHA-2), [BSD checksum](https://en.wikipedia.org/wiki/BSD_checksum), [Rabin fingerprint](https://en.wikipedia.org/wiki/Rabin_fingerprint)), but what matters is that **it's cheap to calculate and it's evenly distributed**. The number from that function can then be used as the index in an array.

```js
class HashTable {
  static ENGLISH_ALPHABET_LENGTH = 26;
  static TABLE_SIZE = 499; // a prime, arbitrarily picked

  hash(str) {
    let sum = 0;
    for (let c = 0, len = str.length; c < len; c++) {
      sum += Math.pow(HashTable.ENGLISH_ALPHABET_LENGTH, len - (c + 1)) * str.charCodeAt(c);
    }
    return sum % HashTable.TABLE_SIZE;
  }
}
```

So that's a sample hashing function to translate letters into really big numbers.
Why are we making the number super big? So there's a high chance that the key (the word that represents something like an array index for our dictionary) we are translating will be a unique number. Now we can expand on our `HashTable` to start implementing our dictionary functions!

```js
class HashTable {
  // assume we have the hashing function and constants
  // that we've already implemented

  constructor() {
    this.table = new Array(HashTable.TABLE_SIZE);
  }

  search(item) {
    let list = this.table[this.hash(item.key)];
    return list ? list.search(item.value) : null;
  }

  insert(item) {
    let idx = this.hash(item.key);
    let list = this.table[idx];
    if (list) {
      list.insert(item.value);
    } else {
      list = new LinkedList();
      list.insert(item.value);
      this.table[idx] = list;
    }
  }

  delete(item) {
    let idx = this.hash(item.key);
    let list = this.table[idx];
    if (list) {
      // delete may have to copy the list
      // in the event the linking is broken
      list.delete(item.value);
    }
  }
}
```

Now you can see how the two data structures make a hash table a great implementation of a dictionary! We get the fast search benefits of an `Array`, with the faster inserts/deletes and safety of a `LinkedList` in the event that multiple strings happen to map to the same hashing function. That means **all of these operations are expected to run in constant `O(1)` time, and `O(n)` in the worst-case**, the difference being the event of a hash collision. If we have no or few collisions, aided by an excellent hashing function, then the linked lists are never filled with more than one item, so search and modification is instant. If we have a bad hash function or our luck is bad (_luck_ meaning that we happen to choose words whose hashing functions all result in the same number) then we'll have to operate on just one long linked list which is no different than the `O(n)` runtimes we calculated for lists.
## Practical applications for hash functions and hash tables

With a `HashTable` implementation in our repertoire, we can start to identify use cases when this data structure is good for us to utilize. In general, **hash tables and hashing functions help speed up algorithms that would take `O(n log n)` or `O(n^2)` and turn them into linear time (`O(n)`) algorithms**.

### Pattern matching strings with Rabin–Karp

_Strings_ are arrays of characters with the order preserved so we can express things like words in a sentence. We can find patterns in our words (like substrings, palindromes, etc.) using searching algorithms that make use of a hashing function. One such algorithm is _Rabin–Karp_.

```js
function rabinKarp(str, pattern) {
  let hashPattern = hash(pattern);
  let patternLength = pattern.length;
  for (let i = 0, len = str.length - patternLength; i <= len; i++) {
    // substring's end index is exclusive, so no -1 here;
    // the full algorithm updates this hash incrementally (a "rolling hash")
    // instead of re-hashing every substring from scratch
    let substring = str.substring(i, i + patternLength);
    let hashString = hash(substring);
    if (hashString === hashPattern) {
      if (substring === pattern) {
        return i;
      }
    }
  }
  return false;
}
```

The idea here is that we take advantage of the fact that our hash functions return unique but predictable numbers for strings of words. That means that **if we find the same big number inside of our string as the hash number of our pattern, it's probably a match**. I say "probably" because as we know from our hash table implementation, there is a slight possibility of hash collisions that could generate false positives. Luckily, this is exceedingly rare as the length of our table is large and a prime number (primes of course can only be divided by 1 and themselves, making them harder to have collisions since other numbers can't divide into the same answer).

### Duplication

Since Rabin–Karp helps us detect patterns within a string, we can apply this algorithm to a variety of real-world scenarios that exploit the fact that duplicate hash values indicate that strings are the same: 1.
**Unique web pages on the web.** If web pages are just one long document of words and links, then the entire text of the HTML document can be represented as one giant number (via hashing, of course). For crawlers like Google, rather than having to slowly crawl all of the words of a document to verify it is unique, it just has to see if the hash value is in its mega list of hashes. If it isn't, we know we have a new web page to add to the crawler. 2. **Plagiarism.** Copying text can be easy to detect because the hashes should be the same, just like crawling web pages. The only problem is adding in a few filler words, some spaces, different formatting… with a few small changes you can create a unique hash and make it impossible to compare one document to the next. Instead, we can leverage pattern matching by scanning the document in chunks and seeing if the vast majority of sub-hashes are the same. This is obviously a downside because it requires significantly more space to store and compare those hashes, but this is still smaller than having to search against every single word between the two documents. 3. **Save when changed.** Just as we know that a hash function can numerically represent an entire HTML document, we can hash an entire file with a cryptographic hash to ensure that the document has not been modified by an unauthorized third party. You ever see messages from your laptop warning you that "this third-party application has not been signed?" This is exactly what this is for! Developers send over a hash with their saved changes to prove they were the last ones to touch their programs. They then send those hashes either to app stores or directly to their customers to let them know that their programs haven't been hacked by outside developers, ensuring that their programs can be safely run on other machines.
As you can see, hashing, whether with a table or as a simple function, is a very practical mechanism for dealing with human language data in the form of strings. Strings are the fundamental formal translation of words into arrays, and the hashes are the translators that turn these characters into numbers. This makes dealing with lots of string data very efficient when we hash them out into numeric data. ## The next Daily Problem (4-3) This round our Daily Problem is back to coming directly from the book, with a preview into our next segment, sorting: > Take a sequence of `2n` real numbers as input. Design an `O(n log n)` algorithm that partitions the numbers into `n` pairs, with the property that the partition minimizes the maximum sum of a pair. For example, say we are given the numbers `(1,3,5,9)`. The possible partitions are `((1,3),(5,9))`, `((1,5),(3,9))`, and `((1,9),(3,5))`. The pair sums for these partitions are `(4,14)`, `(6,12)`, and `(10,8)`. Thus the third partition has 10 as its maximum sum, which is the minimum over the three partitions. ]]> 2018-09-13T10:28:00+00:00 Dictionaries and Binary Search Trees in JavaScript https://www.adamconrad.dev/blog/dictionaries-in-js/ Tue, 11 Sep 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/dictionaries-in-js/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/basic-data-structures-in-js/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202). ## Answers to the Daily Problem This problem basically asked us to combine what we learned from [Big O Notation](/blog/how-to-analyze-js-functions/) and apply that to one of the fundamental [data structures](/blog/basic-data-structures-in-js/) we saw in the last article. 
> For each of the four types of linked lists (singly unsorted, singly sorted, doubly unsorted, doubly sorted), what is the asymptotic worst-case running time for each dynamic-set operation listed?

If you've been following along with this series, this should be pretty easy to figure out.

> a) Search(List, item)

Drawing on our playpen slide tube analogy, we know that we can only enter our `LinkedList` from either ends of the slide. The worst-case is that we enter the wrong end of the slide, and our target is all the way at the other end. So if we have to go through the whole list of `n` items, that means **our worst-case runtime for search is `O(n)` across all 4 kinds of lists**.

> b) Insert(List, item)

One of the benefits we mentioned in the previous article was that insertions and deletions were simpler because we can just tack on stuff to the ends of the list. Since we always have a pointer to the `head` of any kind of linked list, it only requires **constant-time access (`O(1)`) to insert an item onto any unsorted linked list**. Why does sorting hurt us? Because once the list is sorted, _we can no longer tack on the item to the beginning of the list_. Now that the list has some logical ordering behind it, we have to maintain that ordering, which could mean having to scan the entire list to find the exact right place to insert our new item, making **insertions a linear-time runtime (`O(n)`) for sorted linked lists.**

> c) Delete(List, item)

To limit confusion, let's right off the bat say **doubly-linked lists can implement this `O(1)` runtime because they have access to both the previous and next node**. Since you need both to connect them, and thus delete the active `item` by disconnecting it from the list, we're done here. However, for singly-linked lists, this one is interesting because it depends on the algorithm you want to implement.
If you want to delete the item _after_ `item`, this is a straightforward **`O(1)` worst case for delete**, since you just connect `item.next` to the next of the next. But if you want to delete the `item` provided, you need to find the previous node, and the only way to do that is to start from the beginning and work your way down until you see that the current node's `next` is equal to the `item` you specified. Only then can you connect `current` to the `next` of the next node, meaning **in the worst case, deletes can take up to `O(n)` for singly-linked lists**.

> d) Successor(List, item)

This function just asks you to find the next logical item in the list. Logical in this case means that if you have a list that would be represented by an array as `[1,2,3]`, the _successor_ to `2` is `3`. Since logical ordering is based on sort order, this is entirely dependent on whether the list is sorted, and has nothing to do with being singly- or doubly-linked, since both of those implement a `next` property. **For sorted lists, this is just an access to `next` and runs in `O(1)`, while unsorted lists may require viewing the whole list to find the logical next item, and thus run in the worst case in `O(n)`**.

> e) Predecessor(List, item)

Finally, we have something unique to tackle! This function is the first place we see fragmentation between singly- and doubly-linked lists. **For sorted doubly-linked lists, this, too, is `O(1)` because the previous node is a property inherent on the list object, and property access happens in constant time**. This doesn't apply to unsorted doubly-linked lists, because like our quandary with question d, it could take viewing the entire list to find our logical predecessor, and therefore **unsorted doubly-linked lists take `O(n)` to run the predecessor function.** For singly-linked lists, this presents a problem. If you only ever have access to `next`, how can you possibly know what the `prev` item is?
Another way to look at it is to flip the script: rather than looking at the predecessor as the previous node of the current item, we instead want to look for _the previous node's next node_. That works because we know that the previous node's next node is the same as the current node (`item`) that we're passing in! But to do this we have to start from the very beginning of the list. And if our current `item` is all the way at the end of the list, we'll have to search the entire list of `n` items until we find the predecessor, therefore **singly-linked lists can calculate the predecessor in `O(n)` time in the worst case, regardless of whether or not they are sorted**.

> f) Minimum(List)

Here we just want to find the minimum item in the list. Whether that's the lowest ordered letter in a list of letters in the alphabet, or the lowest number in the list, we want to find the minimum value. Now it should start to make sense why there were 4 list types, with one grouping centered around sorting. When the list is not sorted, you really have no idea where the minimum item could be, so **in the worst-case scenario you have to search the entire list to find the minimum item, which is `O(n)`**. However, if the list is sorted, you know that it is going to be at the very start of the list, which you already have access to in both singly- and doubly-linked lists, and therefore **the minimum value can be found in a sorted linked list in constant `O(1)` time**.

> g) Maximum(List)

Finding the maximum is identical to finding the minimum, but not exactly for the reasons you think (again, think of the ends of a slide). The obvious part is that **unsorted lists require searching the entire list, so the maximum is found in `O(n)` time.** The other obvious connection should be that **sorted doubly-linked lists have access to both the first and last items, so the maximum can be found in `O(1)` constant runtime**.
So why is **maximum access found in constant time (`O(1)`) for sorted singly-linked lists**? When we add and delete items from a sorted singly-linked list, we can use that as an opportunity to update a property pointing to the `last` node in the list. By paying that small storage cost in those functions, we get instant access to the last node, which in the case of a sorted list also happens to hold the maximum value. Wow! That's a big _list_ of problems to solve! **As a thought exercise, see if you can't extend this table to include runtimes for sorted and unsorted arrays as well.**

## Thinking about dictionaries

**A good analogy is to think of the data structure dictionary as the eponym of the physical book dictionary.** Now I'm going to give myself some credit here because I think this is pretty clever: I purposefully chose a fancy word like _eponym_ because if you don't know what that means, YOU'RE GOING TO LOOK IT UP IN A DICTIONARY. And when you look up that word at dictionary.com or whatever your favorite word site is, how do you search for the word? _By its name_. **Dictionaries operate exactly the same way as dictionary books do because searching for content happens by looking up the actual name of the content itself.** Now you know how a `search` method would be implemented. What about `insert` and `delete`? Well, how is a word and its definition laid out in a dictionary that you read?

> **eponym** noun _ep·onym | \ ˈe-pə-ˌnim_ : one for whom or which something is or is believed to be named

The name of the word is the **key** that you've looked up, and the definition of the word is the **value**. So whenever you want to add or remove a word in a dictionary, you _see if the key is available, and if it is, you add/remove the definition_.
As we saw from the Daily Problem, there are some additional (optional) methods you could add to a dictionary as well, such as `min`/`max` for quickly moving to the beginning/end of the dictionary, and `successor`/`predecessor` for moving forward/backward between keys/words (not necessarily forward/backward in memory), very similar to the `next` method we implemented for a linked list. We went into these pretty in-depth at the beginning, so suffice it to say we've thoroughly covered the 7 major methods for implementing dictionaries.

```js
let dictionary = {};

// insert
dictionary["eponym"] = "one for whom or which something is or is believed to be named";

// search
dictionary["eponym"]; // => "one for whom or which something is or is believed to be named"

// delete
delete dictionary["eponym"]; // afterward, no different than dictionary["apple"]: both are undefined
```

## Improving dictionary methods with Binary Search Trees

**Binary Search Trees (BSTs)** are a clever data structure that implements all 7 dictionary methods and takes all of the best parts of the dictionary runtimes. The result is a data structure that efficiently supports all of these functions. BSTs consist of a _root_ or start node and trickle down into two other _child_ nodes: the _left_ and the _right_ child. The _left_ node is always less (in value) than the current root node, and the _right_ node is always greater (in value) than the current root node. We can also have an optional _parent_ node reference which tracks in the opposite direction. This organization is what allows us to quickly and easily find any value in the tree. **The best way to think of BSTs is to imagine [Lady Justice](https://en.wikipedia.org/wiki/Lady_Justice) holding her balance scale with a heavier right-handed side.** Why the right side?
Since the right node is the larger item, if you imagine the tree holding numeric values, and those values represented the mass of the objects on the scale, you know that the higher-valued item will weigh more:

```
  6
 / \
4   8
```

If that right there is her scale, then you can clearly see that 8 weighs more than 4, so the scale will tip to the right in favor of the item that weighs more. An elaborate example? Maybe. An example you will remember? _Definitely._

```js
class Node {
  constructor(value) {
    this.value = value;
    this.left = null;
    this.right = null;
    this.parent = null; // optional
  }
}

class BinarySearchTree {
  constructor() {
    this.root = null;
  }

  search(item) {
    let curr = this.root;
    while (curr && item.value !== curr.value) {
      if (item.value < curr.value) {
        curr = curr.left;
      } else {
        curr = curr.right;
      }
    }
    return curr; // null if the value isn't in the tree
  }

  insert(item) {
    let curr = this.root;
    if (!curr) {
      this.root = item;
      return true;
    }
    while (curr) {
      if (item.value < curr.value) {
        if (curr.left) {
          curr = curr.left;
        } else {
          curr.left = item;
          break;
        }
      } else if (item.value > curr.value) {
        if (curr.right) {
          curr = curr.right;
        } else {
          curr.right = item;
          break;
        }
      } else {
        return false; // ignore duplicate values
      }
    }
    return true;
  }

  maximum() {
    let max = this.root;
    if (!max) return null;
    while (max.right) {
      max = max.right;
    }
    return max.value;
  }

  minimum() {
    let min = this.root;
    if (!min) return null;
    while (min.left) {
      min = min.left;
    }
    return min.value;
  }

  preorderTraverse(item) {
    if (item) {
      console.log(item.value);
      this.preorderTraverse(item.left);
      this.preorderTraverse(item.right);
    }
  }

  inorderTraverse(item) {
    if (item) {
      this.inorderTraverse(item.left);
      console.log(item.value);
      this.inorderTraverse(item.right);
    }
  }

  postorderTraverse(item) {
    if (item) {
      this.postorderTraverse(item.left);
      this.postorderTraverse(item.right);
      console.log(item.value);
    }
  }

  traverseTree() {
    this.inorderTraverse(this.root);
  }
}
```

**Question: Given our practice with Big O Notation, can you figure out the runtime of the methods within this sample `BinarySearchTree` class?** **Extra Credit: We never filled
in the `delete(item)` method for BSTs. To be honest, they're kind of tricky. How do you think you would implement one?** Would love to see your answers and sample gists [on Twitter](https://twitter.com/theadamconrad).

## Balanced BSTs and the next Daily Problem

While BSTs are very good data structures most of the time, they can still be improved. **_Balanced_ BSTs will automatically rebalance nodes whenever new items are inserted into or deleted from the tree.** For example, if you keep adding the natural numbers (1,2,3,4,5…) to a BST, it will be effectively the same thing as a linked list, with the height of the unbalanced tree _`h`_ equal to the number of items _`n`_:

```
1
 \
  2
   \
    3
     \
      4
       \
        5
```

But if you _balance_ the tree so that we try to maximize as many _left_ and _right_ nodes being used as possible, the end result of a rebalance would look like this:

```
    3
   / \
  2   4
 /     \
1       5
```

This time, the remaining items are halved at every level as we descend, and we know that if we are successively halving something, it's happening in a logarithmic fashion. Therefore **the height of the balanced BST _`h`_ is equal to _`log n`_**. To get some more practice dealing with this concept for balanced BSTs, the Daily Problem focuses on them:

> You are given the task of reading in `n` numbers and then printing them out in sorted order. Suppose you have access to a balanced dictionary data structure, which supports each of the operations search, insert, delete, minimum, maximum, successor, and predecessor in `O(log n)` time.
>
> 1. Explain how you can use this dictionary to sort in `O(n log n)` time using only the following abstract operations: minimum, successor, insert, search.
>
> 2. Explain how you can use this dictionary to sort in `O(n log n)` time using only the following abstract operations: minimum, insert, delete, search.
>
> 3.
Explain how you can use this dictionary to sort in `O(n log n)` time using only the following abstract operations: insert and in-order traversal.

Think you've got the answers? We'll find out in two days with the next article! And for more practice, check out these problems from the book:

1. 3-10
2. 3-11

]]> 2018-09-11T10:28:00+00:00 Basic data structures in JavaScript https://www.adamconrad.dev/blog/basic-data-structures-in-js/ Thu, 06 Sep 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/basic-data-structures-in-js/

This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/how-to-analyze-js-functions-with-big-o/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202).

## Answers to the Daily Problem (2-20)

So this one was honestly a bit confusing to me. The question reads:

> Find two functions `f(n)` and `g(n)` that satisfy the following relationship. If no such `f` and `g` exist, write “None.”

The problems are odd because we don't really deal with Little O, which stands for "grows strictly smaller than." Little O is like a stricter version of Big O. This just isn't something you'd have to worry about in the professional world, either as a working developer or in an interview setting. It's also not a topic that is covered very much in most algorithms books. But let's give it a try anyway.

> (a) f(n) = o(g(n)) and f(n) != Θ(g(n))

So this is saying `f(n)` should grow strictly slower than `g(n)`, _and_ should not be tightly bound to it, meaning it can't satisfy both `O(g(n))` and `Ω(g(n))` at the same time. Basically, we want an `f(n)` that is strictly better than `g(n)`, so a simple pair would be **`f(n) = n + 1, g(n) = n^2 + 1`**.
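We can sanity-check a pair like this numerically (a quick sketch, not a proof): if `f(n) = o(g(n))`, the ratio `f(n)/g(n)` should shrink toward 0 as `n` grows, whereas a `Θ` pair would keep the ratio pinned between fixed positive constants. The `ratio` helper here is my own illustration, not from the book:

```js
// Compare f(n)/g(n) at growing values of n.
// If the ratio heads toward 0, f grows strictly slower than g (Little O).
const f = (n) => n + 1;
const g = (n) => n ** 2 + 1;
const ratio = (n) => f(n) / g(n);

for (const n of [10, 1000, 100000]) {
  console.log(n, ratio(n));
}
// The printed ratios shrink toward 0, consistent with f(n) = o(g(n)).
```

Run the same check on a `Θ` pair like `f(n) = 2n` and `g(n) = n` and you'll see the ratio settle at a constant instead.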
> (b) f(n) = Θ(g(n)) and f(n) = o(g(n)) This one should be **none**, and the reason why is that `Θ` is all about being tightly-bound, whereas the little letters (Little O and Little Omega) are all about being loosely-bound. You can't have both, so the answer is none. > (c) f(n) = Θ(g(n)) and f(n) != O(g(n)) This is also **none**, because the definition of `Θ(g(n))` assumes that you satisfy both the Big O and the Big Omega. > (d) f(n) = Ω(g(n)) and f(n) != O(g(n)) This one should be fairly straightforward. We want something that satisfies Big Omega and not Big O, meaning we don't want a Big Theta set of functions. **`f(n) = n^2 + 1, g(n) = n + 1`** should satisfy this (essentially, the inverse of problem A). ## Data structures as blocks of memory One way to group data structures is by how they allocate memory. I will admit, this was not an intuitive way for me to think about elementary data structures, but it's a good thing to keep in mind. There are two major forms of data structures by memory allocation: ### 1. Continuous Continuous-memory data structures are those that are linked as one block of memory. So when a computer allocates memory in RAM for this structure, it will allocate one block so it can be accessed as one block. The computer won't have to use pointers to find chunks of addresses in order to make sense of all of the data for this one structure. There are lots of data structures of this type we'll cover later, but the only fundamental continuous data structure you need to be aware of is the **array**. #### Arrays Arrays are _the_ fundamental continuous data structure that everyone knows. In JavaScript, they have a `typeof === 'object'` but you can distinguish them with `[] instanceof(Array)`. **Arrays are like [cubbies](https://www.google.com/search?q=cubbies&source=lnms&tbm=isch&sa=X&ved=0ahUKEwj8qP_r1O3dAhWwTt8KHViDD1wQ_AUIDygC) at school.** A cubby is one big piece of furniture with slots to store things like hats, coats, and backpacks. 
Like arrays, they store stuff, and they are one object. And like cubbies, JavaScript arrays can store anything that fits; we don't have any strict types dictating how our arrays can be allocated, as in statically-typed languages.

##### Advantages

* **Constant-time access to elements.** If you provide the _index_, or numeric address inside of the array, we know exactly where the value of that item is without any sort of transformation. That makes array access extremely fast.
* **Space efficiency.** If you only have to take up one block of memory, that's more efficient than if you have to take up lots of discontinuous, fragmented chunks of memory.
* **Colocality.** Building on the previous benefit, because it's one block of memory, that also means that all of the values of the array are located next to each other, making access via traversal also very easy. This allows computers to make use of their caches, since we can easily plop the whole array onto (and off of) the cache.

##### Disadvantages

* **Fixed size.** The space efficiency comes at a cost: we can't modify the size of the array later on. If we allocate for 5 items, that's it. There are two ways to mitigate this: _buffer space_ and _dynamic arrays_. With buffer space, we can leave extra empty room for the array to grow. So let's say you allocate an array for 5 items; we can buffer in 5 more for a total of 10 allowable items. The obvious downside here is the wasted space reserved in anticipation of growth we may never use. The other option is to use dynamic arrays. Dynamic arrays will allocate double the space of the current size of a filled array and copy its contents over to that newly-allocated block. It will then free up the old space that was allocated for the original array. This is like a smart version of buffer space that only adds in the buffer when the array is filled.
It still suffers from the same downside of potentially wasted space, plus the additional operational cost of allocation and deallocation. This cost still works out to `O(n)` in total because the reallocation work forms a convergent geometric series (i.e. each doubling copies twice as many slots but happens half as often), which in the end sums to `O(n)` across `n` insertions. It also can't guarantee that every insertion happens in constant time, since an insertion that lands on a boundary and doubles the array will take `O(n)` time.

```js
var myArray = [1, 2, 3];
```

### 2. Linked

Linked data structures, as you might guess, differ from continuous data structures because they are distinct blocks of memory _linked_ together to form what looks like one cohesive structure. The linking mechanism is called a _pointer_. A pointer acts as a reference to an address in memory. So while your house is the thing at 742 Evergreen Terrace in Springfield, it's the address "742 Evergreen Terrace" that serves as the pointer to your house. In programming, **we make heavy use of pointers by passing the value of the pointer around in languages like C and C++.** In JavaScript, however, no such functionality exists. Instead, **we pass object references around in JavaScript**: a variable holds a reference to the object, and passing it copies the reference, not the object itself or a raw memory address. In this article we'll cover several of these kinds of fundamental linked data structures, including:

#### Linked lists

I have memories/nightmares of my very early college days doing phone screens for my internships with questions like:

> Tell me how to reverse a singly-linked list in place.

It's funny, looking back now it actually _is_ a good question to ask a sophomore in college, because he has no real professional experience to lean on to reply with something snarky like:

> Why would I know how to do that?
My list library already has the optimal algorithm.

Anyway, back to the task at hand. I like to **think of linked lists like [those colorful play slides](https://sc02.alicdn.com/kf/HTB1VYtyHpXXXXX3aFXXq6xXFXXXK/200823934/HTB1VYtyHpXXXXX3aFXXq6xXFXXXK.jpg) you see on playgrounds**. There's one way to enter: at the top (or the start of the list). You can't see what's inside the slide/list until you slide down/traverse the list. And each piece of the slide/list is a different color/value. And the only way to get from one section/item to the next is to slide down the slide/iterate to the next item. And for doubly-linked lists, those are like the crazy kids who would get to the bottom of the slide and dare to climb back up inside to the top. Sure, they're kind of crazy and mildly stupid (the climbers, not the lists), but the reward for the extra work (of climbing or implementing the extra functionality) is increased flexibility and exploration. Now obviously there is more of an implementation cost to doubly-link a list, but this cost is negligible, so it's basically always worth implementing.

##### Advantages

* **Memory is dynamically allocated.** Each time we add a `LinkedList` item to the end of the list, we allocate some more memory. No more and no less than exactly what we need. This is advantageous compared to arrays because of that space efficiency.
* **Insertion and deletion are simpler.** Since list manipulation operates on the ends of the list, which we have direct access to, these operations are extremely simple, as in, constant-time (`O(1)`) simple.
* **Sorting/rearranging lists is simpler, too.** Since the chain between list items is just a pointer to the next item, we don't have to worry about physically changing the memory space to rearrange the list. All we have to change is who is pointing to whom. This is particularly beneficial for really large objects because those objects do not have to move.
Moving and finding really large blocks of memory is far more difficult and costly than moving pointers, which are of a very tiny fixed size.

##### Disadvantages

* **More overall space.** You not only need space to store your value, but also to point to your other list items.
* **Worse colocality and caching performance.** Since there is no requirement that all of your `LinkedList` items be part of the same memory block, the list is much harder to cache: the items could theoretically be scattered anywhere in your RAM, so there is no single block to plop into the cache.
* **Random access is slower.** You can't just pluck an item out with an index. Remember, this is like a slide. The only way in or out is from the ends; there is a barrier around the middle items of the list. Only from the ends can you make your way to the section of the list that you desire.

```js
class Node {
  constructor(value) {
    this.value = value;
    this.next = null;
  }
}

class LinkedList {
  constructor() {
    this.head = null;
  }

  delete(value) {
    let prev = null;
    let curr = this.head;
    while (curr && curr.value !== value) {
      prev = curr;
      curr = curr.next;
    }
    if (!curr) return; // value not in the list
    if (!prev) {
      this.head = curr.next; // deleting the head
    } else {
      prev.next = curr.next;
    }
  }

  search(value) {
    let curr = this.head;
    while (curr && curr.value !== value) {
      curr = curr.next;
    }
    return curr; // null if not found
  }

  insert(item) {
    item.next = this.head;
    this.head = item;
  }
}
```

#### Stacks and queues

Once you've grasped linked lists, stacks and queues are pretty easy to understand. **Stacks and queues are just specialized linked lists.** Technically, both of these can be implemented using a linked list _or_ an array. It is arguably easier to make a queue with a linked list and a stack with an array because of the way the insertion and deletion functions are implemented respectively for arrays and lists.

##### Stack

**A stack is exactly what you think it looks like.** A stack of pancakes. A stack of cards. A stack of whatever. When you throw pancakes onto a plate, what's the pancake you see at the top? The _last_ one you put on the plate.
When you start eating the pancakes, what's the first one you dig your fork into? _Also the last pancake_. This actually has a name: **Last-In-First-Out, or LIFO**. In the jargon of computer science, really the only difference between a stack and a regular old linked list is that `insert` is now called `push`, and `delete` is now called `pop`. In fact, you may recognize that because [those are methods on Array.prototype](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array#Methods_2). That's right, you could actually build a very simple `Stack` class by limiting the scope of an `Array`'s modification to just `push()` and `pop()`. Why would you need one over an array? Stacks are an ideal data structure for handling depth-first search. Just like you dig deeper into that stack of pancakes, so should you think about using stacks with DFS. We'll get more into search algorithms in a later article.

##### Queue

**A queue is also an aptly-named structure for the thing you do when you wait in line at the deli or the DMV.** What's that process like? You take a ticket, it has some number on it, and you wait until your number is called. When you are waiting in line, how do they process the orders? From the lowest ticket number to the highest, or in other words, the order in which people entered the queue. This, too, has a name: **First-In-First-Out, or FIFO**. Again, this is just a specialized linked list with different names for `insert` and `delete`. Now they are called `enqueue` and `dequeue`. JavaScript has [these methods on Array.prototype](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array#Methods_2) as well, but they are called `push` and `shift` (or `unshift` and `pop`, depending on which end you treat as the front). No, that's not a typo, `push` is used twice on purpose.
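To make that concrete, here is a minimal sketch of both wrappers (the class and method names are my own illustration, not from the book): a `Stack` that only exposes `push`/`pop`, and a `Queue` whose `enqueue`/`dequeue` delegate to `push`/`shift`:

```js
class Stack {
  constructor() {
    this.items = [];
  }
  push(item) {
    this.items.push(item); // insert at the top
  }
  pop() {
    return this.items.pop(); // remove from the top (LIFO)
  }
}

class Queue {
  constructor() {
    this.items = [];
  }
  enqueue(item) {
    this.items.push(item); // join the back of the line
  }
  dequeue() {
    return this.items.shift(); // serve the front of the line (FIFO)
  }
}

const s = new Stack();
s.push(1);
s.push(2);
console.log(s.pop()); // 2 (last in, first out)

const q = new Queue();
q.enqueue(1);
q.enqueue(2);
console.log(q.dequeue()); // 1 (first in, first out)
```

Note that `shift` on a plain array can be `O(n)` since the remaining elements move down a slot, which is exactly why a linked list is the more natural backing structure for a queue.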
With queues, think of it like the tickets: when a number is called at the deli, we're taking the oldest ticket with the lowest number, which is at the _front_ of the queue, or the beginning of the list. When the list was empty, we added to the end of it. As more people are added to the list, we keep adding them to the end of the queue. But when the person behind the deli counter calls our number, where are we in line? _Right in front_, so we don't remove from the end of the list, _we remove from the beginning_. Why would you need one over a linked list? Queues are an ideal data structure for handling breadth-first search. Just like people shuffle across the floor when waiting at the deli counter, so should you think about using queues with BFS. Again, we'll save search algorithms for later. Okay, I think that's enough for now. Next article we'll cover dictionaries, but right this moment we'll move onto the Daily Problem.

## The next Daily Problem

This one is not in the book, so I'll just transcribe it here:

> For each of the four types of linked lists (singly unsorted, singly sorted, doubly unsorted, doubly sorted), what is the asymptotic worst-case running time for each dynamic-set operation listed?
>
> a) Search(List, item)
>
> b) Insert(List, item)
>
> c) Delete(List, item)
>
> d) Successor(List, item)
>
> e) Predecessor(List, item)
>
> f) Minimum(List)
>
> g) Maximum(List)

Pretty straightforward: for the functions that operate on a linked list, give the Big O time complexity. If you've followed along with this and the previous articles, then this question should take very little time to answer. And for more practice, check out these problems from the book:

1. 3-2
2.
3-4

]]> 2018-09-06T10:28:00+00:00 How to analyze JavaScript functions with Big O https://www.adamconrad.dev/blog/how-to-analyze-js-functions-with-big-o/ Tue, 04 Sep 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/how-to-analyze-js-functions-with-big-o/

This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/big-o-notation/), check that out first as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202).

## Answers to the Daily Problem (2-22)

So for our question today, we're trying to decide which of the asymptotic notations we're working with when comparing two functions:

> For each of the following pairs of functions `f(n)` and `g(n)`, state whether `f(n) = O(g(n))`, `f(n) = Ω(g(n))`, `f(n) = Θ(g(n))`, or none of the above.

It's just a fancy way of saying, "we have a function `f(n)`; is `g(n)` an upper bound, a lower bound, or a tight bound for our function?" Let's take a look:

> (a) `f(n) = n^2 + 3n + 4, g(n) = 6n + 7`

I'm going to offer you the trick I used. As was mentioned previously, we can reduce `g(n)` down to just the highest-powered term without the constants. So that means I'm really just asking whether `n^2 + 3n + 4` is `O(n)`, `Ω(n)`, or `Θ(n)`. For `f(n)`, the growth can't be any lower than `n^2`, so `O(n)` is out. We also know that if Big O can't work, then neither can the tight-bound theta (since theta requires that _both_ the upper and lower bounds apply). But `f(n)` certainly grows at least as fast as `n`, given `g(n)` has an `n` in it, so we can safely say that **`f(n) = Ω(g(n))`** is the correct answer here.
> (b) `f(n) = n√n, g(n) = n^2 − n`

This one is a bit trickier, but we can use a neat trick for finding the powers for comparison. `√n` is really the same thing as `n^0.5`, and when we multiply that with `n`, we're really just saying that `f(n) = n^1.5`. So now we're really just comparing `n^1.5` to `n^2`, which means that `f(n)` clearly has the lower-order power compared to `g(n)`, so **`f(n) = O(g(n))`** is the correct answer.

> (c) `f(n) = 2^n − n^2, g(n) = n^4 + n^2`

Applying the trick here again, we have an exponential function on the left and a polynomial function on the right. Polynomial functions are lower on the [dominance relation](http://bigocheatsheet.com/) than exponentials, so we can say that **`f(n) = Ω(g(n))`** is the correct answer here as well.

## Big O Analysis by example

Now that we've gone over some hypothetical examples of counting computational operations, let's actually apply this to real-world algorithms in JavaScript. The most applicable form of Big O Notation in a professional setting would be if you were asked to evaluate the time complexity of the algorithm you've just written on the whiteboard/typed on your computer. In the following examples, we'll provide the JavaScript code for some algorithm, and then we'll do a Big O analysis line-by-line to show you how we can provide an `f(n) = O(g(n))`.

### Sorting items in an array with Selection Sort

Selection sort is an algorithm that sorts an array by _selecting_ the lowest remaining value and putting it at the end of the sorted portion (the sorted portion is allocated to the front of the array).
It looks something like this:

```js
function selectionSort(arr) {
  let min;
  const arrLength = arr.length;
  for (let i = 0; i < arrLength; i++) {
    min = i;
    for (let j = i + 1; j < arrLength; j++) {
      if (arr[j] < arr[min]) min = j;
    }
    [arr[i], arr[min]] = [arr[min], arr[i]];
  }
  return arr;
}
```

If the swap line near the end seems weird to you, this is how you can [swap variables using ES6 destructuring](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Destructuring_assignment#Swapping_variables). This is also a great way to illustrate the appropriate uses of `let` and `const`. `let` makes sense for variables like `min`, `i`, and `j` because they are being modified over and over again. `const` makes sense for `arrLength` because the length of the array never needs to change, so we can keep it constant. So how do we know how much this will cost in terms of Big O time complexity? Remember the 3 rules from the previous article:

> 1. **Add 1 for simple operations**.
>
> 2. **Add `n` for looping over a set**.
>
> 3. **Whatever your calculations were for 1 and 2, multiply them if the algorithm is called again recursively**.

So let's go through this line-by-line, ignoring lines for whitespace and closing brackets because those are just syntax and are not evaluated: The first line is the function declaration, and while that requires computation, we're evaluating what is _inside_ of the function, so we can skip this line. The next two lines are variable declarations, which apply rule 1, and take a constant amount of time (this includes the `length` property access on the array). So let's call our total `2` right now. Now we've hit our first `for` loop. How many elements are we iterating over? From zero up to the length of the array, which we can mark as `n` since that's been the variable we've stuck with for all of our Big O calculations thus far. This brings our total up to `n + 2`.
The next line brings another variable assignment, bringing our total up to `n + 3`. Now we've hit another `for` loop. But this time it's _inside_ the other `for` loop. How would that work? Imagine you have 5 cards in your hand, each numbered 1 through 5. Take the first card and add it up against every other card, including itself. How many additions have you made? It should be 5. Now do that for the second card, adding up the other four cards plus itself. The number of additions? Still 5. If you keep repeating this, each card has interacted with each other card, for a total of 25 operations. 25 = 5*5, so if we generalize this for `n` cards, the total is `n*n = n^2`. Therefore, our nested `for` loop is _multiplying_, not adding onto, our outer `for` loop. Since our outer `for` loop thus far has a variable declaration and the loop itself, we're looking at a total of `(n-1)*(n+1) + 2`. Why `n-1`? Because we declare `j = i+1`, so the inner loop will only ever see, at most, 1 less than the entire array. Next, we have a bunch of things all placed onto one line. We have a conditional statement, some variable access in the array, and finally a variable assignment (if the condition holds). This part gets a bit tricky. We know that all of these items add a constant amount of cost in terms of computation. The problem is that with a conditional assignment (like our `if` statement), in the worst case the condition holds on every pass, so the assignment runs once for each of the remaining items in the array on each pass. That sum is equal to `(n-1) + (n-2) + ... + 2 + 1`, which is equal to `n(n-1) / 2`, bringing the total to `(n-1)*(n+1) + n(n-1)/2 + 2`. Next, we have our JavaScript destructuring, which is acting as a sort of swap function. Since these are just assignments of two variables, we know this is just a constant addition to our operations total within the outer `for` loop, bringing us to `(n-1)*(n+3) + n(n-1)/2 + 2`.
Finally, we return the array to display its newly-sorted self. Referencing a variable is just as costly as assigning one in terms of computation, and now that we're back in the base block of our function, we've completed our cost analysis with a grand total of `(n-1)*(n+3) + n(n-1)/2 + 3`. When we multiply everything out, we get `n^2 + 2n - 3 + 0.5n^2 - 0.5n + 3`, which simplifies to `1.5n^2 + 1.5n`. So coming back to the actual asymptotic analysis for O, if we say that `f(n) = 1.5n^2 + 1.5n` and we apply all of the same simplification techniques when evaluating this to be equal to `O(g(n))`, we're left with the largest power, which is just **`O(n^2)`**. ### Matrix multiplication Did that feel like a lot of work? It should! Because with this next example, we're going to _drastically_ simplify our analysis, even though the algorithm is more complicated. Matrix multiplication means taking two 2D arrays and multiplying them together to produce a new matrix of cell values. If one 2D array is of dimensions `x,y` and the other is `y,z` (they have to have at least 1 dimension in common, or else you can't actually perform multiplication on the two), we return an array of dimension `x,z` using the following algorithm:

```js
function multiplyMatrices(arr1, arr2) {
  const x = arr1.length;    // rows of arr1
  const y = arr2.length;    // rows of arr2, which must equal the columns of arr1
  const z = arr2[0].length; // columns of arr2

  if (y !== arr1[0].length) {
    throw new Error("Matrix cannot be multiplied");
  }

  // initialize the x-by-z result matrix
  const matrix = [];
  for (let a = 0; a < x; a++) {
    matrix[a] = [];
  }

  for (let i = 0; i < x; i++) {
    for (let j = 0; j < z; j++) {
      matrix[i][j] = 0;
      for (let k = 0; k < y; k++) {
        matrix[i][j] += arr1[i][k] * arr2[k][j];
      }
    }
  }

  return matrix;
}
```

Much more complex, but with this trick, we can calculate this very quickly: **scan for nesting and multiply**. The first few lines are constant operations, so the worst case so far is `O(1)`. Our first `for` loop runs `x` times, so now we know the worst case is `O(n)`.
We have a 2nd `for` loop where the multiplication starts, but since it's at the same nesting level as the first one, we're still at a worst case of `O(n)`. The next `for` loop is inside of the other one, and as we saw earlier, those get multiplied, so now our worst case is `O(n^2)`. Finally, we have a 3rd `for` loop nested inside the other two, so multiplying again, we now have a **worst case of `O(n^3)`**. We can skip the `return` statement, since we know that's a constant and we've already determined this algorithm is far worse than constant time. Even with all of the initialization, error handling, and matrix math, by simply scanning the function for nesting, we were able to determine very quickly that the time complexity of this function is cubic. ## A bit on why logarithms are special Remember the example about continuously splitting a deck of cards in half until we reach our target card? This is an example of a logarithmic function. **A logarithmic function is the inverse of an exponential function.** That means it tells us "how many times I need to double something to get to some number _n_." Conversely, it also tells us "how many times I need to cut something in half to get down to 1." If you're dealing with a binary search (i.e. a choose-your-own-adventure where you either go left or right until you reach a dead end), or you're climbing down the levels of a tree until you hit the leaves, then you know you're dealing with a Big O of `O(log n)`. The base of the log (meaning base 2 if you have 2 options, as in a binary search, or base 3 if you have 3 options, as in a ternary search, and so on) doesn't matter. This is just like how the constant coefficients of a quadratic function don't matter in Big O analysis. All we care about is the fact that, in general, we're dealing with a logarithmic growth rate for our Big O analysis. ## The next Daily Problem (2-20) You made it! Big O analysis isn't so bad if you apply some quick scanning tricks.
To practice today's work, here's the next Daily Problem:

> Find two functions `f(n)` and `g(n)` that satisfy the following relationship. If no such `f` and `g` exist, write “None.”
>
> (a) f(n) = o(g(n)) and f(n) != Θ(g(n))
>
> (b) f(n) = Θ(g(n)) and f(n) = o(g(n))
>
> (c) f(n) = Θ(g(n)) and f(n) != O(g(n))
>
> (d) f(n) = Ω(g(n)) and f(n) != O(g(n))

## More practice

If you're looking for more [deliberate practice](https://www.redgreencode.com/deliberate-practice-for-software-developers/) with these articles, the _Algorithms_ class we're following includes homework sets, and the [first one](http://www3.cs.stonybrook.edu/~skiena/373/hw/hw1.pdf) just came out today. Since I'm auditing this course (and so are you, by definition, if you're reading my articles instead of actually taking the course), I'll include the homework problems that are covered thus far. Only once the assignment answer keys are out will I spend any time (if at all) on my answers to the homework problems. The problems from the homework set that are covered by the first three articles include:

1. 1-17
2. 1-19
3. 1-20
4. 1-22
5. 2-7
6. 2-8
7. 2-19
8. 2-21
9. 2-23
10. 2-24

Happy practicing! ]]> 2018-09-04T10:28:00+00:00 What is Big O Notation? https://www.adamconrad.dev/blog/big-o-notation/ Thu, 30 Aug 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/big-o-notation/ This article is part of the [Data Structures and Algorithms Series](/tag/algorithms/) following the course [The Analysis of Algorithms](https://www3.cs.stonybrook.edu/~skiena/373/). If you missed the [previous article](/blog/how-to-design-an-algorithm/), check that out first, as well as [a copy of _The Algorithm Design Manual_](https://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202). ## Answers to the Daily Problem (1-5) The question was: > The _knapsack problem_ is as follows: given a set of integers `S = {s1, s2, . . . , sn}`, and a target number `T`, find a subset of `S` which adds up exactly to `T`.
For example, there exists a subset within `S = {1,2,5,9,10}` that adds up to `T = 22` but not `T = 23`. > Find counterexamples to each of the following algorithms for the knapsack problem. > That is, give an `S` and `T` where the algorithm does not find a solution which leaves the knapsack completely full, even though a full-knapsack solution exists. Here's the actual problem, with the answers (and explanations) inlined: > (a) Put the elements of `S` in the knapsack in left to right order if they fit, i.e. the first-fit algorithm. **`T = 23; S = {22,21,2}`**. Going from first to last, we add `22`, but at this point there are no other elements that will add up to `23` (even though there does exist a valid solution of `21` and `2`). Our counterexample succeeds! > (b) Put the elements of `S` in the knapsack from smallest to largest, i.e. the best-fit algorithm. **`T = 5; S = {1,2,3,5}`**. Going from smallest to largest, we add `1` and then `2`. At this point adding `3` or `5` will put us over `5`, even though obviously choosing just the number `5` would satisfy `T`. Countered again! > (c) Put the elements of `S` in the knapsack from largest to smallest. **`T = 23; S = {22,21,2}`**. Same answer as A, and it works here too since they both happen to be ordered from largest to smallest values within `S`. If any of that doesn't make sense, be sure to [let me know on Twitter](https://twitter.com/theadamconrad). ## Proving an algorithm's efficiency As we mentioned in the last article, the key to proving efficiency involves **Big O Notation**. What is that exactly? 
The _O_ specifically refers to the letter commonly used for the name of the function that analyzes an algorithm's _worst case_ (or _upper bound_) on the relative amount of time the algorithm will take to complete (there's also _Ω_, which refers to the _best case_ / _lower bound_, and _Θ_, which refers to the _average case_ / _tight bound_, but in an interview you're only ever going to be quizzed on the worst case, if at all). I say relative because our _O_ function doesn't return a number such as the number of seconds to complete our algorithm. Instead, it's _relative_ to the number of operations within our algorithm. ## How do we count the number of operations in an algorithm? We can count operations in an algorithm with 3 simple rules (based on the RAM model of how a computer evaluates the cost of computing work):

1. **Add 1 for simple operations**. Adding two numbers (`+`), branching conditionally (`if`), and variable assignment (`var x = 1`) each count as one.

2. **Add `n` for looping over a set**. If you see `for` or `while` in an algorithm, it loops over some `n` number of items, so in the worst case, you need to consider the possibility that you have to iterate over all `n` items.

3. **Whatever your calculations were for 1 and 2, multiply them if the algorithm is called again recursively**. For example, calculating the Fibonacci sequence for `n` requires each recursive call to find the previous values of the sequence. If each call spawns two more calls, that's an exponential (`2^n`, specifically) number of calls to fibonacci!

But here's the thing: calculating the _precise_ number of operations for an algorithm is unnecessarily cumbersome. Therefore, **when we refer to _Big O Notation_, we're calculating the time complexity of an algorithm with the multiplicative constants ignored and the lower factors ignored**.
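To make rules 1 and 2 concrete, here is a small sketch (my own illustration, not from the course) that tallies operations while summing an array:

```js
// Rule 1: each simple operation (assignment, addition) costs 1.
// Rule 2: the loop repeats its body n times.
function sumWithCost(arr) {
  let operations = 0;

  let sum = 0; // one simple operation
  operations += 1;

  for (let i = 0; i < arr.length; i++) {
    sum += arr[i]; // one simple operation, repeated n times
    operations += 1;
  }

  return { sum, operations }; // roughly n + 1 operations total: O(n)
}

const { sum, operations } = sumWithCost([1, 2, 3, 4]);
console.log(sum, operations); // 10 5
```

Four items cost about five counted operations; a million items would cost about a million, which is exactly the linear growth the rules predict.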
So if you found out your algorithm ran in precisely `2n^3 + 4n^2 + 14763n + 1209489` operations, our Big O analysis can reduce this function all the way down to `O(n^3)`. ## Patterns for identifying algorithmic efficiency So now that we know how to evaluate an algorithm's Big O worst-case performance, what do we make of its efficiency? Let's look at each of the major patterns you'll see in your Big O analysis: ### Constant - O(1) These are the fastest algorithms because they consist of nothing but simple operations. Your classic addition function, `function add(x, y) { return x + y }`, runs in constant time; it's just the `+` operation! In the real world, these can run on any machine, slow or fast, with essentially any kind of input (provided it actually works), as these will always be able to run efficiently. ### Logarithmic - O(log n) Imagine you have a deck of cards. Cut the deck in half. Now cut that half of a deck in half. Now cut that portion in half. Now cut _that_ portion… okay, you get the idea. If you plotted that on a graph, it would look just like the [`log(n)` function](https://en.wikipedia.org/wiki/Logarithm). If you have an algorithm that divides the problem space by some number (like by 2 if you're doing a _binary_ search on a tree), you can bet that algorithm runs in logarithmic time. Obviously this function now depends on `n`, so it's going to be slower than constant-time functions, but these are so fast on modern machines that you can feed them essentially any `n` number of items and they will run just fine. ### Linear - O(n) Loop over each item in an integer array and print out the number. How many items do you have to print? For an array of 3 numbers, you have to print 3. For 5, it's 5. For `n` it's, well, `n`! This is why we say loops like `for` and `while` evaluate in `O(n)`, since we have to touch, at worst, all `n` items. These algorithms still run pretty fast.
Given that CPUs can now run millions of operations per second, it's safe to say an `n` of billions of items is still pretty fast in the real world. ### Superlinear - O(n log n) How would you get an algorithm like this? Again, imagine a deck of cards. I want you to find a card in the deck quickly. Even worse: I shuffled the deck. But you can find it pretty quickly if you follow these steps: Cut the deck in half until you just have pairs of cards. Sort the cards in each of the pairs (pretty easy to compare if one card is smaller than the other, right?). Now that all of the pairs are sorted, combine two pairs and repeat the process. Eventually you're going to have the whole deck sorted, making it easy to find your card. Continuously cutting a deck in half sounds very much like our earlier example where the answer was `O(log n)`. Looking at the whole deck? Sounds like `O(n)`. And given that we have to look at all of the cards for each _layer_ that we construct, that's `n` cards we have to touch for each of the `log n` partitions. Hence, `n log n`! Again, since this is just slightly larger than a linear algorithm, we can run an algorithm like this very quickly on even billions of items. ### Quadratic - O(n^2) You've got 5 numbers. You need to multiply each number by every number (including itself). So the first number is multiplied 5 times, the second number is multiplied 5 times, the third is too, and the fourth, and the fifth. 5 numbers multiplied 5 times each is 25 multiplications, which is the same as 5^2. So if we generalize this for `n` numbers multiplied by every other number, you've got your `O(n^2)` algorithm. This is where time needs to start being considered. For tens of thousands of items, we're still moving pretty quickly, with generally under a billion operations. Once you get above a million items for this kind of algorithm, we're looking at _trillions_ of total operations to be run. Even your MacBook Pro can't count all of Apple's money in an instant; it's a pretty big number!
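That multiply-every-number-by-every-number exercise can be sketched directly as a nested loop (a hypothetical illustration):

```js
// Multiply each of the n numbers by every number, including itself:
// the nested loop performs n * n = n^2 multiplications.
function pairwiseProducts(nums) {
  const products = [];
  for (const a of nums) {
    for (const b of nums) {
      products.push(a * b);
    }
  }
  return products;
}

const products = pairwiseProducts([1, 2, 3, 4, 5]);
console.log(products.length); // 25, i.e. 5^2
```

The nesting is the tell: one loop over `n` inside another loop over `n` multiplies out to `n^2`, exactly the scan-for-nesting trick from the selection sort analysis.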
### Exponential - O(2^n) Remember the Fibonacci sequence from earlier? It's a classic example of an exponential algorithm (hint: we can do _much_ better, but the naive implementation is exponential). When you have to enumerate all subsets of `n` items, you're dealing with something exponentially painful. And we're talking _real_ painful. With quadratic algorithms we could work with a million numbers. But exponential? Just _40_ items gets us back up to over a trillion operations. _Oof_. ### Factorial - O(n!) Arrange a deck of cards from smallest to largest. That's one configuration. Now arrange it from largest to smallest. That's another configuration. Now shuffle the deck. Yet another configuration. Now organize the deck from smallest to largest, but order the cards of a given number so hearts come first, diamonds second, clubs third, and spades last. These configurations are called _permutations_ of the deck. And in fact, finding the number of permutations of _n_ items takes `O(n!)` time. So finding all permutations of a standard 52-card deck is `52!`. Do you know just how _massive_ `52!` is? There are more permutations of a deck of cards than there are [_atoms_ on this earth](https://gizmodo.com/there-are-more-ways-to-arrange-a-deck-of-cards-than-ato-1553612843). You could shuffle a deck of cards every second _since the universe began_ and still not see every permutation of a deck of cards. In other words, don't write factorial algorithms. Beyond roughly 20 items, it's not only painful, it's essentially _impossible_. Modern computers simply can't handle that number of operations. So if you see that your algorithm is permuting all of your items, and you have more than a handful of items, you should probably look for another solution to your problem.
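To put a number on the naive Fibonacci mentioned in the exponential section, here is a sketch (my own illustration) that counts how quickly the recursive calls pile up:

```js
let calls = 0;

// Naive Fibonacci: each call spawns up to two more calls,
// so the total call count grows roughly like 2^n.
function fib(n) {
  calls += 1;
  if (n < 2) return n;
  return fib(n - 1) + fib(n - 2);
}

const result = fib(10);
console.log(result, calls); // 55, after 177 calls for just n = 10
```

Bump `n` to 40 and the call count blows past a billion, which is why the exponential row of the table is where laptops start to give up.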
## Wrapping up with the next Daily Problem So now that you understand about Big O Notation and the kinds of bounds you should be striving for, let's put that to the test: > For each of the following pairs of functions `f(n)` and `g(n)`, state whether `f(n) = O(g(n))`, `f(n) = Ω(g(n))`, `f(n) = Θ(g(n))`, or none of the above. > > (a) `f(n) = n^2 + 3n + 4, g(n)= 6n + 7` > > (b) `f(n) = n√n, g(n) = n^2 − n` > > (c) `f(n) = 2^n − n^2, g(n) = n^4 + n^2` Think you've got your answers? Good luck, and we'll see you next week! ]]> 2018-08-30T10:28:00+00:00 How to design an algorithm https://www.adamconrad.dev/blog/how-to-design-an-algorithm/ Tue, 28 Aug 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/how-to-design-an-algorithm/ The _knapsack problem_ is as follows: given a set of integers `S = {s1, s2, . . . , sn}`, and a target number `T`, find a subset of `S` which adds up exactly to `T`. For example, there exists a subset within `S = {1,2,5,9,10}` that adds up to `T = 22` but not `T = 23`. > > Find counterexamples to each of the following algorithms for the knapsack problem. > > That is, give an `S` and `T` where the algorithm does not find a solution which leaves the knapsack completely full, even though a full-knapsack solution exists. > > (a) Put the elements of `S` in the knapsack in left to right order if they fit, i.e. the first-fit algorithm. > > (b) Put the elements of `S` in the knapsack from smallest to largest, i.e. the best-fit algorithm. > > (c) Put the elements of `S` in the knapsack from largest to smallest. My plan is to solve the problem before the lecture, throw it up in the next article in the series to start off the article, see if it lines up with the answer given in the lecture, and then compare and contrast the different approaches as well as discuss any flaws or issues that arose along the way. ## See you next time! Wow, this is probably one of the longest posts I've ever written. 
I hope this post has convinced you to finally learn data structures and algorithms once and for all. Even if you aren't dedicated to taking the entire course, I hope this serves as a reference guide for any parts or pieces you might want to reference or check when working with these concepts in JavaScript. ]]> 2018-08-28T10:28:00+00:00 Why hiring is broken and how I'm dealing with it https://www.adamconrad.dev/blog/why-hiring-is-broken-and-how-im-dealing-with-it/ Tue, 28 Aug 2018 10:28:00 +0000 https://www.adamconrad.dev/blog/why-hiring-is-broken-and-how-im-dealing-with-it/ 2018-08-28T10:28:00+00:00 Mastering the reduce function https://www.adamconrad.dev/blog/mastering-the-reduce-function/ Wed, 18 Jul 2018 01:27:00 +0000 https://www.adamconrad.dev/blog/mastering-the-reduce-function/ The canonical example of `reduce` is summing an array of numbers:

```js
[1, 2, 3].reduce((sum, n) => sum + n);
```

This is a perfectly valid example, but you might be wondering: How do you reduce an array of strings? _Why_ would you reduce an array of strings? What are some real-world examples of using reduce to help you in your day-to-day work? We're going to attempt to answer those in this article so you can master the reduce function and finally feel comfortable using it. ## Definition and breakdown Reduce's job is to take a collection of items and _reduce_ them down to one singular value. The [type signature](https://hackernoon.com/function-type-signatures-in-javascript-5c698c1e9801) of this function looks like this: ``` reduce: (callback: function, initialValue?: any) => any callback: (accumulator: any, currentValue: any, currentIndex?: number, array?: array) => any ``` We are using [TypeScript's function type annotation syntax](http://2ality.com/2018/04/type-notation-typescript.html#function-types) for this example. In short, it just helps decode the inputs and the outputs. So our function takes in two arguments we've called **`callback`** and **`initialValue`**. What are those arguments?
### `callback` argument and its type signature **`callback`** is a `function` that takes four arguments itself: **`accumulator`**, **`currentValue`**, **`currentIndex`**, and **`array`**. **`accumulator`** has the type of the `outputValue` you're trying to build. So if you're reducing an array of `number`s through addition, the `accumulator` is a `number`. It is _accumulating_, or growing, in size **not by the length of the array but by the "growth" of the eventual output value**. In fact, the accumulator grows in inverse relation to how the array shrinks as it is reduced. This object is definitely the most confusing, so if this still doesn't make sense, we'll break it down in the later examples. **`currentValue`** is an item in the array you're trying to reduce. As you iterate over the array, your `currentValue` changes as you move from the first to the last element in that array. **`currentIndex`** is an _optional_ `number` argument that coincides with the `currentValue` by expressing that value's index. So as you iterate over your initial array, the `currentIndex` grows from `0` up to `n-1`, where `n` is the number of elements in your array. **`array`** is an _optional_ `Array` argument that is the entire input array you provided to the `reduce` function. This might seem like an odd thing to pass in, but the key here is that the input array is **not modified**. Functional methods obey _immutability_, meaning they do not modify the array object they are provided. Therefore, regardless of what you do in your `reduce` method, you always have access to your input array because it is never changed throughout the execution of `reduce`. ### `initialValue` argument **`initialValue`** is an _optional_ argument that provides the initial value for the `accumulator` to begin reducing.
For example, if you wanted to reduce `[1,2,3]` into the summation of those values, you'd want the `initialValue` to be set to `0` to ensure you are adding against the same type (a `number`) and that this value does not pollute the values that are being manipulated by the array. If you do not set this argument, the `accumulator` will be set to the first value of the array. In the above example, that would be `1`. ### `outputValue` return object **`outputValue`** is the final form of the `accumulator` object once the entire `array` has been iterated. Unlike most array functional methods, which return arrays, this will return the type that you have built throughout the iteration. This is _not necessarily_ the same type as the values within your input array, as we will see below. ## Examples Now that we know what goes in and out of the `reduce` function, let's dive into a few examples, step-by-step, in increasing difficulty. ### Easy - sum elements of an array As we mentioned earlier, the ever-present example of reducing arrays is to sum a list of numbers through addition. It can be succinctly expressed like this:

```js
[1, 2, 3].reduce((sum, n) => sum + n);
```

The easiest way to see how the iteration is working is to throw log statements at each step for our two input arguments:

```js
[1, 2, 3].reduce((a, c) => {
  console.log(`accumulator: ${a}`);
  console.log(`currentValue: ${c}`);
  return a + c;
});
// FIRST ITERATION
// accumulator: 1
// currentValue: 2
// SECOND ITERATION
// accumulator: 3
// currentValue: 3
// FINAL ITERATION (AS RETURN VALUE)
// 6
```

On the first iteration, we grab the first element in our array, `1`, and set that as our `accumulator` value. We then add our `currentValue`, which is the second element in the array, `2`, onto our `accumulator`, because that is the logic we have decided on for our `callback` function. 1+2 equals 3, so `3` therefore becomes our new `accumulator` value in our second iteration.
Since no `initialValue` is set, our second iteration is actually operating on the third and final element, `3`, which is added on to our new `accumulator` value of `3`. 3+3 equals 6, and with no elements left to iterate on in our array, we return the final `accumulator` value of `6`. Easy! Now, what happens if we operate on another kind of type, like a `string`? ### Medium - simple string builder One of the most common things I see in JavaScript code involves building strings like this:

```js
const myBuddies = ['James', 'Darien', 'Eric'];
let greeting = 'Greetings friends! ';
for (let i = 0, l = myBuddies.length; i < l; i++) {
  greeting += myBuddies[i] + ' is my buddy. ';
}
greeting += 'These are all of my buddies!';
return greeting;
```

You start with an array and an initial string. You build the string up from the array elements, maybe adding in some additional information, and then you return the new string, fully formed and populated. Now look what happens when I rearrange and rename some variables; do you see the connection?

```js
const array = [some, array, values, ...rest];
const initialValue = 'Greetings friends! ';
let accumulator = typeof initialValue === "undefined" ? array[0] : initialValue;
const manipulation = (a, b) => a.concat(b);
let i = typeof initialValue === "undefined" ? 1 : 0;
const l = array.length;
for (; i < l; i++) {
  accumulator = manipulation(accumulator, array[i]);
}
return accumulator;
```

This iterative pattern _is the `reduce` function_. The `callback` within `reduce` is simply some function of `manipulation` of data, iterated across an `array` and saved into some built value called an `accumulator`! The difference is, our last example is several lines of code, and with `reduce`, we can express it in a very clean and terse way, with zero local variables needed:

```js
['James', 'Darien', 'Eric']
  .reduce((paragraph, name) => {
    return `${paragraph}${name} is my buddy. `;
  }, 'Greetings friends! 
') .concat('These are all of my buddies!'); ``` This is pretty neat. We were able to remove all local variables and reduce the line count by more than half. We are also able to chain functions together to continue to tack on unique statements at the end of our string builder. While this instance is a bit more involved and doesn't simply manipulate numbers, at the end of the day, this is still a form of summation, which may seem a bit too simplistic and similar to the first example. Lucky for you, we've got one last example that should be a bit more advanced and is not simply a form of addition. ### Hard - flatten array to dictionary Our last example is not going to simply be some form of summation. Instead, we're going to build a dictionary (also known as a hash table, which is simply an `Object` in JavaScript) from an array of arrays. We're going to make the first value in each sub-array the key, and the rest of the values in that sub-array the values. If there are hash collisions (meaning multiple sub-arrays with the same value in the first element), we simply add the remaining sub-array items onto the existing key/value pairing. How are we possibly going to do this with `reduce`? We know our initial value is `{}`. We aren't doing a summation (or concatenation if we're dealing with strings), but we are still trying to build our output value, which we know to be an `Object` acting as a hash table. How can we add things onto a hash table in JavaScript? Square bracket notation! That should be enough to get us what we want:

```js
const arrayOfArrays = [
  ['a', 'ace', 'apple', 'axe'],
  ['b', 'baseball', 'boy', 'bye'],
  ['c', 'cat', 'cold'],
  ['a', 'azure'],
  ['a', 'azure']
];

const dictionary = arrayOfArrays.reduce((dict, page) => {
  const [letter, ...words] = page;
  dict[letter] = dict.hasOwnProperty(letter)
    ? dict[letter].concat(words.filter(word => !dict[letter].includes(word)))
    : words;
  return dict;
}, {});

dictionary['a'][0]; // 'ace'
dictionary['a'][dictionary['a'].length - 1]; // 'azure'
dictionary['a'].filter(word => word === 'azure').length; // 1
```

Lots to unpack here. To start, we created our multi-dimensional array called `arrayOfArrays`. Each row in the array starts with the letter of our dictionary. That letter will represent one of the pages in our dictionary. You could do this without declaring the local variable, but for the sake of readability within this blog, I chose to separate them. Next, we construct our actual dictionary. We start with our initial `Object`, a simple empty `{}` hash. Our `callback` receives two things: the `dict` dictionary object that we are constructing, and the `page`, which represents a row in our `arrayOfArrays`. That row has two kinds of strings: the letter representing which page of words we're on, and the words listed on that page. In languages like Haskell, lists come with functions called `head` and `tail` (also known as `car` and `cdr` in older languages like Lisp). In JavaScript, these don't come out of the box, but thankfully in ES6 we can quickly grab these through the destructuring pattern `[first, ...rest] = array`. We do exactly this to grab our `letter` and corresponding `words`. Next, we have to build our dictionary. There are two major considerations here: new pages and existing pages. If we have a new page, we are dealing with the simpler case. All we have to do is populate the page (whose key is `letter`) with our `words`. If we are dealing with an existing page, we want to add onto that existing page using `concat`. But we can't simply concatenate our list of `words`. What if we've already added a word to our dictionary? That's where our functional cousin `filter` comes in to filter out words that are already included on this page.
The filter will return only novel `words`, removing duplicates along the way. Finally, we test out our new `dictionary` variable to prove our dictionary is complete, can handle adding words late to our prepopulated keys, and will gracefully handle duplicates. ## Final thoughts Congratulations! You now know how to reduce an array down to a composite value. You've not only mastered the reduce function in JavaScript but in every other language. As we mentioned in the beginning, the concepts behind `reduce` are the same across languages like Clojure and Haskell, but with different names like `fold` or `inject`. You can also now reason about variants on `reduce`, such as `foldl` (i.e. fold left, which is the same as `reduce`), or `foldr` (i.e. fold right, similar to `reduce` except the iteration happens from the back to the front of the array, like JavaScript's `reduceRight`). If you're still confused, I've failed you. But fear not; find me on Twitter and I'd be happy to provide even more examples. I will not rest until everyone reading this knows the power of `reduce`! ]]> 2018-07-18T01:27:00+00:00 How to utilize VR on the web with A-Frame https://www.adamconrad.dev/blog/how-to-utilize-vr-on-the-web-with-a-frame/ Fri, 29 Jun 2018 09:25:00 +0000 https://www.adamconrad.dev/blog/how-to-utilize-vr-on-the-web-with-a-frame/ Insert the A-Frame script into the `<head>` tag of a standard HTML file and you're ready to get started! The entire framework is contained within this minified file, just like how you would program a web app with jQuery or React. If you're wondering why you insert this script at the top of the file, it's because the VR environment cannot be loaded without the framework. With most JavaScript source files, it's beneficial to load the DOM first to prevent the scripts from blocking the content you want to show. In this case, the framework creates its own VR DOM and must be loaded first to make use of any of the tags you need to create a VR application on the web.
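In concrete terms, that install is a single script include placed before the body (the release version here is illustrative — grab the current one from aframe.io):

```html
<!DOCTYPE html>
<html>
  <head>
    <!-- Load A-Frame first so its custom VR tags are registered before the body renders -->
    <script src="https://aframe.io/releases/0.8.0/aframe.min.js"></script>
  </head>
  <body>
    <!-- A-Frame markup goes here -->
  </body>
</html>
```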
## Hello (Virtual) World ![](https://cdn-images-1.medium.com/max/2000/0*XKMbJv5AZpGWusxi) So how do we work with A-Frame? It's nearly as easy to build an A-Frame application as it is to install it: an `<a-scene>` wrapping a single `<a-text>` element. Yup, that's really all there is to it. Just insert that into the `<body>` tag of that same HTML file and you have a WebVR application! You can view our extremely simple demo app [here](https://adamconrad.dev/aframe-demo.html) with a mobile phone using Google Chrome. Impressed yet? Here's how it works: The `<a-scene>` is like the `<body>` or `<html>` tag for the WebVR world. It is the container that isolates all of the 3D components and acts as a signal to the device that it needs to render this webpage in VR mode. From there, you can insert a whole set of components called primitives. For this app, we inserted a simple `<a-text>` [primitive](https://aframe.io/docs/0.8.0/primitives/a-text.html). We add the properties of our primitive just like our standard HTML tags, using inline attributes. As you might guess, the `value` attribute tells A-Frame what text to render into the primitive. Since we are operating in a 3D world, we need to project this text onto something. Now that we are no longer dealing with a static surface like ordinary HTML text, we need to tell A-Frame how we want to interact with this text primitive. The `geometry` provides a way to interact with text with something like a cursor by stating how we are projecting the text. In this case, we want to project our text onto another primitive component, the plane, which is simply referred to as a subset of primitives using `:plane`. So how do all of these primitives work together? That's where the Entity-Component-System architecture (ECS) comes in. ## A Brief Intro on the Entity-Component-System Architecture ![](https://cdn-images-1.medium.com/max/2000/0*K7UyFSMafpQI5FFj) ECS is the architecture of choice for designing games and 3D environments. This architecture is heavily utilized in Unity, the popular 3D game engine.
A-Frame is an ECS framework built on top of three.js, with specific support for VR devices. So let’s break down each of these words to understand the architecture: **Entity** refers to the objects that house components in a scene. These are basically `<a-entity>` elements that are named based on the kind of component you are using. For example, `<a-text>` is the entity that houses our “Hello world!” text. All A-Frame primitives are some form of entities that combine a subset of components. **Component** is the reusable chunk of data that gets added to entities to bring them to life. So if `<a-text>` is the HTML entity we add to our site, the `value` and `geometry` components are like the CSS to turn that entity into something meaningful. **System** is the management system that ties components together across all of the entities they belong to within a scene. So if we again refer to the geometry component, the system for that component is what allows geometry to be shared across primitives (like our text and the plane it projects onto), controlling the shapes of the objects and how those shapes appear in space in relation to one another. Understanding ECS is vital to conceptualizing the A-Frame API because A-Frame is essentially a giant collection of starter entities and components organized by various systems. You can then use these systems to build more complex and engaging components or entities to make your own environments richer and more interactive. ## Next Steps We’re barely scratching the surface on what VR can do today on the web. Even though VR itself is a brand new and growing platform, truly immersive web experiences are already possible with browsers like [Supermedium](https://supermedium.com/). These VR-first browsers provide a platform to showcase the power of frameworks like A-Frame. If this article piqued your interest in really diving into WebVR and A-Frame, A-Frame provides a guided tutorial series called [A-Frame School](https://aframe.io/aframe-school/#/1) to walk you through building your first WebVR application. Once you think you’ve got something unique to share, there are a few places to promote your work. Supermedium offers a [directory](https://webvr.directory/) for compelling WebVR projects, as well as sites such as [WITHIN](https://vr.with.in/). 
VR right now is very similar to the early days of the web: fun, expressive, and weird. There’s no need to try to aim for professional or production quality. Rather, we’re in the nascent stage where expression and creativity thrive, pushing people to test the limits of what is possible while the technology is still being fleshed out. There’s no better time to get ahead of the curve with the most bleeding-edge user interface than in AR/VR on the web with A-Frame. *Originally published at [softwareengineeringdaily.com](https://softwareengineeringdaily.com/2018/06/29/how-to-get-started-in-vr-with-a-frame/) on June 29, 2018.* ]]> 2018-06-29T09:25:00+00:00 What you need to know about WCAG 2.1 https://www.adamconrad.dev/blog/what-you-need-to-know-about-wcag-21/ Wed, 27 Jun 2018 14:11:00 +0000 https://www.adamconrad.dev/blog/what-you-need-to-know-about-wcag-21/ 2.1.4 Character Key Shortcuts (A) * 2.5.1 Pointer Gestures (A) * 2.5.2 Pointer Cancellation (A) * 2.5.3 Label in Name (A) * 2.5.4 Motion Actuation (A) * 1.3.4 Orientation (AA) * 1.3.5 Identify Input Purpose (AA) * 1.4.10 Reflow (AA) * 1.4.11 Non-text Contrast (AA) * 1.4.12 Text Spacing (AA) * 1.4.13 Content on Hover or Focus (AA) * 4.1.3 Status Messages (AA) * 1.3.6 Identify Purpose (AAA) * 2.2.6 Timeouts (AAA) * 2.3.3 Animation from Interactions (AAA) * 2.5.5 Target Size (AAA) * 2.5.6 Concurrent Input Mechanisms (AAA) ## What do the different conformance levels (A, AA, AAA) mean? As seen in the WCAG 2.1 document, there are a large number of recommendations for accessibility. It would be unrealistic to expect site owners to adhere to them all, so **each success criterion is given a priority for how important it is to satisfy**. ![Accessibility is important](/assets/images/blur-codes-coding-577585.jpg) Level A criteria are the most important, followed by AA, and then finally AAA criteria. 
Each level builds on the other, so it is necessary to satisfy all of the Level A requirements first before trying to satisfy AA or AAA requirements. This is because the W3C [requires full conformance at each level to be recognized as compliant](https://www.w3.org/TR/UNDERSTANDING-WCAG20/conformance.html#uc-conf-req1-head). > For Level AA conformance, the Web page satisfies **all the Level A and Level AA Success Criteria**, or a Level AA conforming alternate version is provided. Each section starts with Level A requirements, and builds up to Level AAA, with each requirement adding onto the previous one. For example, Section 3.3 is on input assistance. 3.3.1 and 3.3.2 are Level A requirements. 3.3.3 builds on 3.3.1, which is Level AA. The highest level, 3.3.5 and 3.3.6, base success around much more involved criteria, and thus achieve a rating of Level AAA. ## What conformance levels do I need to pass? The obvious answer would be to aim for AAA compliance, which is the strictest and most complete standard. The Level AAA standard will create an environment that maximizes your audience reach by providing support for the widest audience of disabled persons. Practically speaking, **aim for Level AA conformance**. This is the highest realistic conformance level to achieve. The W3C recommends this level as well: > It is not recommended that Level AAA conformance be required as a general policy for entire sites because it is not possible to satisfy all Level AAA Success Criteria for some content. One example of Level AAA conformance that cannot be satisfied is [WCAG criterion 2.1.3 for keyboard shortcuts](https://www.w3.org/TR/WCAG21/#keyboard-no-exception). If your application is only available on mobile, such as the Uber app, there is no real keyboard, so it cannot be navigated by keyboard shortcuts, and will not pass this success metric. 
This, of course, is not a practical requirement because Uber is designed for use on mobile and should not have to try and cater their strategy towards a requirement like this. ![Keyboard events](/assets/images/keyboard-events.jpg) Now that we have an idea of what level we need to aim for, here are the new success criteria in the updated guidelines and how to pass them. ## New WCAG success criteria ### Level A: The bare minimum #### 2.1.4 Character Key Shortcuts (source) ##### What it means Keyboard shortcuts aren't just for power users like vim developers. They are also super useful for the blind, who have no reason for using a monitor or mouse. Therefore, it is a good idea to implement some basic keyboard shortcuts to navigate your site without the need for a mouse. ##### How to pass * **If you have shortcuts, make sure you can disable them** * **The shortcut can be mapped to a different key than you provided by default** * **The shortcut is only active when the component it applies to is in focus.** You can't accidentally trigger the shortcut if you aren't actively on the part of the page that should receive that shortcut. #### 2.5.1 Pointer Gestures (source) ##### What it means Touch gestures have many ways of interacting. Single point gestures include tapping or clicking. Multipoint gestures include things like pinch-to-zoom. Path-based gestures include things like swiping or sliding. Your application should account for all of these. ![touch gestures need fallbacks as well](/assets/images/touch-interaction.jpg) ##### How to pass If you use multipoint or path-based gestures, the **actions that are triggered by those gestures should have a single point fallback**. In other words, if you use the swipe event to mark something as complete, you should also have a touch or click gesture to mark that item complete as well. #### 2.5.2 Pointer Cancellation (source) ##### What it means Actions can be triggered with, for example, a click. Those actions should be reversible. 
##### How to pass * **Don't use the down-event** (e.g. `keydown` or `mousedown`) unless it is essential * **Show a popup or alert to undo the previous action** * **If you use a down-event, an equivalent up-event should be available to undo that action** #### 2.5.3 Label in Name (source) ##### What it means Images can be used inside of a label to describe content. The component housing those images should have a text fallback of that image content. ##### How to pass * **If you use an image to label something, the naming attribute should have the text of that image or a description of that image as a fallback.** [Accessible name and description mappings](https://www.w3.org/TR/accname-1.1/#accessible-name-and-description-mapping) include attributes like `alt` for `<img>` tags, plus [captions, transcriptions, and descriptions](https://webaim.org/techniques/captions/) for media elements. ]]> 2018-06-27T14:11:00+00:00 How to find excellent refactoring opportunities https://www.adamconrad.dev/blog/how-to-find-excellent-refactoring-opportunities/ Thu, 14 Jun 2018 21:13:00 +0000 https://www.adamconrad.dev/blog/how-to-find-excellent-refactoring-opportunities/ # An example in refactoring Here's the (obfuscated) code I was working with to begin our journey:

```javascript
const addEvents = (events, label, segment) => events.push([label, segment]);

const main = () => {
  const [upcoming, past] = partition(
    arr,
    date => today.diff(date, 'day') <= 0
  );
  const events = [];
  if (upcoming.length > 0) {
    addEvents(events, 'upcoming stuff', upcoming);
  }
  if (past.length > 0) {
    const byMonth = groupBy(past, date => moment(date).format('YYYY-MM'));
    const months = Object.keys(byMonth).sort().reverse();
    months.forEach((month, idx) => addEvents(events, `${idx} months ago`, month));
  }
  return events;
}
```

This code is pretty simple - I've got two arrays of dates: upcoming dates, and past dates (including today). 
I just need to let the world know what's coming up and what is in the past, so it's not a simple push to an array because each chunk is split off by month (except for the future, which is just labeled as upcoming stuff), and each section is really a tuple of label and date. So what immediately got me tweakin' on this code? ## Single Responsibility Principle By now the SOLID acronym is ingrained in my brain. If you don't know what it is, thoughtbot wrote a [great intro about what SOLID is](https://robots.thoughtbot.com/back-to-basics-solid) so definitely check that out. The most obvious one that people seem to remember is the first one, the Single Responsibility Principle. It's probably the easiest because the definition really says it all. Your class/function/whatever should do one thing and do it well. In this case, I see a bunch of stuff, so my first thought is to break this function up into multiple functions. How do I know how to break it up? Stuff like `if` statements are literal blocks of code. Since everything is essentially scoped to blocks in JavaScript, anything inside of a pair of curly braces is a good candidate to break out into its own function. I'm also going to cheat and skip a step, because `if` statements also provide another subtle hint: **the `if` keyword can usually be refactored into a guard clause.** Why? Because it is already guarding some code against being run within your function. 
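To see the idea in isolation before applying it to our code (a minimal sketch with a made-up `labelEvents` helper, not code from this post):

```javascript
// Nested version: the real work is buried inside the `if` block.
const labelEventsNested = (events) => {
  let labeled = [];
  if (events.length > 0) {
    labeled = events.map(e => `upcoming: ${e}`);
  }
  return labeled;
};

// Guard clause version: handle the empty case first and exit early,
// so the happy path reads flat and unindented.
const labelEventsGuarded = (events) => {
  if (events.length === 0) return [];
  return events.map(e => `upcoming: ${e}`);
};
```

Both behave identically; the guard clause just states the precondition up front instead of wrapping everything else in a block.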
If we rewrite the `if` in guard clause notation (meaning, exit early before the function has a chance to run), along with the SRP refactoring, we get:

```javascript
const addEvents = (events, label, segment) => events.push([label, segment]);

const addUpcomingEvents = (events, upcoming) => {
  if (upcoming.length === 0) return;
  addEvents(events, 'upcoming stuff', upcoming);
};

const getPastMonths = past => {
  return groupBy(past, date => moment(date).format('YYYY-MM'));
};

const addPastEvents = (events, past) => {
  if (past.length === 0) return;
  const months = Object.keys(getPastMonths(past)).sort().reverse();
  months.forEach((month, idx) => addEvents(events, `${idx} months ago`, month));
};

const main = () => {
  const [upcoming, past] = partition(
    arr,
    date => today.diff(date, 'day') <= 0
  );
  const events = [];
  addUpcomingEvents(events, upcoming);
  addPastEvents(events, past);
  return events;
}
```

## Mutation is ugly Already this looks way better. `main()`'s single responsibility is populating the array of `events` and returning them. Each extracted function has one responsibility based on its function definition. So I guess we can call it a day, right? The next smell I see is this pattern:

```javascript
const myPoorFriendArray = [];

// a world of hurt, warping our pristine, empty friend
blackBoxFunction(myPoorFriendArray);
moreTorture(myPoorFriendArray);
theGauntlet(myPoorFriendArray);

// we chew him up and spit him out, never knowing the horrors he experienced
return myPoorFriendArray;
```

Ya feel me on this one? Our boy `myPoorFriendArray` is now an honorary X-Man: mutated from his original, empty form, into some chimera of various parts of the code. [Kanye said it better than I can](https://www.youtube.com/watch?v=fSJoDuU328k), but this is the stuff I _DON'T LIKE_. Maybe you come from an OO world like Java and mutation is just a regular, everyday occurrence. If you're woke and you come from a functional background like Elm, you can see that mutants are bad. 
We don't want no X-Men fighting for us on the coding battlegrounds (sorry Gambit, love you). So what's the fix here? 1. **Send the empty array in as an argument to our first add function** 2. **The function will take in the empty array and make a copy** 3. **The copy is filled out and returned** The argument is not mutated, and we remove state dependencies. If you've done any functional programming, terms like _stateless_ and _immutability_ are everyday terms, and that's what we're striving for. Our stateless code doesn't have [side effects](https://hackernoon.com/code-smell-side-effects-of-death-31c052327b8b), a nasty code smell that makes bugs more difficult to track down because state persists across functions. Without state, what you put into it is what you get out of it each and every time, making your functions predictable and thus less likely to include errors you didn't test for. Here's what this change looks like:

```javascript
const addEvents = (events, label, segment) => events.push([label, segment]);

// we didn't touch the middle functions...

const main = () => {
  const [upcoming, past] = partition(
    arr,
    date => today.diff(date, 'day') <= 0
  );
  return addPastEvents(addUpcomingEvents([], upcoming), past);
}
```

Now we're getting somewhere! But for the savvy reader, something should still seem off. Can you see what's still wrong? ## Negate the state **We didn't really remove the state from the add functions.** Yeah, we gave it a starting point in our `main()` function, but the `addEvents()` function is still adding events to a singular `events` array which is getting passed around like a rag doll. How do I know this? `Array.prototype.push` is a mutable function. It provides an interface to add stuff to an array and returns the number of elements in the enlarged array. How do we do this in an immutable way? `Array.prototype.concat` is the immutable equivalent of `push()`. 
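A quick sketch of that difference (not from the original post):

```javascript
const original = [1, 2];

// push mutates the receiver and returns the new length...
const pushed = [...original];     // copy first so we can show the mutation
const newLength = pushed.push(3); // newLength is 3; pushed is now [1, 2, 3]

// ...while concat leaves the receiver untouched and returns a fresh array.
const combined = original.concat([3]); // combined is [1, 2, 3]
// original is still [1, 2]
```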
It takes an array, combines it with another array, and those two are combined into a newly-allocated array. So we can modify the above slightly to be:

```javascript
const addEvents = (events, label, segment) => {
  if (!label) return events;
  return events.concat([[label, segment]]);
};
```

Now we never touch our inputs. `events` remains as pristine as it was the day it arrived in this humble function, and instead we use `concat()` to combine `events` with the new array we've made, which is the array of our tuple (a point of clarification: tuples don't exist in JavaScript, but I'm using that term to describe our 2-item array to make the distinction clearer here). Now I'm in audit mode: where else do I need to follow this pattern of not touching the arguments and ensuring I return an `Array` type? ## Provide type safety If you aren't using TypeScript, this is a great way to practice some proactive type safety to ensure that your function returns objects of the same type regardless of the output. That means you shouldn't return an `Array` in one code path but `undefined` or `null` in another when you exit early (like in a guard clause). Oh crap, I'm doing that. Looks like another opportunity to refactor!

```javascript
// unchanged, but illustrating that all functions, including this one, always return an Array
const addEvents = (events, label, segment) => {
  if (!label) return events;
  return events.concat([[label, segment]]);
};

const addUpcomingEvents = (events, upcoming) => {
  if (upcoming.length === 0) return events;
  return addEvents(events, 'upcoming stuff', upcoming);
};

const getPastMonths = past => {
  return Object.keys(groupBy(past, date => moment(date).format('YYYY-MM'))).sort().reverse();
};

const addPastEvents = (events, past) => {
  if (past.length === 0) return events;
  return getPastMonths(past).reduce(
    (acc, month, idx) => addEvents(acc, `${idx} months ago`, month),
    events
  );
};
```

Lots to unpack here. 
So in `addUpcomingEvents()` all we did was make sure we return an `Array` and not `undefined` (ending with just `return;` is shorthand for returning `undefined`). We do this because `concat()` returns an array, so we want to make sure all `return` statements provide the same type of object in any given function. Next I did some refactoring of `getPastMonths()` to handle the sorting and reversing, because the `groupBy` function _technically_ returns an `Object`, and a way for it to return an `Array` is to grab the keys via `Object.keys()` (which returns an `Array`) and do our necessary transformations to that array object. Finally, `addPastEvents()` starts out the same as the upcoming function by ensuring our guard clause returns an `Array` type. The next part is a bit wilder. Originally we were taking an array and iterating over it using `Array.prototype.forEach`. The problem is that this iterator doesn't return an `Array` like we need it to. It simply gives us a platform to view every item in our array. We also know that in the end, we want one array object that adds in all of the past events. **When you think about needing to combine things into one and return that combined object, think of using [`Array.prototype.reduce`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/Reduce).** In this case, I knew I needed to add the events (by month) into our `events` array, _and_ return that newly combined array, using `events` as the starting point. The reduce function takes two arguments, a `callback` for how to combine stuff, and an optional `initial` object to begin with. The `callback` is probably the most confusing part - what is this nested `return` trying to do? The `reduce()` `callback` argument has two arguments of its own: an `accumulator` object, which is initialized to our `initial` object (if you leave `initial` out, the first element of the array is used as the accumulator and iteration starts from the second element), and the `current` item that is being iterated on. 
There are two additional optional arguments: the `index` of the item you are iterating on, and the original `array` object you called `reduce()` on. Since we need the `index` argument to label our months, I added that in. So with all of that said, the `reduce()` function is basically saying: > For each past month in my array, add it (with its label) onto my accumulated array, which is the accumulation of every previous iteration, starting with my initial `events` array. ## The final result ![Refactored code is good code](/assets/images/blur-close-up-code-546819.jpg) It was at this point I called it a day and was satisfied with my refactoring for the code review I mentioned in the beginning. The final product looks like this:

```javascript
const addEvents = (events, label, segment) => {
  if (!label) return events;
  return events.concat([[label, segment]]);
};

const addUpcomingEvents = (events, upcoming) => {
  if (upcoming.length === 0) return events;
  return addEvents(events, 'upcoming stuff', upcoming);
};

const getPastMonths = past => {
  return Object.keys(groupBy(past, date => moment(date).format('YYYY-MM'))).sort().reverse();
};

const addPastEvents = (events, past) => {
  if (past.length === 0) return events;
  return getPastMonths(past).reduce(
    (acc, month, idx) => addEvents(acc, `${idx} months ago`, month),
    events
  );
};

const main = () => {
  const [upcoming, past] = partition(arr, date => today.diff(date, 'day') <= 0);
  return addPastEvents(addUpcomingEvents([], upcoming), past);
}
```

To summarize, the major things we covered in this refactoring are: 1. **Simplify functions down to a single responsibility** 2. **Eliminate side effects by using immutable functions and not modifying arguments** 3. **Ensure type safety by verifying all return statements provide the same type of object** This gives us code that is easier to read, less prone to bugs, and easier to test. Is there anything else you would refactor? Did I miss something? 
[Hit me up on Twitter](http://twitter.com/theadamconrad) and let me know what else could make this code better. ]]> 2018-06-14T21:13:00+00:00 If you're really worried about the GitHub acquisition, here's what to look out for https://www.adamconrad.dev/blog/if-youre-really-worried-about-the-github-acquisition-heres-what-to-look-out-for/ Mon, 04 Jun 2018 21:13:00 +0000 https://www.adamconrad.dev/blog/if-youre-really-worried-about-the-github-acquisition-heres-what-to-look-out-for/ You might have heard that at the end of last year we became a part of Microsoft, but our services are still provided under a separate Terms of Service. We’re excited about leveraging Microsoft technology and resources to offer more valuable features and services to you. And I've been on LinkedIn for the last 2 years. I'm neither frustrated nor thrilled. LinkedIn is just as much of a recruiter Zombieland as it has always been. You don't have to trust Microsoft when they say they don't want to change GitHub, but they are not offering any indication that they want to mess with it. # Myth: GitLab and Bitbucket are safer options The other issue I see is the assumption that, if there is an unwavering hatred for Microsoft, we must all leave for better pastures. The two biggest players in the space that are still independent are GitLab and Bitbucket. Are they _really_ all that much better than GitHub? ## They're all for-profit businesses The most remarkable thing to me in reading the public's reaction is that people make it seem like GitHub was an open source non-profit company: that because they were the premier company to publish your open source code, it must mean they were pure in intentions. They're very much a for-profit and primarily make money off of storing private repositories. GitLab and Atlassian are no different! In fact, now that GitHub is part of Microsoft, they're part of a public company just like Atlassian. 
Even though the business models for each are somewhat different in how they make money off of the code storage, the fact is each of those companies is closed source, with specific legal terms to designate what is owned by the contributors, and what is owned by the storage company. And they're _all_ trying to make money (and there's nothing wrong with that). ## You can still deploy your code to mega-corps Even if you go with GitLab or Bitbucket, where are you going to deploy your code? Because unless you plan on buying your own servers and scaling like we did in the 90s and 2000s, you're going to use one of a few big players: 1. DigitalOcean. A company whose debt is mostly [financed by megabanks](https://www.crunchbase.com/organization/digitalocean#section-locked-charts) like HSBC and Barclays. 2. Heroku. A company now owned by Salesforce, a publicly traded company worth nearly $100B. 3. Amazon Web Services. A company with a bigger market cap than Microsoft. 4. Azure. Which is owned by you know who... So even if you self-host your code for the sake of using git with your team, you're still handing over your code to another massive company when you go to turn the lights on for customers. And you may have already done that deed with Microsoft. ## Look at the terms The GitHub terms of service only mention open source [in section D.6](https://help.github.com/articles/github-terms-of-service/#d-user-generated-content): > This is widely accepted as the norm in the open-source community; it's commonly referred to by the shorthand "inbound=outbound". We're just making it explicit. This isn't even in the ownership section, which clearly states: > You retain ownership of and responsibility for Your Content...we need you to grant us — and other GitHub Users — certain legal permissions, listed in Sections D.4 — D.7. These license grants apply to Your Content. 
If you upload Content that already comes with a license granting GitHub the permissions we need to run our Service, no additional license is required. You understand that you will not receive any payment for any of the rights granted in Sections D.4 — D.7. The licenses you grant to us will end when you remove Your Content from our servers unless other Users have forked it. You can continue to read the rest of those sections, but the real key nugget that you're probably worried about is the legal permissions that apply to you: > We need the legal right to do things like host Your Content, publish it, and share it. You grant us and our legal successors the right to store, parse, and display Your Content, and make incidental copies as necessary to render the Website and provide the Service. This includes the right to do things like copy it to our database and make backups; show it to you and other users; parse it into a search index or otherwise analyze it on our servers; share it with other users; and perform it, in case Your Content is something like music or video. > > **This license does not grant GitHub the right to sell Your Content or otherwise distribute or use it outside of our provision of the Service.** The rest of those sections simply state that if you post public code it can be viewed by anyone, if you contribute to a repo it's bound to the repo's declared license, and when you upload code it comes with "moral rights," meaning you have the right to seek attribution and credit for the code you have published. So how does this compare to ownership and IP for GitLab and Atlassian? 
[For GitLab](https://about.gitlab.com/terms/#subscription): > 4.3 Customer and its licensors shall (and Customer hereby represents and warrants that they do) have and retain all right, title and interest (including, without limitation, sole ownership of) all software, information, content and data provided by or on behalf of Customer or made available or otherwise distributed through use of the Licensed Materials (“Content”) and the intellectual property rights with respect to that Content. And [for Atlassian](https://www.atlassian.com/legal/customer-agreement): > 7.4 Your Data. “Your Data” means any data, content, code, video, images or other materials of any type that you upload, submit or otherwise transmit to or through Hosted Services. You will retain all right, title and interest in and to Your Data in the form provided to Atlassian. Subject to the terms of this Agreement, you hereby grant to Atlassian a non-exclusive, worldwide, royalty-free right to (a) collect, use, copy, store, transmit, modify and create derivative works of Your Data, in each case solely to the extent necessary to provide the applicable Hosted Service to you and (b) for Hosted Services that enable you to share Your Data or interact with other people, to distribute and publicly perform and display Your Data as you (or your Authorized Users) direct or enable through the Hosted Service. Atlassian may also access your account or instance in order to respond to your support requests. **TL;DR: you own your code.** In fact, if there were one company of those three that I am worried about, it's Atlassian. **Atlassian can collect, copy, store, modify, and use your data.** This isn't spelled out with GitLab or GitHub. So don't think the alternatives are any better for you now that Microsoft owns GitHub. # Assuming it's all good, what can go wrong? So if you're convinced that you don't need to jump ship, the truth is things _could_ go wrong. 
Here are the things to look for when the deal ultimately goes through and the acquisition is 100% complete: 1. **Look out for a GitHub conversion announcement.** Just because they've announced an acquisition does not mean the services have completely rolled over to Microsoft. In fact, the current terms of service for GitHub are dated May 25th which is pre-announcement. If things change and either GitHub or Microsoft announces the rollover is complete, make sure to read the changes carefully. If they change their terms, they'll make you accept or reject the new terms. 2. **If they introduce new terms, see who owns them.** As I mentioned earlier in LinkedIn's conversion announcement, they stated that they still owned their own separate terms of service. This will likely be the case for GitHub as well. If that isn't the case, you'll have to read Microsoft's terms as well as GitHub's. Like class inheritance, whatever you read on GitHub is layered on top of what Microsoft is applying to their software licenses. 3. **If the terms change, specifically check Sections D.3-D.7 for change in ownership or rights.** That's the core section that deals with who owns and distributes the code. If this becomes more restrictive or unfavorable to your community, that would be a good time to re-evaluate your source code hosting strategy. Is this acquisition a big deal? Yes. Is it a bad deal for you? Probably not. Keep calm and code on, but if the terms change, read them carefully and re-evaluate if they are no longer favorable to your interests. ]]> 2018-06-04T21:13:00+00:00 How to Improve on Naming Contexts in Domain-Driven Design https://www.adamconrad.dev/blog/how-to-improve-on-naming-contexts-in-domain-driven-design/ Wed, 30 May 2018 23:25:00 +0000 https://www.adamconrad.dev/blog/how-to-improve-on-naming-contexts-in-domain-driven-design/ How would I describe `Question` and `Answer` to a friend in one word? First of all, we can look at the description I gave in the beginning. 
In addition, we remember that contexts are nouns, so when I'm parsing those first descriptive sentences, I see words like **forum** and **site**. Another technique is to bring these models into reality. Where do you see Q&A in the real world? At tech talks, for example, it's usually at the end of a presentation where the speaker has a discussion with an audience. Using the noun technique in the previous sentence, potential context ideas are **presentation** and **discussion**. Now we can apply the domain-driven part of our design to ask ourselves: > What domain does this belong to, and are there any conflicts or ambiguities we need to worry about? For presentation, this belongs to the domain of things like public speaking and business. The problem is, presentations are also conflated with the MVP (Model-View-Presenter) pattern. It is also a specific format for PowerPoint. Given that this word has specific connotations with programming jargon, we can rule this one out. That leaves us with **discussion** and **forum** as our context for our Q&A models. Honestly, either will work here, but one hint is that forum relates more to this particular domain than discussion does, so I think we'll go with forum. Now we can repeat the process for `User` and `Organization`. Since these aren't on the same level, we could argue that they shouldn't be in the same context, but we have to give them _a_ context because there really isn't a point in making a 1-to-1 mapping of models to contexts for an application this small. But we also see a natural mapping here: people belong to organizations all of the time, whether it's a company or a volunteering outfit. How do we describe people in an organization? We say they're organized by some hierarchy and are usually grouped into teams or divisions. That gives me a few nouns: **hierarchy**, **grouping**, **team**, and **division**. 
I can already see that team and division are immediately out since they are examples of how organizations are split and don't fully encompass the user-organization relationship. Grouping is good, but the word *group* itself does creep a bit too close to computer terms such as `GROUP BY` or the C# `group` keyword. So I'll go with **hierarchy**. ![Quora domain modeling example with contexts](/assets/images/quora-ddd.png) Nice work! We have a clear delineation between two contexts that make sense. One thing to point out is that **contexts can be renamed or changed later**. It's not important that we anticipate the use of these contexts as we add models. Sure, `Hierarchy` might limit our flexibility, as it could be a stretch to add in helper models like `Address` into this context, but you can worry about refactoring later. ### Example 2: Twitter Twitter is a microblogging service that allows people to express their thoughts in a very limited space. All messages (called "tweets") have a 280-character limit and are broadcast to followers. Followers can then interact with their connection's messages by liking or sharing (via "retweeting") content that they think is valuable to their own networks. ![Twitter domain modeling example](/assets/images/twitter-no-ddd.png) What did you come up with? Let's again apply the same heuristics from above to decide what our contexts will be. The first thing that stands out to me is the intimate tie between `User` and `Tweet`. It's the absolute core of this platform. Simplifying the noun technique, my first thought was that this _is_ the **microblog**. However, given that Twitter has increased its character count from 140 to 280, I feel like the *micro* portion of this is basically irrelevant, but the **blog** portion aptly contextualizes this relationship. Plus, it's a natural context for future potential models like `Comment`. 
Next, there are the user-related interactions: following/followers is largely abstracted away since it's a `User` to `User` relationship. `Like` and `Share` are new models, and do have similar mechanisms that feel like they could be grouped. What did I just call these things? **Interaction**, that's one (in fact, I also used it in the opening paragraph). I also used **network** which would describe the following/follower relationship, but again, networking is computer jargon and that could collide with a lot of other concepts we are working with. I think **interaction** is a good one: ![Twitter domain modeling example with contexts](/assets/images/twitter-ddd.png) This one was a bit easier given our previous look, but again, I really want to stress that **there is no right answer**. All we are attempting to do here is make it easy for you to figure out the naming, but at the end of the day, there's no actual penalty in naming or grouping these however you like. In Phoenix, there is certainly some cost with talking between contexts, but it's not an insurmountable issue. ### Example 3: Google Drive Google Drive is a cloud file storage provider. You can access files from anywhere in the world on any device, provided you have a Google account. In addition to reading files, you can collaboratively write to files with other users who have permission to read (and write) these files. Further, you can collaborate via tools like comments and sticky notes to provide feedback even if you cannot directly write to the file. Google Drive does not provide universal file capabilities and is mainly focused on reading/writing for documents, spreadsheets, and slideshow presentations. ![Google Drive domain modeling example](/assets/images/google-drive-no-ddd.png) This one is a bit more complex. We have quite a few more models here with a bit more ambiguity for how we can split things up. 
Let's start with the easiest grouping, the models that all exist on the same level: `Document`, `Spreadsheet`, and `Presentation`. How did I group those in the above description? I called them all **files**. Normally we'd red-flag this context because it's too computer-specific of a domain, but remember, this service is _very_ computer focused by its very nature, so it's actually okay that we give this a context like this. On the next layer up we have our `IO`, which handles permissions for file read and write access controls, and `Comment`, which is a special kind of write. Finally, we have the `User` at the very top, which touches everything. Now your gut might tell you that reading, writing, and commenting are **interactions** just like in the Twitter example. But we can't do that for a few reasons: 1. The methods for determining read and write access are encapsulated within `IO` - they are permission grants to unlock certain functionality, but `IO` in and of itself is not an interaction 2. Remember, `User` has access to all of these models. Even though this UML diagram lays everything out in a hierarchy, it's not a simple three-tiered relationship, so in a way, we _interact_ with our files as well by virtue of owning them. Let's offer a different narrative. We mentioned earlier that we can collaborate with other users to perform actions. The noun here is **collaboration**. Additionally, those actions are governed by **permissions**. Which do we choose? This is probably the toughest one of all. Collaboration implies an innate connection with others, even though you can manipulate a file without any other users. Permission, on the other hand, makes sense with both user interaction and `IO` but seems to leave `Comment` out in the cold. The key clue we have is that **`Comment` is a subclass of `IO`, so we can lower the priority of making `Comment` work for our context**. 
In other words, since we know permission is a sensible context for both `User` and `IO`, it stands to reason that **permission** will be a good context name because `Comment` is a type of `IO` in this domain. ![Google Drive domain modeling example with contexts](/assets/images/google-drive-ddd.png) This leaves plenty of room to expand with both file types and ways of handling things centered around the `User`, such as authorization and authentication. Both contexts also pretty cleanly separate the levels of hierarchy between data models. ## Wrapping up Let's summarize our heuristic into three simple steps: 1. **Map the domain.** We used UML diagrams to lay out the hierarchy of our models to provide visual clues for possible contexts. 2. **Horizontal or vertical alignment provides clues.** If something falls along the same plane or area of your diagram, you can bet it's a context. 3. **Describe that alignment in one word, preferring the noun with the more specific domain.** This is the hardest part, but the idea here is to take the description of our application and pick out the nouns (and sometimes verbs) that associate our related models. From there, we simply prune down to the word with the clearest message and the least ambiguity. Naming contexts is hard. It can be made easier if you follow these guidelines the next time you write an application using domain-driven design. Make sure you don't stop with just your first iteration. Growing applications require consistent refinement. Sometimes you'll need to split off a context into multiple contexts, while other times you'll need to consolidate contexts. Don't be afraid to go in either direction. Finally, don't be afraid to be wrong, because there is no definitive right answer either. Good luck and happy naming! 
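None of this heuristic is Phoenix-specific. As a rough illustration only (the context names come from the Quora example above; the model shapes are invented for the sketch), contexts can be expressed as module boundaries in plain JavaScript:

```javascript
// Illustrative sketch: the "forum" and "hierarchy" contexts from the Quora
// example, expressed as plain objects acting as module boundaries. Each
// context owns its models; outside code only goes through the context.
var forum = {
  createQuestion: function (title) {
    return { type: 'question', title: title, answers: [] };
  },
  answerQuestion: function (question, body) {
    var answer = { type: 'answer', body: body };
    question.answers.push(answer);
    return answer;
  }
};

var hierarchy = {
  createOrganization: function (name) {
    return { type: 'organization', name: name, users: [] };
  },
  addUser: function (organization, userName) {
    var user = { type: 'user', name: userName };
    organization.users.push(user);
    return user;
  }
};
```

Because callers only ever touch `forum` and `hierarchy`, renaming a context later stays a mechanical find-and-replace rather than a redesign, which is exactly the flexibility argued for above.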
]]> 2018-05-30T23:25:00+00:00 How to Load Images to Cut Page Time https://www.adamconrad.dev/blog/how-to-load-images-to-cut-page-time/ Thu, 24 May 2018 18:25:00 +0000 https://www.adamconrad.dev/blog/how-to-load-images-to-cut-page-time/ Here is the full script:

```javascript
var images = document.querySelectorAll('.js-lazy-image');
var config = {
  rootMargin: '50px 0px',
  threshold: 0.01
};
var imageCount = images.length;
var observer;

var fetchImage = function(url) {
  return new Promise(function(resolve, reject) {
    var image = new Image();
    image.src = url;
    image.onload = resolve;
    image.onerror = reject;
  });
}

var preloadImage = function(image) {
  var src = image.dataset.src;
  if (!src) {
    return;
  }
  return fetchImage(src).then(function() {
    applyImage(image, src);
  });
}

var loadImagesImmediately = function(images) {
  for (var i = 0; i < images.length; i++) {
    preloadImage(images[i]);
  }
}

var onIntersection = function(entries) {
  if (imageCount === 0) {
    observer.disconnect();
  }
  for (var i = 0; i < entries.length; i++) {
    var entry = entries[i];
    if (entry.intersectionRatio > 0) {
      imageCount--;
      observer.unobserve(entry.target);
      preloadImage(entry.target);
    }
  }
}

var applyImage = function(img, src) {
  img.classList.add('js-lazy-image--handled');
  img.src = src;
}

if (!('IntersectionObserver' in window)) {
  loadImagesImmediately(images);
} else {
  observer = new IntersectionObserver(onIntersection, config);
  for (var i = 0; i < images.length; i++) {
    var image = images[i];
    if (image.classList.contains('js-lazy-image--handled')) {
      continue;
    }
    observer.observe(image);
  }
}
```

## How it works

This script works by looking for images that have the class **js-lazy-image**. Those images must have a data attribute called data-src which has the location of the image, just like the normal src attribute would be filled out. The script then counts the number of images with this class, and when the viewport intersects with an observable area that includes the image, the source of the image is loaded from its data attributes, and the image renders on the screen! Here’s an example of how you would call lazy loaded images (the image path is a placeholder):

```html
<img class="js-lazy-image" data-src="images/photo.jpg" alt="Lazy">
```

As you can see, this works almost exactly like a regular image, but you just need to make sure to add on the special JavaScript class and add a **data-** prefix in front of your src attribute. And that’s all it takes to go from this: ![](https://cdn-images-1.medium.com/max/3998/1*ZIW_GhTLsCzE23pXO_7ulg.png) To this: ![](https://cdn-images-1.medium.com/max/3998/1*jVANxQdyxpl8UQ3gmmLr8Q.png)

## How the script works — section by section

Still curious as to how the script works? Let’s break it down by section:

```javascript
var images = document.querySelectorAll('.js-lazy-image');
var config = {
  rootMargin: '50px 0px',
  threshold: 0.01
};
var imageCount = images.length;
var observer;
```

Here we initialize our variables. 
Remember that class we added to our images? This is where we collect our images. The configuration will be used later with our observer class (don’t worry if you don’t know what that is, we’ll get to that in a second). Finally, we store the image count because we’ll be using that a few times throughout our script.

```javascript
var fetchImage = function(url) {
  return new Promise(function(resolve, reject) {
    var image = new Image();
    image.src = url;
    image.onload = resolve;
    image.onerror = reject;
  });
}
```

Our first function deals with grabbing our images. We make use of [JavaScript Promises](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Using_promises) to asynchronously load our images. If the image loads successfully, the Promise resolves; if it fails to load, the Promise rejects. So where is this called from? Glad you asked…

```javascript
var preloadImage = function(image) {
  var src = image.dataset.src;
  if (!src) {
    return;
  }
  return fetchImage(src).then(function() {
    applyImage(image, src);
  });
}
```

This image preload function grabs the image source from the data attribute we tacked onto our image tag. If it doesn’t find a source, no problem, we just stop right there. Otherwise, we fetch our image, and if things go great, we apply the image to the DOM like so:

```javascript
var applyImage = function(img, src) {
  img.classList.add('js-lazy-image--handled');
  img.src = src;
}
```

Our function simply finds the image we’re about to reveal, adds on a class to let our script know it has been handled, and then adds the URL to the image’s src attribute.

```javascript
if (!('IntersectionObserver' in window)) {
  loadImagesImmediately(images);
} else {
  observer = new IntersectionObserver(onIntersection, config);
  for (var i = 0; i < images.length; i++) {
    var image = images[i];
    if (image.classList.contains('js-lazy-image--handled')) {
      continue;
    }
    observer.observe(image);
  }
}
```

This is the meat-and-potatoes of our script. 
The core functionality that enables our lazy loading to happen quickly and efficiently revolves around the [Intersection Observer API](https://developer.mozilla.org/en-US/docs/Web/API/Intersection_Observer_API). This API allows us to track changes to the viewport against target elements asynchronously. Traditionally you would have to do something like this with the [Scroll Event](https://developer.mozilla.org/en-US/docs/Web/Events/scroll), which is both slow and called constantly. To combat this performance hiccup, you might use [debouncing or throttling](https://css-tricks.com/debouncing-throttling-explained-examples/) to limit the number of scroll event requests. But with Intersection Observers, this is all handled for you. Remember the config variable at the top of our script? This is the configuration for our Intersection Observer. It tells us that when we are within a margin of 50px (vertically) from our image, that is when we want to activate our observation callback. The threshold, as you might guess, is the tolerance for what percentage of the object must be observed in order for the callback to be invoked once the margin is reached. In our case, we chose 1%, which is immediately upon bringing the image tag into view. So now that we have that background, we can see how this if statement works. If we see that Intersection Observers are a part of the window object, we know we are in a browser that supports this functionality. As of right now, [Intersection Observers are available on all major browsers except IE and Safari](https://developer.mozilla.org/en-US/docs/Web/API/Intersection_Observer_API#Browser_compatibility). So if you are on IE or Safari, you will load the images immediately. If you aren’t, we create a new Intersection Observer object with the configuration we provided in the beginning, as well as a callback function to trigger when our target observation is reached. 
Finally, we have to tell the Observer exactly what it has to observe for the callback to be initialized. In this case, we are observing all of the images that haven’t already been handled, which are the images that haven’t been applied to the DOM yet (via the applyImage function we saw earlier). So what does loading images and the observation callback look like?

```javascript
var loadImagesImmediately = function(images) {
  for (var i = 0; i < images.length; i++) {
    preloadImage(images[i]);
  }
}
```

For loading images immediately, it’s pretty straightforward. We simply preload all of the images and put them on the screen like we normally would.

```javascript
var onIntersection = function(entries) {
  if (imageCount === 0) {
    observer.disconnect();
  }
  for (var i = 0; i < entries.length; i++) {
    var entry = entries[i];
    if (entry.intersectionRatio > 0) {
      imageCount--;
      observer.unobserve(entry.target);
      preloadImage(entry.target);
    }
  }
}
```

Our intersection callback is a bit more involved. If we have loaded all of our images that have our lazy loading CSS class, we’re done and we can disconnect from our Observer object. Otherwise, for every entry our IntersectionObserver hands us, we want to activate our images. We do that by ensuring we have reached our threshold. **intersectionRatio** reports what fraction of the target image element is visible; any value greater than 0 means the image has crossed the threshold we defined in our configuration. If it has, we have one more image to add to the DOM, which means we can remove 1 from our count of images remaining to load. We can therefore stop observing this image because it’s going to be loaded onto the page. Finally, we use our previously-defined preloadImage to execute our now familiar process of adding the image URL onto the image tag and loading the image into the DOM. 
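To see the preload/apply handoff in isolation, here is a stripped-down sketch that swaps the real `<img>` element for a plain object (so it runs outside a browser, with no Promise or Observer involved; the object shape is invented for illustration):

```javascript
// Illustrative stand-in for an <img>: a plain object with the same
// fields the script touches (dataset.src, src, and a list of classes).
function applyImage(img, src) {
  img.classes.push('js-lazy-image--handled'); // mark the image as handled
  img.src = src;                              // in a browser, this triggers the load
}

function preloadImage(img) {
  var src = img.dataset.src;
  if (!src) return;     // no data-src: nothing to lazy load
  applyImage(img, src); // in the real script, this runs after fetchImage resolves
}

var img = { classes: ['js-lazy-image'], dataset: { src: 'photo.jpg' }, src: '' };
preloadImage(img);
// img.src is now 'photo.jpg' and the handled class has been appended
```

The real script does exactly this, just asynchronously and only once the Observer reports the image is near the viewport.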
## Next steps **Lazy loading images is a quick and painless way to drastically improve the performance of your pages that use a lot of imagery.** From there, be sure to [compress your images](https://adamconrad.dev/the-fastest-way-to-increase-your-sites-performance-now/) and use the correct image format so you keep a small footprint. Tools like [MachMetrics](https://www.machmetrics.com/) are a great way to track your performance improvements over time, as well as provide additional suggestions on how to continuously improve the performance of your application. What other quick wins do you have for speeding up your site? Leave a reply in the comments below! *Originally published at [www.machmetrics.com](https://www.machmetrics.com/speed-blog/how-to-lazy-loading-images-script-slash-page-load-time/) on May 24, 2018.* ]]> 2018-05-24T18:25:00+00:00 Here is the Full List of My 50+ Remote Job Sites https://www.adamconrad.dev/blog/here-is-the-full-list-of-my-50-remote-job-sites/ Tue, 15 May 2018 23:25:00 +0000 https://www.adamconrad.dev/blog/here-is-the-full-list-of-my-50-remote-job-sites/ 2018-05-15T23:25:00+00:00 The Fastest Way to Increase Your Site's Performance Now https://www.adamconrad.dev/blog/the-fastest-way-to-increase-your-sites-performance-now/ Wed, 25 Apr 2018 04:25:00 +0000 https://www.adamconrad.dev/blog/the-fastest-way-to-increase-your-sites-performance-now/ At this point, you've already exhausted even the most advanced compression optimization techniques. And while these programs do all of the work for you in a matter of seconds, the truth is you may need to scrap your images altogether if they are in the wrong image format. How do you know if they're in the right format? Follow this simple questionnaire and then save your image into the new file format (and then repeat levels 1 and 2 to compress your new images!). #### Does it look like a photograph? Use a JPG JPGs were meant to be used with photographs. 
If you have avatars of your team, photographs of your office, or other real-life images, make sure they are in the JPG format. #### Does it look like a computer-generated graphic or drawing? Use a PNG Everything else can be a PNG. Other formats may provide better quality, but if you're serving a web or image application, a PNG will do and is universally read by every kind of application. The one exception to this would be icons... #### Does it look like an icon? Use an SVG Icons are also computer-generated, but the difference is that icons are generally used alongside typography. If you think the image will be used in a menu, on a button, or is meant to symbolize something, it's probably an icon, and icons will benefit from being SVGs because they can be scaled along with your type and lose zero fidelity. They will look just as crisp as your fonts and will be much smaller as SVGs as well. #### Are you supporting the latest browsers and don't care about Firefox? Use WebP, JPEG 2000, and JPEG XR Finally, there is a push for next-generation image formats. JPG and PNG have been around for more than two decades, and it's about time we have some new formats that innovate to maintain image quality without bloating our applications. Even better, they don't require you to decide between image types. For example, WebP works great for both photographs and computer-generated images. The downside is that support is fragmented across devices and browsers. [WebP](https://developers.google.com/speed/webp/) was made by Google, so it's naturally designed only for Chrome and Chrome mobile. JPG also has evolved formats, but [JPEG2000](https://jpeg.org/jpeg2000/index.html) is only supported by Apple (Safari and Safari Mobile), while [JPEG XR](https://jpeg.org/jpegxr/index.html) is only supported by Microsoft (IE and IE Edge). What about Firefox? 
There is no next-gen format for this browser, but they do have a [2-year-old bug ticket](https://bugzilla.mozilla.org/show_bug.cgi?id=1294490) to implement WebP and it is assigned, but who knows when this will land. ### Level 4: Source sets & fallbacks If you've chosen the correct image and you've compressed the heck out of it, you'll want to make sure it's being served on the right device in the right aspect ratio. I alluded to this back in Level 2, but if you have concerns about Retina displays like MacBook Pros, or even x3 quality for the newest iPhones, you'll want to make multiple copies of your image in all of these formats. Going back to the previous example, if you serve avatars in a 64x64 JPG format, you'll also want to make copies of dimension 128x128 for Retina _and_ 192x192 for Retina x3. The best way to do this is to **start with a larger image and scale down, rather than scale up**. You know those crime dramas where they ask the hacker to "_Zoom! Enhance!_"? We all know that doesn't work in real life, and that same thing holds true for your images - you can't add clarity where there was none in the first place. Instead, start with the original source image (say, 512x512) and scale down to 192, save a copy, then 128, save another copy, then 64, and save that last copy. This will result in a less blurry, albeit still lossy (because you are removing information in the form of pixel fidelity) set of images. With all of these duplicate, scaled images, how do we tie this all together? The image attribute known as `srcset` comes to the rescue:

```html
<img src="avatar-sm.jpg"
     srcset="avatar-md.jpg 2x, avatar-lg.jpg 3x"
     alt="Your responsive photo avatar">
```

`Srcset` is pretty amazing. It will always default to the original `1x` magnifier like a normal image tag would. However, if it does find the other images, it will apply them to the given aspect ratios for your device. 
In other words, if you are viewing the photo on a 2015 MacBook Pro, the browser will select `avatar-md.jpg`, but if you are on an iPhone X, it will select `avatar-lg.jpg`. And if you're in a military bunker using IE8, it will fall back to `avatar-sm.jpg`. `Sizes` is another property for responsive images but it relies on the width of the device rather than the pixel density. The format is the same (the filenames and breakpoints here are illustrative):

```html
<img src="background-sm.jpg"
     srcset="background-sm.jpg 300w,
             background-md.jpg 768w,
             background-lg.jpg 1280w"
     sizes="(max-width: 600px) 100vw, 50vw"
     alt="Your responsive background">
```

You specify an image, space, the size the image should be displayed, with a `w` descriptor for the `srcset`, and then, using the `sizes` attribute, specify the media queries at which point the various sources should be used. The only downside? **`Srcset` is not supported in IE**. It _is_ supported in IE Edge, and the `sizes` attribute is supported everywhere. In my honest opinion, this isn't something to worry about because all of the Retina devices using IE are already new enough to support IE Edge. Anything that still requires IE11 and down likely doesn't have a Retina display anyway (unless it is connected to a hi-density external monitor) so you likely won't run into this problem being a real blocker for you. ## Something is better than nothing This is not an exhaustive list of image optimization techniques nor is it meant to be a prescriptive formula. **Even if you only run ImageOptim on all of your images in your application, you could be saving upwards of 60-80% of what your users have to download.** Images, audio, and video, _not_ your source code, will comprise the largest space for your application by far. Effectively choosing, compressing, and displaying your images will have a marked impact on both application size and performance for your end user experience. The best part is it only takes a few downloads and a few more seconds to run your images through a very easy set of tools that will provide an instant improvement without sacrificing quality. ]]> 2018-04-25T04:25:00+00:00