Today I have a nice SQL tip about sampling tables in SQL. (I’m writing about PostgreSQL as usual but this applies to other databases, too, since this feature is in the SQL standard.) Imagine you have a large table and you want to get a general feeling about what its contents look like. Saying select * from my_table limit 10 may be a nice idea, but it doesn’t always suffice. For example, if you have a table where you never delete or update rows, only insert them, that query will often just show the earliest 10 rows, and you might want to see how the contents of the table changed over time.
You can say, for example,
select * from my_table tablesample bernoulli (5)
and see about 5 percent of rows from table my_table. (It’s approximate because each row is included in the result with a probability of 5%, so depending on the RNG there may be more or fewer rows in the result.) The mysterious bernoulli keyword tells Postgres that the decision whether to include a row or not must be made for each row individually; if you use system instead, these decisions are made for pages (by default, a page contains 8kB of data, so if your row contains, say, half a kilobyte of data on average, then a page may have about 16 rows – or probably 15, since there is some overhead for the page header), which is “less random” but much faster.
One more thing worth knowing is the repeatable keyword, which allows to provide a random seed. If you say
select * from my_table tablesample bernoulli (5) repeatable (1337)
you will get the exact same 5% of rows of the underlying table every time (provided that you don’t make any changes to its contents, of course).
That’s it for today, see you next time!
Sometimes you need to create an array with the sequence of numbers, say from 0 to 9. Popular libraries like Lodash or Ramda have a function for that (in both cases, it’s called range), but how to do that concisely in vanilla JS? In case you’re wondering, both Lodash and Ramda use a while loop for this task; can we do better? (In this case “better” does not necessarily mean “faster”, but “shorter”, and – let’s be honest — “in a more clever/hackish/crazy way”.) It turns out that the answer is yes. The idea is to use Array.from with an array-like object possessing only the length property. Array.from has an optional parameter – a mapping function, which is called for every element of the “source array”. Of course, in the case of the {length: ...} object there are no elements, so the argument to the mapping function is effectively undefined. However, this function also receives a second argument, which is the index of the element in the “source array” (much like in the regular Array.prototype.map). This means that we can do this:
Array.from({length: 8}, (_, i) => i)
Interestingly, unlike the regular map, the mapping function in Array.from does not receive the whole “source array” as the third argument. Go figure.
Of course, it’s still possible to use this trick to generate sequences not starting at 0 and with step different than 1 – but the code becomes longer. Here is a snippet generating the sequence of all two-digit odd numbers:
Array.from({length: 45}, (_, i) => 11 + i * 2)
Is this all Array.from can do? Of course not. Another use-case is deduplication (via a set):
Array.from(new Set([1, 1, 2, 3, 5, 8])) // => [ 1, 2, 3, 5, 8 ]
Yet another is splitting strings into individual characters. A popular way to achieve that is to say string.split(''), but this won’t work with certain Unicode characters:
'I🧡JS🤣'.split('') // => [ 'I', '\ud83e', '\udde1', 'J', 'S', '\ud83e', '\udd23' ]
Array.from('I🧡JS🤣') // => [ 'I', '🧡', 'J', 'S', '🤣' ]
In this case, Array.from is not indispensable – a similar effect can be achieved with the spread operator:
[...'I🧡JS🤣'] // => [ 'I', '🧡', 'J', 'S', '🤣' ]
but you could argue that Array.from is slightly more idiomatic since it makes the intent of “converting into an array” more explicit.
That’s it for today – see you next week!
A bit over a decade ago (!) I wrote about Iedit. It’s a very cool package, a bit similar to multiple cursors, very convenient for changing variable names (especially that it has a great feature where the change is restricted to the current function).
I am also a Lispy user. Lispy requires Iedit (and has a binding for it different from the default one – M-i – while Iedit’s default I’m used to is C-;). The problem is, when I added Lispy to my Emacs, it disabled the default C-; (and only installed M-i in Elisp buffers).
Now, I admit that M-i may be a better (or at least not worse) keybinding for Iedit than C-;. It’s default binding is tab-to-tab-stop, which is one of those useless commands Emacs has had probably for decades. Personally, I’m accustomed to C-;, so I wanted Lispy not to interfere with Iedit setting that keybinding.
It turns out that all I had to do was to make sure that Iedit is loaded before Lispy. I think the reason is that Lispy says
(setq iedit-toggle-key-default nil)
and Iedit says
(defcustom iedit-toggle-key-default (kbd "C-;") ; ... )
and defcustom only sets the value when not already set. So, if you want both Iedit in non-ELisp buffers and Lispy, just use your favorite package manager to make sure that Lispy is loaded after Iedit and you’re done! And if you want M-i to be the Iedit entry point instead of C-; in all buffers, just make sure that
(keymap-global-unset "M-i") (setq iedit-toggle-key-default (kbd "M-i"))
is evaluated before loading Iedit. (The unsetting is needed because Iedit won’t let you use a keybinding which is already taken by another command.)
That’s it for today, and if you don’t use Iedit (or Lispy), definitely check those packages out – they are both really great!