Today, I'd like to share a paper and a story that made me go through many (emotional) phases of research. The paper is Variational Inference for Uncertainty Quantification, written with Lawrence Saul and Loucas Pillaud-Vivien.
Disclaimer: During the job search, this was not the paper I spoke about :) My favorite paper (that I've written) is probably still the nested $\widehat R$ paper. But when I got home from an interview, I started thinking that the VI for uncertainty paper could've also been a good one to discuss, and I drafted this blog post way back in February 2025.
The paper has now appeared in the Journal of Machine Learning Research. Here's the abstract:
Given an intractable distribution $p$, the problem of variational inference (VI) is to find the best approximation from some more tractable family $\mathcal Q$. Commonly, one chooses $\mathcal Q$ to be a family of factorized distributions (i.e., the mean-field assumption), even though $p$ itself does not factorize. We show that this mismatch can lead to an impossibility theorem: if $p$ does not factorize and furthermore has a non-diagonal covariance matrix, then any factorized approximation $q \in \mathcal Q$ can correctly estimate at most one of the following three measures of uncertainty: (i) the marginal variances, (ii) the marginal precisions, or (iii) the generalized variance (which for elliptical distributions is closely related to the entropy). In practice, the best variational approximation in $\mathcal Q$ is found by minimizing some divergence $D(q,p)$ between distributions, and so we ask: how does the choice of divergence determine which measure of uncertainty, if any, is correctly estimated by VI? We consider the classic Kullback-Leibler divergences, the more general $\alpha$-divergences, and a score-based divergence which compares $\nabla \log p$ and $\nabla \log q$. We thoroughly analyze the case where $p$ is a Gaussian and $q$ is a (factorized) Gaussian. In this setting, we show that all the considered divergences can be ordered based on the estimates of uncertainty they yield as objective functions for VI. Finally, we empirically evaluate the validity of this ordering when the target distribution $p$ is not Gaussian.
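To make the impossibility statement in the abstract concrete, here is a minimal numerical sketch for a Gaussian target. The covariance matrix, the helper names `factorized_candidates` and `report`, and the uniform rescaling used to match the generalized variance are all assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np

def factorized_candidates(cov):
    """Diagonal variances of three factorized Gaussians, each built to match
    one measure of uncertainty of a Gaussian target with covariance `cov`."""
    n = cov.shape[0]
    # (i) Match the marginal variances.
    match_variance = np.diag(cov).copy()
    # (ii) Match the marginal precisions (diagonal of the inverse covariance).
    match_precision = 1.0 / np.diag(np.linalg.inv(cov))
    # (iii) Match the generalized variance det(cov). Illustrative choice:
    # rescale the marginal variances uniformly until the determinants agree.
    _, logdet = np.linalg.slogdet(cov)
    scale = np.exp((logdet - np.sum(np.log(match_variance))) / n)
    match_gen_variance = scale * match_variance
    return match_variance, match_precision, match_gen_variance

def report(cov, variances):
    """Which of the three measures does the factorized Gaussian reproduce?"""
    _, logdet = np.linalg.slogdet(cov)
    return {
        "variance": bool(np.allclose(variances, np.diag(cov))),
        "precision": bool(np.allclose(1.0 / variances, np.diag(np.linalg.inv(cov)))),
        "generalized_variance": bool(np.isclose(np.sum(np.log(variances)), logdet)),
    }

# Hypothetical non-diagonal, positive-definite target covariance.
cov = np.array([[1.0, 0.6, 0.3],
                [0.6, 2.0, 0.5],
                [0.3, 0.5, 1.5]])

for name, variances in zip(("variance", "precision", "generalized variance"),
                           factorized_candidates(cov)):
    print(f"matching {name}:", report(cov, variances))
```

On this example, each candidate reproduces the measure it was built for and misses the other two. Matching the variances overshoots the generalized variance and matching the precisions undershoots it, which follows from Hadamard's inequality applied to the covariance and to its inverse.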
(February 9th 2025)
A little over two years ago, I started working on variational inference (VI). As I read the literature, I kept coming across the comment that "VI underestimates uncertainty" and I wanted to convince myself of this fact with a simple example. Say I approximate a non-factorized Gaussian with a factorized Gaussian: can I show that the approximation always underestimates the marginal variances? So I scribbled and scribbled, and I didn't get anywhere. No problem: VI legend Lawrence Saul was down the hall (at the Flatiron Institute), and I submitted this textbook problem to him. He agreed the result seemed elementary. In fact, many books and review papers proved the claim in 2-D, and it seemed straightforward to generalize it to higher dimensions.
So we went to the blackboard, and we tinkered, and tinkered, and... hmmm... could this problem be harder than we thought?
We did eventually write a proof: it was short but unintuitive. (I didn't like this proof.) (The excellent paper by Turner & Sahani (2011) has a statement of the result, albeit without a proof.) Then we derived more results on variance estimation and wrote a precursor to our paper on VI uncertainty.
Lawrence felt strongly that in addition to variance we should also analyze entropy as a measure of uncertainty. And we did find that VI also underestimates entropy. But when I ran numerical experiments, something surprising transpired. As dimension increased, I found that estimates of the variance became worse, while estimates of the entropy improved. This contradicted our visuals: the volume of the approximating sphere was clearly much smaller than the volume of the target ellipsoid. For entropies to match, you would need the volumes of the two objects to be the same.
I checked and double-checked my code, and I couldn't find an error. And then I had a happy thought. The two-dimensional picture we were looking at was misleading. The sphere was not smaller than the ellipsoid, even though its shadow was. Each time we "added" back a dimension, the sphere would grow in every direction, while the ellipsoid would only grow a little bit. Until eventually, the volume of the two objects nearly matched.
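The dimension effect can be reproduced with a small experiment. The sketch below rests on several assumptions that the story does not spell out: the target is a Gaussian with unit variances and a constant correlation `rho`, the approximation is the factorized Gaussian that minimizes the reverse KL divergence (whose variances are the reciprocals of the diagonal of the target's precision matrix, a standard textbook result), and the entropy gap is reported per dimension. The function names are hypothetical.

```python
import numpy as np

def reverse_kl_mean_field(cov):
    """Variances of the factorized Gaussian minimizing KL(q || p)
    when p is Gaussian with covariance `cov`: 1 / diag(cov^{-1})."""
    return 1.0 / np.diag(np.linalg.inv(cov))

def summarize(n, rho=0.5):
    """Variance ratio and per-dimension entropy gap in dimension n
    for an equicorrelated target (an assumed test case)."""
    cov = (1.0 - rho) * np.eye(n) + rho * np.ones((n, n))
    q_var = reverse_kl_mean_field(cov)
    # Ratio of estimated to true marginal variance (same for every coordinate).
    variance_ratio = q_var[0] / cov[0, 0]
    # Gaussian entropies differ by half the difference of log-determinants.
    _, logdet_p = np.linalg.slogdet(cov)
    entropy_gap_per_dim = 0.5 * (logdet_p - np.sum(np.log(q_var))) / n
    return variance_ratio, entropy_gap_per_dim

for n in (2, 10, 100, 1000):
    ratio, gap = summarize(n)
    print(f"n={n:5d}  variance ratio={ratio:.4f}  entropy gap per dim={gap:.5f}")
```

With this setup, the variance ratio drifts further from 1 as the dimension grows, while the per-dimension entropy gap shrinks toward 0 in high dimensions. Whether this normalization is the one used in the original experiments is an assumption.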
Here's another way to understand the result. Setting correlations to 0 (as one
does with the factorized/mean-field approximation) increases entropy. If our
goal is to match the entropy of the target, the increase in volume caused by
the null correlation must be compensated by a shrinkage in the marginal variances.
Conversely, matching the variances means overestimating the entropy of the target. The two
measures of uncertainty therefore compete with one another. This fact is
elegantly captured by an equation we termed the shrinkage-delinkage trade-off.
Here it is, without any proper definition of the terms but just to highlight its simplicity:
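For a Gaussian target, the competition between the two measures can be sketched with the identity below. The notation is assumed for illustration and may differ from the paper's: $\Sigma$ is the target covariance with correlation matrix $C$, $\Psi$ is the diagonal covariance of the factorized approximation $q$, and $s_i$ is the ratio of estimated to true marginal variance. The labels on the two terms are an assumed reading of the trade-off's name.

```latex
\begin{align*}
\mathcal{H}(q) - \mathcal{H}(p)
  &= \tfrac{1}{2} \log |\Psi| - \tfrac{1}{2} \log |\Sigma| \\
  &= \underbrace{\tfrac{1}{2} \sum_{i=1}^{n} \log s_i}_{\text{shrinkage}}
     \; \underbrace{- \, \tfrac{1}{2} \log |C|}_{\text{delinkage}},
  \qquad s_i = \frac{\Psi_{ii}}{\Sigma_{ii}}.
\end{align*}
```

The second line uses $\log|\Sigma| = \sum_i \log \Sigma_{ii} + \log|C|$. Since $\log |C| \le 0$, with equality only when $C$ is the identity, the delinkage term is nonnegative. Matching the variances ($s_i = 1$) therefore overestimates the entropy, and matching the entropy forces $\sum_i \log s_i = \log |C| < 0$, so at least one variance must shrink.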
I presented this result at the UAI conference 2023 in Pittsburgh. It was a fun
conference, mostly because I met some amazing PhD students and postdocs to
hang out with. I should now mention that all the work Lawrence and I did was
based on minimizing the reverse Kullback-Leibler divergence,
So back to the blackboard (we have beautiful blackboards at the Flatiron Institute). I looked at a few divergences and even some metrics. Did you know there is no analytical expression for the total variation distance between two multivariate Gaussians? It turned out the KL-divergence was particularly easy to manipulate, with other divergences posing additional challenges. Still, we made progress, one divergence at a time.
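Part of what makes the KL divergence easy to manipulate is that it has a closed form between two Gaussians. The sketch below implements that standard formula and checks numerically, on an assumed 2-D target with correlation 0.8, that the factorized Gaussian with variances $1 / (\Sigma^{-1})_{ii}$ attains a lower reverse KL than rescaled versions of itself. The example values and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def kl_gaussian(mean_q, cov_q, mean_p, cov_p):
    """KL(q || p) in nats for q = N(mean_q, cov_q) and p = N(mean_p, cov_p)."""
    n = mean_q.shape[0]
    prec_p = np.linalg.inv(cov_p)
    diff = mean_p - mean_q
    _, logdet_q = np.linalg.slogdet(cov_q)
    _, logdet_p = np.linalg.slogdet(cov_p)
    return 0.5 * (np.trace(prec_p @ cov_q) + diff @ prec_p @ diff
                  - n + logdet_p - logdet_q)

# Assumed target: zero mean, unit variances, correlation 0.8.
mean = np.zeros(2)
cov_p = np.array([[1.0, 0.8],
                  [0.8, 1.0]])

# Candidate optimum of the reverse KL over factorized Gaussians.
best_var = 1.0 / np.diag(np.linalg.inv(cov_p))
kl_best = kl_gaussian(mean, np.diag(best_var), mean, cov_p)

# Rescaling the variances in either direction should increase KL(q || p).
kl_perturbed = [kl_gaussian(mean, np.diag(scale * best_var), mean, cov_p)
                for scale in (0.5, 0.9, 1.1, 2.0)]

print("variances of q:", best_var, " true variances:", np.diag(cov_p))
print("KL at candidate optimum:", kl_best)
print("KL after rescaling:", kl_perturbed)
```

The check only probes uniform rescalings, so it is a sanity test and not a proof of optimality. It does show the familiar picture: the variances of `q` sit below the true marginal variances.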
I had another small breakthrough on Thanksgiving. I remember I was taking the bus to Pennsylvania. I had missed my original bus and then, I received an email, telling me another paper of mine had gotten rejected. I read the reviews and felt absolutely gutted. I can still see it: the overcrowded bus terminal, the endless waiting, the phone call I made to a close one, the reviews I read and re-read and re-read. It was a sunny day. I was fortunate to be with family that day. And the next morning, I woke up early and felt the need to make amends for my "failure". I sat in the kitchen and extended the shrinkage-delinkage trade-off to a three-way impossibility theorem between precision, variance, and entropy.
I had intended to submit a manuscript by the end of November, with the hope that within a year, my paper would be reviewed and accepted, and that this would strengthen my application for my next job. (None of this happened.) Some of the results resisted us. Lawrence and I had an incredibly difficult time proving an "ordering" of the Rényi $\alpha$-divergences. We felt close: each week, we believed we would finish the proof; and each week, the last piece of the puzzle eluded us.
One day, I was working at the whiteboard, messing around with the terms of an equation. My then fellow postdoc Loucas Pillaud-Vivien walked by and asked me what I was working on. So I explained the problem. He then grabbed a marker and began working in his corner of the whiteboard. He shared his ideas, his perspective as a theorist and a probabilist. He spoke about "agreeable facts"; he dug out results from linear algebra that I wasn't familiar with. It was also fun for me to revisit the topic in French; somehow it gave me a fresh perspective. (Loucas is also French.)
And so, Loucas joined forces with Lawrence and me. I remember the day when we finally cracked the proof. It was a Friday. We felt close (as always) and it was getting late. Another one of our colleagues brought beers and we drank them in front of the whiteboard. Now we were one or two details away from the contradiction that would complete the proof. But I had a ballroom dance practice and I left the office.
Later that night, while I was warming up at the dance studio, I got a message from Loucas: "I think I got it. You can dance in peace." Dance in peace?? No, I had to see the proof for myself before being "at peace". Right after practice, Loucas and I met at the usual Mexican bar and he completed the proof on a napkin, which is a cliché, but we had run out of paper.
After that, it took me quite a bit of time to check all the proofs, scattered in my notebooks, and patch together the missing details. And of course, Lawrence made a heroic effort editing the manuscript and making sure it lived up to his very high writing standards. I suppose the story is far from finished, since the paper is still under review...!
(November 11th)
I don't believe there's anything extraordinary about this project. It's a typical good research story and illustrates one of the happy collaborations I had as a postdoc. Some of the results seem elementary, or to use a more flattering word, fundamental. (A reviewer criticized results I had in another paper as "elementary" and it had never occurred to me that this word could have a negative connotation.) I trust we will see the results we derived in upcoming textbooks. With time, I was able to simplify several of the proofs. At this point, I am aware of four proofs to show VI underestimates variance, each offering its particular perspective.
p.s. The paper was desk-rejected from a first journal. We then revised it quite a bit and sent it to JMLR. After a few months, we got requests for some minor revisions and the paper was published another few months later.
]]>Last fall, I went on the academic job market and applied for tenure-track faculty positions primarily in Statistics. I'm happy to report I received multiple offers and accepted a position at the University of British Columbia in Vancouver 🇨🇦
I've benefited a lot from the guidance of mentors and peers, and the occasional blog post. A blog post is no substitute for the advice of a seasoned academic, still it may provide unexpected tips and it can make the job search feel like a less lonely endeavor. With that in mind, I'll contribute my brick to the edifice and describe my personal experience.
I like to call the job search the job market campaign, because it will take up a lot of your time and energy. It'll be the proverbial full-time job. And so, here are two pieces of advice I received and I'm happy to pass on:
Don't get me wrong: you still need to lay the groundwork beforehand. That means doing research, attending conferences, advertising that you will be applying for positions (that takes courage but do it; you want to hear about opportunities), and working out who your three or four letter writers will be. Keep in mind that some institutions require someone to write about your teaching.
Another good piece of advice is to start acting like a professor (this is pretty much the one thing I remember from reading The Professor Is In when I was a PhD student). That can mean several things but essentially: take charge. Organize conference sessions, invite seminar speakers, lead your research.
Here's some wisdom from my advisor: only apply if there is a chance you would accept an offer. Otherwise, you're wasting their time and your time. That said, you should keep an open mind. The chance of accepting the offer need not be high. Think about your hard constraints and your soft constraints. For example, a hard constraint might be living with a significant other. A soft constraint might be: you prefer a city to a small town. If something doesn't meet your soft constraints, that shouldn't prevent you from applying to an otherwise good department.
The other thing is that you'll learn a lot about yourself while you're applying: as you visit universities, as you (fingers crossed!) consider multiple offers, as your personal situation evolves, heck even as geopolitics change... You'll learn what your priorities are as you go through the process.
I first made a list based on universities I was familiar with, mostly because I knew at least one good professor there. If I was on good terms with that professor, I would reach out to ask them about the job opening and whether they thought I would be a good fit. (Often, I had already had this conversation with the professor at a conference or seminar.)
Then, I added research universities in locations that I liked and where I could imagine myself living. I looked at listings such as asa career, imstat jobs, and statistics jobs.
Once I compiled the list, I went over it with my letter writers and got some additional recommendations. Then I put together a big Excel sheet and wrote down deadlines and unconventional requirements.
I applied to about 50 places. Some people think it's not enough, others that it's too much. I was applying to different countries, so that increased the number of places I was willing to consider. What's more, many institutions require the same documents, so you can streamline the process.
The standard requirements are: a CV, a cover letter, a research statement, a teaching statement, sometimes a diversity statement, and at least three letters of recommendation.
Here are some recommendations:
Regarding the cover letter, I received two contradictory pieces of advice, which I feel compelled to share:
I really like this last piece of advice. I find it considerate. However, given the number of places I was applying to, I couldn't do it for every university. But there were a handful of places where I felt a particular affinity, and I wrote down why. (In the end, I got interviews both at places that received a custom cover letter and at ones that got a generic letter.)
Final advice: proof-read, read your statements out loud, and have two or three friends or colleagues proof-read your application. I was incredibly lucky in that respect. I had great writers read my material with patience and kindness. If you ask for more feedback, you'll start getting contradictory advice. And also: no advice is sacred. Write in your own voice.
If you're fortunate enough, you'll get interviews and you'll need to give a seminar. There is a lot of advice on how to give a good talk, and academics often blatantly disregard it. I believe that, as a community, we should all invest more time into preparing good talks. Think about how much better conferences would be! But until that happens, you have an opportunity to distinguish yourself as a competent speaker.
To me, the most valuable resource has been this lecture by Patrick Winston. I would try and implement every piece of advice, at least as an exercise. You can then adjust to your style.
Beyond that:
Part of the on-campus visit will be one-on-one interviews. I don't have too much advice here. Did I know everyone who was going to interview me? No. In fact, the job search made me realize I didn't know the vast majority of people in my field.
Most professors prepare for the interview and have a set of topics they want to discuss. When that happens, roll with it. The chair will usually cover the big topics (funding, tenure, etc.). Professors you knew beforehand and young faculty are the people who can give you the inside scoop.
If you're interviewing in the winter, bring some cough drops. You and your interviewer might need them.
I had one or two tough interviews and one or two what-should-we-talk-about awkward meetings. But overall, interviews were straightforward and pleasant. People were nice, interesting and interested.
I also really liked the dinner. Yes, don't get drunk, don't be a slob, this is still part of the interview, blah blah blah. But relax. Be yourself. If you don't BS them, they won't BS you. They want to know if you would be a good colleague to hang out with. Everyone at this table wants to have a pleasant dinner. I'd think of it as a small celebration: both you and they put in a lot of hard work for the interview. As a reward you get a three-course meal and a glass or two of wine. (I don't know if they do it for every candidate or just because I'm French, but I was consistently asked to pick a wine for the table, so maybe I have some elementary notions of wine pairing.)
The job market campaign is long. If you're like me, you'll mostly stop doing anything else at work to focus on it. Still: your colleagues will ask for your help on a project, you'll get a damning review that requires a response... and more than that, life won't stop. You'll probably be going through your own set of personal challenges.
Of course, the market itself will be trying. For some time, I wondered why I wasn't hearing from some places. Then I freaked out about scheduling interviews scattered across the world. I questioned whether I had the right priorities: should I move back to Europe to be close to family? Should I stay in New York where I had lived for eight years? Should I...?
When I got my first job offer, I broke down crying. Not tears of joy. I had just finished a full day interview and was prepping a "future job talk" for my next university visit (I really procrastinated on that one and stayed up until 2 am). I was completely depleted. Of course, I was happy about the offer itself. But not relieved: the offer came with an exploding deadline and it seemed likely I would have to turn it down. At that point, I experienced total mental overload. Also, I was sad, because I always imagined that I would be surrounded by people I love to celebrate, if one day I got a faculty job offer. Not alone in a hotel, exhausted and preparing the next interview.
In both a good and a bad way, the job market campaign keeps you busy and forces you to move forward. Traveling gives me a lot of peace: this is where I meditate, look out of the window, and sometimes chat with other travelers (who will find it super exciting that you're trying to become a professor).
Here's my last bit of advice. This one is more personal, so obviously only take what's useful for you:
The outcome of the job search does not define you. A lot of it is outside of your control and this seems more true today than before. After I received my offers, many universities began hiring freezes. This meant that some of the offers I turned down did not go to the next candidate, as would've been the case in a less chaotic year.
When I signed my offer at UBC, a colleague wrote to me: "Well deserved, but the process doesn't always work the way it should, so glad to hear that it worked out."
Some of the best researchers I know did not get faculty positions and still went on to do influential work.
And here's one more small set of fun facts: the first time I applied to grad school, I got no offers. I was the only one in my PhD cohort to fail their qualifying exam at the end of my first year, meaning I had to take it again. At the end of the PhD I applied to a few universities for professor positions and got zero interviews (the toe dipping I don't recommend). When I applied this year, I attended seven in-person interviews, which resulted in five offers.
]]>I'm a fan of historical novels and this is not my first book by Follett. (I've read A Column of Fire, Fall of Giants and Winter of the World.) His books are extremely well-written and captivating: I find them to be wonderful companions when I travel. I've also recommended them to a few friends, including ones who are less in the habit of reading, and they've gone on to read several books by Follett.
A quality I enjoy in Follett's books is that he lets us witness historical events through the eyes of ordinary folks. Sometimes these characters end up playing an instrumental role (in A Column of Fire, one of the main antagonists essentially causes the St. Bartholomew's Day massacre). Oftentimes, the characters merely endure events that surpass them. They have little agency in the unfolding of these events and yet they fully experience their consequences. The Armor of Light, more so than other books I've read by Follett, emphasizes this point.
The book mostly focuses on the town of Kingsbridge and how its inhabitants deal with the impact of the Napoleonic wars (higher taxes, inflation, conscription, and anti-union laws for fear of seeing the sparks of the French Revolution spread to Great Britain). The book doesn't go too deep into how the characters feel about the French Revolution: some express sympathy for the uprising against the aristocracy, and the book often questions the competence of leaders who have inherited their positions rather than earned them; others feel they have a patriotic duty to defend their country against a potential French invasion. But the characters mostly focus on how to improve their livelihood. They fight either to give more rights to workers or to deprive them of those rights; they seek to educate or be educated; they struggle to feed their children; or they compete to win a contract to supply uniforms for the army.
Another major theme in the book is the introduction of machinery in the weaving industry. Naturally, the benefits of the technology are unevenly distributed: the business owners (who, granted, invest and take the risk) reap most of the benefits; the workers, on the other hand, are ruthlessly sacked and find themselves impoverished by the new technology. The more reasonable employers, who care about the well-being of their employees, are forced to follow suit in order to stay competitive and keep their business afloat. The book introduces an unusual character (a working-class child in the first act of the book) who becomes an able engineer, earns his keep selling machines and later finds himself at odds with his stepfather, who lost his position at a mill.
A notable choice is that the book almost exclusively focuses on people in Kingsbridge. This is to be contrasted with A Column of Fire, the previous volume in the Kingsbridge series, whose characters are scattered across England, France, Spain and more. I went into The Armor of Light expecting the same. When I saw the book started in 1792, I hoped to read about the rise of a working-class Frenchman in the ranks of the revolutionary army, one whose perspective would contrast with the British experience of the war; or perhaps a pupil of Beethoven in Vienna, at first enthusiastic about the French republic and later disappointed by the French empire. But Follett's decision to give us only Kingsbridge's perspective is effective: it portrays the war as a distant, almost intangible thing that still completely disrupts the daily life of the protagonists.
One reservation I had while reading the first half of the book is that the novel clearly tells us which characters to root for and which ones to dislike. There is nuance of course: some characters have tragic backgrounds; others are flawed but the novel signals that they are good-hearted and that we should not judge them too harshly. But some characters seem simply there to be disliked. The first chapter already depicts one such character as absolutely despicable. He becomes a formidable adversary to one of the protagonists. Emotionally, this is effective: it makes us root for a character, it creates suspense and a conflict whose resolution we care about. But it also makes the antagonist seem flat. A mediocre and yet incredibly destructive being. An unrelatable person. I prefer it when the characters can be understood and we can have some sympathy for them, even if we ultimately disagree with their actions. This bothered me a bit but it certainly did not stop me from reading. Which is good because most of the characters do eventually change and undergo their arcs, even though it takes many pages or many years in the story. Sometimes, the arc carries across generations, with the children refusing to live as their parents did, which is always a powerful theme. All in all, the novel reminded me that life is long, very long, and that many things will change as the decades march by.
In conclusion, it was a very enjoyable and thought-provoking read. Even though the novel is set in a historical period, many of its topics seem particularly relevant to today's society. I like remembering that some of the challenges we face are not as new as they might seem. And of course, the book takes us into the innermost worlds of its characters: it is fascinating to see their perspectives on historical events and even more so to simply witness their humanity.
]]>