
Org Mode: Export HTML, copy files, and serve the results via simple-httpd so that media files work

| emacs, org

In Org Mode, when you use "Export to HTML - As HTML file and open", the resulting HTML file is loaded using a file:// URL. This means you can't load any media files. In my post about pronunciation practice, I wanted to test the playback without waiting for my 11ty-based static site generator to churn through the files.

simple-httpd lets you run a web server from Emacs. By default, httpd-root is ~/public_html and httpd-port is 8080, but you can configure both. Here I set it up to serve from a new temporary directory and to delete that directory afterwards.

(use-package simple-httpd
  :config
  (setq httpd-root (make-temp-file "httpd" t))
  :hook
  (httpd-stop . my-simple-httpd-remove-temporary-root)
  (kill-emacs . httpd-stop))

(defun my-simple-httpd-remove-temporary-root ()
  "Remove `httpd-root' only if it's a temporary directory."
  (when (file-in-directory-p httpd-root temporary-file-directory)
    (delete-directory httpd-root t)))

The following code exports your Org buffer or subtree to a file in that directory, copies all the referenced local files (if they're newer) and updates the links in the HTML, and then serves it via simple-httpd. Note that it just overwrites everything without confirmation, so if you refer to files with the same name, only the last one will be kept.

(with-eval-after-load 'ox
  (org-export-define-derived-backend 'my-html-served 'html
    :menu-entry
    '(?s "Export to HTML and Serve"
         ((?b "Buffer"  my-org-serve--buffer)
          (?s "Subtree" my-org-serve--subtree)))))

(defun my-org-serve--buffer (&optional async _subtreep visible-only body-only ext-plist)
  (my-org-export-and-serve nil))

(defun my-org-serve--subtree (&optional async _subtreep visible-only body-only ext-plist)
  (my-org-export-and-serve t))

;; Based on org-11ty--copy-files-and-replace-links
;; Might be a good idea to use something DOM-based instead
(defun my-html-copy-files-and-replace-links (info &optional destination-dir)
  (let ((file-regexp "\\(?:src\\|href\\|poster\\)=\"\\(\\(file:\\)?.*?\\)\"")
        (destination-dir (or destination-dir (file-name-directory (plist-get info :file-path))))
        file-all-urls file-name beg
        new-file file-re
        unescaped)
    (unless (file-directory-p destination-dir)
      (make-directory destination-dir t))
    (unless (file-directory-p destination-dir)
      (error "%s is not a directory." destination-dir))
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward file-regexp nil t)
        (setq file-name (or (match-string 1) (match-string 2)))
        (unless (or (string-match "^#" file-name)
                    (get-text-property 0 'changed file-name))
          (setq file-name
                (replace-regexp-in-string
                 "\\?.+" ""
                 (save-match-data (if (string-match "^file:" file-name)
                                      (substring file-name 7)
                                    file-name))))
          (setq unescaped
                (replace-regexp-in-string
                 "%23" "#"
                 file-name))
          (setq new-file (concat
                          (if info (plist-get info :permalink) "")
                          (file-name-nondirectory unescaped)))
          (unless (org-url-p file-name)
            (let ((new-file-name (expand-file-name (file-name-nondirectory unescaped)
                                                   destination-dir)))
              (condition-case err
                  (when (or (not (file-exists-p new-file-name))
                            (file-newer-than-file-p unescaped new-file-name))
                    (copy-file unescaped new-file-name t))
                (error nil))
              (when (file-exists-p new-file-name)
                (save-excursion
                  (goto-char (point-min))
                  (setq file-re (concat "\\(?: src=\"\\| href=\"\\| poster=\"\\)\\(\\(?:file://\\)?" (regexp-quote file-name) "\\)"))
                  (while (re-search-forward file-re nil t)
                    (replace-match
                     (propertize
                      (save-match-data (replace-regexp-in-string "#" "%23" new-file))
                      'changed t)
                     t t nil 1)))))))))))

(defun my-org-export-and-serve (&optional subtreep)
  "Export current org buffer (or subtree if SUBTREEP) to HTML and serve via simple-httpd."
  (interactive "P")
  (require 'simple-httpd)
  (httpd-stop)
  (unless httpd-root (error "Set `httpd-root'."))
  (unless (file-directory-p httpd-root)
    (make-directory httpd-root t))
  (unless (file-directory-p httpd-root)
    (error "%s is not a directory." httpd-root))
  (let* ((out-file (expand-file-name (concat (file-name-base (buffer-file-name)) ".html")
                                     httpd-root))
         (html-file (org-export-to-file 'my-html-served out-file nil subtreep)))
    ;; Copy all the files and rewrite all the links
    (with-temp-file out-file
      (insert-file-contents out-file)
      (my-html-copy-files-and-replace-links
       `(:permalink "/") httpd-root))
    (httpd-start)
    (browse-url (format "http://localhost:%d/%s"
                        httpd-port
                        (file-name-nondirectory html-file)))))
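
To get a feel for what the link regexp in my-html-copy-files-and-replace-links picks up, here is a small sketch that runs the same pattern over a made-up HTML snippet. The function name my-demo-html-link-targets and the sample markup are hypothetical; only the regexp comes from the code above.

```elisp
;; Sketch only: `my-demo-html-link-targets' and the sample HTML are made up
;; for illustration. The regexp is the one used in the function above.
(defun my-demo-html-link-targets (html)
  "Return the src, href, and poster values found in the string HTML."
  (let ((file-regexp "\\(?:src\\|href\\|poster\\)=\"\\(\\(file:\\)?.*?\\)\"")
        (pos 0)
        targets)
    (while (string-match file-regexp html pos)
      (push (match-string 1 html) targets)
      (setq pos (match-end 0)))
    (nreverse targets)))

(my-demo-html-link-targets
 "<audio src=\"file:///tmp/clip.opus\"></audio> <a href=\"#notes\">Notes</a>")
;; => ("file:///tmp/clip.opus" "#notes")
```

The #notes anchor is matched too, which is why the real function skips anything that starts with #. A file:// value has its first seven characters stripped before the file is copied.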

Now I can use C-c C-e (org-export-dispatch), select the subtree with C-s, and use s s to export a subtree to a webserver and have all the media files work. This took 0.46 seconds for my post on pronunciation practice and automatically opens the page in a browser window. In comparison, my 11ty static site generator took 5.18 seconds for a subset of my site (1630 files copied, 214 files generated), and I haven't yet hooked up monitoring it to Emacs, so I have to take an extra step to open the page in the browser when I think it's finished. I think exporting to HTML and serving it with simple-httpd will be much easier for simple cases like this, and then I can export to 11ty once I'm done with the basic checks.

This is part of my Emacs configuration.
View Org source for this post

Comparing pronunciation recordings across time

| french, emacs, org, subed
  • Added reference audio for the second set.
  • I added pronunciation segments for the new set of tongue-twisters I got on Mar 13.
  • I added a column for Feb 20, the first session with the sentences. I also added keyboard shortcuts (1..n) for playing the audio of the row that the mouse is on.

2026-02-20: First set: Maman peint un grand lapin blanc, etc.

My French tutor gave me a list of sentences to help me practise pronunciation.

I can fuzzy-match these with the word timing JSON from WhisperX, like this.

Extract all approximately matching phrases
(subed-record-extract-all-approximately-matching-phrases
   sentences
   "/home/sacha/sync/recordings/2026-02-20-raphael.json"
   "/home/sacha/proj/french/analysis/virelangues/2026-02-20-raphael-script.vtt")
Sentences
  • Maman peint un grand lapin blanc.
  • Un enfant intelligent mange lentement.
  • Le roi croit voir trois noix.
  • Le témoin voit le chemin loin.
  • Moins de foin au loin ce matin.
  • La laine beige sèche près du collège.
  • La croquette sèche dans l'assiette.
  • Elle mène son frère à l'hôtel.
  • Le verre vert est très clair.
  • Elle aimait manger et rêver.
  • Le jeu bleu me plaît peu.
  • Ce neveu veut un jeu.
  • Le feu bleu est dangereux.
  • Le beurre fond dans le cœur chaud.
  • Les fleurs de ma sœur sentent bon.
  • Le hibou sait où il va.
  • L'homme fort mord la pomme.
  • Le sombre col tombe.
  • L'auto saute au trottoir chaud.
  • Le château d'en haut est beau.
  • Le cœur seul pleure doucement.
  • Tu es sûr du futur ?
  • Trois très grands trains traversent trois trop grandes rues.
  • Je veux deux feux bleus, mais la reine préfère la laine beige.
  • Vincent prend un bain en chantant lentement.
  • La mule sûre court plus vite que le loup fou.
  • Luc a bu du jus sous le pont où coule la boue.
  • Le frère de Robert prépare un rare rôti rouge.
  • La mule court autour du mur où hurle le loup.
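
As a rough illustration of the fuzzy-matching idea (not the subed-record implementation), the sketch below slides a window over a list of timed words and keeps the window whose text has the smallest edit distance to the target sentence. The helper names and the sample word list are hypothetical, and it assumes the word timings have already been read from the JSON into (WORD START END) lists.

```elisp
;; Sketch only: the helper names and the sample word list are hypothetical.
(require 'cl-lib)
(require 'seq)
(require 'subr-x)

(defun my-demo-normalize (text)
  "Downcase TEXT, turn punctuation into spaces, and collapse whitespace."
  (string-trim
   (replace-regexp-in-string
    "[[:space:]]+" " "
    (replace-regexp-in-string "[[:punct:]]" " " (downcase text)))))

(defun my-demo-best-window (sentence words)
  "Return (DISTANCE START END TEXT) for the run of WORDS closest to SENTENCE.
WORDS is a list of (WORD START END) entries in spoken order.
Return nil if WORDS has fewer words than SENTENCE."
  (let* ((target (my-demo-normalize sentence))
         (n (length (split-string target " " t)))
         best)
    (cl-loop for tail on words
             while (>= (length tail) n)
             do (let* ((window (seq-take tail n))
                       (text (my-demo-normalize (mapconcat #'car window " ")))
                       (distance (string-distance target text)))
                  ;; Keep the earliest window with the smallest edit distance.
                  (when (or (null best) (< distance (car best)))
                    (setq best (list distance
                                     (nth 1 (car window))
                                     (nth 2 (car (last window)))
                                     text)))))
    best))

(my-demo-best-window
 "Maman peint un grand lapin blanc."
 '(("Alors," 0.0 0.4) ("maman" 0.6 0.9) ("peint" 1.0 1.3) ("un" 1.4 1.5)
   ("grand" 1.6 1.9) ("lapin" 2.0 2.4) ("blanc." 2.5 3.0) ("Oui." 3.2 3.5)))
;; => (0 0.6 3.0 "maman peint un grand lapin blanc")
```

A fixed window size is the simplest choice. It will miss cases where the transcription splits or merges words, so a real matcher probably needs to try a few window lengths.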

Then I can use subed-record to manually tweak them, add notes, and so on. I end up with VTT files like 2026-03-06-raphael-script.vtt. I can also assemble the snippets for a session into a single audio file.

I wanted to compare my attempts over time, so I wrote some code to use Org Mode and subed-record to build a table with little audio players that I can use both within Emacs and in the exported HTML. This collects just the last attempts for each sentence during a number of my sessions (both with the tutor and on my own). The score is from the Microsoft Azure pronunciation assessment service. I'm not entirely sure about its validity yet, but I thought I'd add it for fun. * indicates where I've added some notes from my tutor, which should be available as a title attribute on hover. (Someday I'll figure out a mobile-friendly way to do that.)

Calling it with my sentences and files
(my-lang-summarize-segments
 sentences
 '(("/home/sacha/proj/french/analysis/virelangues/2026-02-20-raphael-script.vtt" . "Feb 20")
 ;("~/sync/recordings/processed/2026-02-20-raphael-tongue-twisters.vtt" . "Feb 20")
        ("~/sync/recordings/processed/2026-02-22-virelangues-single.vtt" . "Feb 22")
        ("~/proj/french/recordings/2026-02-26-virelangues-script.vtt" . "Feb 26")
        ("~/proj/french/recordings/2026-02-27-virelangues-script.vtt" . "Feb 27")
        ("~/proj/french/recordings/2026-03-03-virelangues.vtt" . "Mar 3")
        ("/home/sacha/sync/recordings/processed/2026-03-03-raphael-reference-script.vtt" . "Mar 3")
        ("~/proj/french/analysis/virelangues/2026-03-06-raphael-script.vtt" . "Mar 6")
        ("~/proj/french/analysis/virelangues/2026-03-12-virelangues-script.vtt" . "Mar 12"))
 "clip"
 #'my-lang-subed-record-get-last-attempt
 #'my-lang-subed-record-cell-info
 t
 )
| Feb 20 | Feb 22 | Feb 26 | Feb 27 | Mar 3 | Mar 3 | Mar 6 | Mar 12 | Text |
| ▶️ 63* | ▶️ 96 | ▶️ 95 | ▶️ 94 | ▶️ 83 | ▶️ 83* | ▶️ 81* | ▶️ 88 | Maman peint un grand lapin blanc. |
| ▶️ 88* | ▶️ 95 | ▶️ 99 | ▶️ 99 | ▶️ 96 | ▶️ 89* | ▶️ 92* | ▶️ 83 | Un enfant intelligent mange lentement. |
| ▶️ 84* | ▶️ 97 | ▶️ 97 | ▶️ 96 | ▶️ 94 | ▶️ 95* | ▶️ 98* | ▶️ 99 | Le roi croit voir trois noix. |
| ▶️ 80* | ▶️ 85 | ▶️ 77 | ▶️ 94 | ▶️ 97 | | ▶️ 92* | ▶️ 88 | Le témoin voit le chemin loin. |
| ▶️ 72* | ▶️ 97 | ▶️ 95 | ▶️ 77 | ▶️ 92 | | ▶️ 89* | ▶️ 86 | Moins de foin au loin ce matin. |
| ▶️ 79* | ▶️ 95 | ▶️ 76 | ▶️ 95 | ▶️ 76 | ▶️ 90* | ▶️ 90* | ▶️ 79 | La laine beige sèche près du collège. |
| ▶️ 67* | ▶️ 99 | ▶️ 85 | ▶️ 81 | ▶️ 85 | ▶️ 99* | ▶️ 97* | ▶️ 97 | La croquette sèche dans l'assiette. |
| ▶️ 88* | ▶️ 99 | ▶️ 100 | ▶️ 100 | ▶️ 98 | ▶️ 100* | ▶️ 99* | ▶️ 100 | Elle mène son frère à l'hôtel. |
| ▶️ 77* | ▶️ 87 | ▶️ 99 | ▶️ 93 | ▶️ 87 | | ▶️ 87* | ▶️ 99 | Le verre vert est très clair. |
| ▶️ 100* | ▶️ 94 | ▶️ 100 | ▶️ 99 | ▶️ 99 | ▶️ 99* | ▶️ 100* | ▶️ 100 | Elle aimait manger et rêver. |
| ▶️ 78* | ▶️ 98 | ▶️ 99 | ▶️ 98 | ▶️ 98 | ▶️ 92* | | ▶️ 88 | Le jeu bleu me plaît peu. |
| ▶️ 78* | ▶️ 97 | ▶️ 85 | ▶️ 95 | ▶️ 85 | | | ▶️ 85 | Ce neveu veut un jeu. |
| ▶️ 73* | ▶️ 95 | ▶️ 95 | ▶️ 96 | ▶️ 97 | | | ▶️ 100 | Le feu bleu est dangereux. |
| ▶️ 87* | ▶️ 76 | ▶️ 65 | ▶️ 97 | ▶️ 85 | ▶️ 74* | ▶️ 85* | ▶️ 96 | Le beurre fond dans le cœur chaud. |
| ▶️ 84* | ▶️ 43 | ▶️ 85 | ▶️ 79 | ▶️ 75 | | | ▶️ 98 | Les fleurs de ma sœur sentent bon. |
| ▶️ 70* | ▶️ 86 | ▶️ 79 | ▶️ 76 | ▶️ 87 | ▶️ 84 | | ▶️ 98 | Le hibou sait où il va. |
| ▶️ 92* | ▶️ 95 | ▶️ 86 | ▶️ 92 | ▶️ 98 | ▶️ 99* | | ▶️ 94 | L'homme fort mord la pomme. |
| ▶️ 83* | ▶️ 73 | ▶️ 69 | ▶️ 81 | ▶️ 60 | ▶️ 96* | | ▶️ 81 | Le sombre col tombe. |
| ▶️ 39* | ▶️ 49 | ▶️ 69 | ▶️ 56 | ▶️ 69 | ▶️ 96* | | ▶️ 94 | L'auto saute au trottoir chaud. |
| ▶️ 82 | ▶️ 84 | ▶️ 85 | ▶️ 98 | ▶️ 94 | ▶️ 96* | | ▶️ 99 | Le château d'en haut est beau. |
| ▶️ 89 | ▶️ 85 | ▶️ 75 | ▶️ 91 | ▶️ 52 | ▶️ 75* | ▶️ 70* | ▶️ 98 | Le cœur seul pleure doucement. |
| ▶️ 98* | | ▶️ 99 | ▶️ 99 | ▶️ 95 | ▶️ 93* | ▶️ 97* | ▶️ 99 | Tu es sûr du futur ? |
| | | ▶️ 97 | ▶️ 93 | ▶️ 92 | ▶️ 85* | | ▶️ 90 | Trois très grands trains traversent trois trop grandes rues. |
| | | ▶️ 94 | ▶️ 85 | ▶️ 97 | ▶️ 82* | | ▶️ 92 | Je veux deux feux bleus, mais la reine préfère la laine beige. |
| | | ▶️ 91 | ▶️ 79 | ▶️ 87 | ▶️ 82* | | ▶️ 94 | Vincent prend un bain en chantant lentement. |
| | | ▶️ 89 | ▶️ 91 | ▶️ 91 | ▶️ 84* | | ▶️ 92 | La mule sûre court plus vite que le loup fou. |
| | | ▶️ 91 | ▶️ 93 | ▶️ 93 | ▶️ 92* | | ▶️ 96 | Luc a bu du jus sous le pont où coule la boue. |
| | | ▶️ 88 | ▶️ 71 | ▶️ 94 | ▶️ 86* | | ▶️ 92 | Le frère de Robert prépare un rare rôti rouge. |
| | | ▶️ 81 | ▶️ 84 | ▶️ 88 | ▶️ 67* | | ▶️ 94 | La mule court autour du mur où hurle le loup. |

Pronunciation still feels a bit hit or miss. Sometimes I say a sentence and my tutor says "Oui," and then I say it again and he says "Non, non…" The /ʁ/ and /y/ sounds are hard.

I like seeing these compact links in an Org Mode table and being able to play them, thanks to my custom audio link type. It should be pretty easy to write a function that lets me use a keyboard shortcut to play the audio (maybe using the keys 1-9?) so that I can bounce between them for comparison.
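
For a sense of what one of those table cells contains, here is a minimal sketch that builds a link in the same shape as the summary code further down. The function name, file name, label, and note are made up, and the audio: link type itself is my custom one, so the resulting string is only meaningful where that link type is defined.

```elisp
;; Sketch only: the function name, file name, label, and note are made up.
(require 'ol)        ; provides `org-link-make-string' in Org 9.3 and later
(require 'url-util)  ; provides `url-hexify-string'

(defun my-demo-audio-cell (sound-file label score note)
  "Return an Org link for SOUND-FILE showing SCORE, with NOTE in the title.
LABEL names the session. NOTE may be an empty string."
  (org-link-make-string
   (concat "audio:" sound-file "?icon=t&title="
           (url-hexify-string
            (if (string= note "") label (concat label ": " note))))
   (concat "▶️ " score (if (string= note "") "" "*"))))

(my-demo-audio-cell "/tmp/clip-1.opus" "Feb 20" "63" "sample note")
;; => "[[audio:/tmp/clip-1.opus?icon=t&title=Feb%2020%3A%20sample%20note][▶️ 63*]]"
```

Percent-encoding the title keeps spaces and colons from breaking the query string, and the trailing * is the only visible hint that a note is attached.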

If I screen-share from Google Chrome, I can share the tab with audio, so my tutor can listen to things at the same time. Could be fun to compare attempts so that I can try to hear the differences better. Hmm, actually, let's try adding keyboard shortcuts that let me use 1-8, n/p, and f/b to navigate and play audio. Mwahahaha! It works!

2026-03-14: Second set: Mon oncle peint un grand pont blanc, etc.

Update 2026-03-14: My tutor gave me a new set of tongue-twisters. When I'm working on my own, I find it helpful to loop over an audio reference with a bit of silence after it so that I can repeat what I've heard. I have several choices for reference audio:

  • I can generate an audio file using text-to-speech, like a local instance of Kokoro TTS, or a hosted service like Google Translate (via gtts-cli), ElevenLabs, or Microsoft Azure.
  • I can extract a recording of my tutor from one of my sessions.
  • I can extract a recording of myself from one of my tutoring sessions where my tutor said that the pronunciation is alright.

Here I stumble through the tongue-twisters. I've included reference audio from Kokoro, gtts, and Microsoft Azure for comparison.

(my-subed-record-analyze-file-with-azure
 (subed-record-keep-last
  (subed-record-filter-skips
   (subed-parse-file
    "/home/sacha/proj/french/analysis/virelangues/2026-03-13-raphael-script.vtt")))
 "~/proj/french/analysis/virelangues-2026-03-13/2026-03-13-clip")
| Kk | Gt | Az | Me | ID | Comments | All | Acc | Flu | Comp | Conf | |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 1 | X: pont | 93 | 99 | 90 | 100 | 86 | Mon oncle peint un grand pont blanc. {pont} |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 2 | C'est mieux | 68 | 75 | 80 | 62 | 87 | Un singe malin prend un bon raisin rond. |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 3 | Ouais, c'est ça | 83 | 94 | 78 | 91 | 89 | Dans le vent du matin, mon chien sent un bon parfum. |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 4 | ok | 75 | 86 | 63 | 100 | 89 | Le soin du roi consiste à joindre chaque coin du royaume. |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 5 | Ouais, c'est ça, parfait | 83 | 94 | 74 | 100 | 88 | Dans un coin du bois, le roi voit trois points noirs. |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 6 | Ouais, parfait | 90 | 92 | 87 | 100 | 86 | Le feu de ce vieux four chauffe peu. |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 7 | Ouais | 77 | 85 | 88 | 71 | 86 | Deux peureux veulent un peu de feu. |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 8 | | 77 | 78 | 75 | 83 | 85 | Deux vieux bœufs veulent du beurre. |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 9 | Ouais, parfait | 92 | 94 | 89 | 100 | 89 | Elle aimait marcher près de la rivière. |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 10 | Ok, c'est bien | 93 | 98 | 89 | 100 | 90 | Je vais essayer de réparer la fenêtre. |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 11 | Okay | 83 | 87 | 76 | 100 | 89 | Le bébé préfère le lait frais. |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 12 | | 77 | 92 | 70 | 86 | 90 | Charlotte cherche ses chaussures dans la chambre. |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 13 | Okay | 91 | 90 | 94 | 91 | 88 | Un chasseur sachant chasser sans son chien est-il un bon chasseur ? |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 14 | Ouais | 91 | 88 | 92 | 100 | 91 | Le journaliste voyage en janvier au Japon. |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 15 | C'est bien (X: dans un) | 91 | 88 | 94 | 100 | 88 | Georges joue du jazz dans un grand bar. {dans un} |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 16 | C'est bien | 88 | 87 | 94 | 88 | 85 | Un jeune joueur joue dans le grand gymnase. |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 17 | | 95 | 94 | 96 | 100 | 91 | Le compagnon du montagnard soigne un agneau. |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 18 | | 85 | 88 | 84 | 86 | 89 | La cigogne soigne l’agneau dans la campagne. |
| 👂🏼 | 👂🏼 | 👂🏼 | ▶️ | 19 | grenouille | 71 | 80 | 68 | 75 | 86 | La grenouille fouille les feuilles dans la broussaille. |

The code

Code for summarizing the segments
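(defun my-lang-subed-record-cell-info (item file-index file sub)
  "Return an audio link for SUB, the attempt at ITEM found in FILE.
FILE is a (PATH . LABEL) pair and FILE-INDEX is its position in the list.
Relies on `prefix' and `always-create' being dynamically bound by the
caller, `my-lang-summarize-segments'."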
  (let* ((sound-file (expand-file-name (format "%s-%s-%d.opus"
                                               prefix
                                               (my-transform-html-slugify item)
                                               (1+ file-index))))
         (score (car (split-string
                      (or
                       (subed-record-get-directive "#+SCORE" (elt sub 4)) "")
                      ";")))
         (note (replace-regexp-in-string
                (concat "^" (regexp-quote (cdr file))
                        "\\(: \\)?")
                ""
                (or (subed-record-get-directive "#+NOTE" (elt sub 4)) ""))))
    (when (or always-create (not (file-exists-p sound-file)))
      (subed-record-extract-audio-for-current-subtitle-to-file sound-file sub))
    (org-link-make-string
     (concat "audio:" sound-file "?icon=t"
             (format "&source=%s&source-start=%s" (car file) (elt sub 1))
             (format "&title=%s"
                     (url-hexify-string
                      (if (string= note "")
                          (cdr file)
                        (concat (cdr file) ": " note)))))
     (concat
      "▶️"
      ;; `score' is always a string here, so check for the empty string.
      (if (string= score "") "" (format " %s" score))
      (if (string= note "") "" "*")))))

(defun my-lang-subed-record-get-last-attempt (item file)
  "Return the last subtitle matching ITEM in FILE."
  (car
   (last
    (seq-remove
     (lambda (o) (string-match "#\\+SKIP" (or (elt o 4) "")))
     (learn-lang-subed-record-collect-matching-subtitles
      item
      (list file)
      nil
      nil
      'my-subed-simplify)))))

(defun my-lang-summarize-segments (items files prefix attempt-fn cell-fn &optional always-create)
  (cons
   (append
    (seq-map 'cdr files)
    (list "Text"))
   (seq-map
    (lambda (item)
      (append
       (seq-map-indexed
        (lambda (file file-index)
          (let* ((sub (funcall attempt-fn item file)))
            (if sub
                (funcall cell-fn item file-index file sub)
              "")))
        files)
       (list item)))
    items)))

(defun my-subed-record-analyze-file-with-azure (subtitles prefix &optional always-create)
  (cons
   '("Kk" "Gt" "Az" "Me" "ID" "Comments" "All" "Acc" "Flu" "Comp" "Conf")
   (seq-map-indexed
    (lambda (sub i)
      (let ((sound-file (expand-file-name (format "%s-%02d.opus"
                                                  prefix
                                                  (1+ i))))
            (tts-services
             '(("kokoro" . learn-lang-tts-kokoro-fastapi-say)
               ("gtts" . learn-lang-tts-gtts-say)
               ("azure" . learn-lang-tts-azure-say)))
            tts-files
            (note (subed-record-get-directive "#+NOTE" (elt sub 4))))
        (when (or always-create (not (file-exists-p sound-file)))
          (subed-record-extract-audio-for-current-subtitle-to-file sound-file sub))
        (setq
         tts-files
         (mapcar
          (lambda (row)
            (let ((reference (format "%s-%s-%02d.opus" prefix (car row) (1+ i) )))
              (when (or always-create (not (file-exists-p reference)))
                (funcall (cdr row)
                         (subed-record-simplify (elt sub 3))
                         'sync
                         reference))
              (org-link-make-string
               (concat "audio:" reference "?icon=t&note=" (url-hexify-string (car row)))
               "👂🏼")))
          tts-services))
        (append
         tts-files
         (list
          (org-link-make-string
           (concat "audio:" sound-file "?icon=t"
                   (format "&source-start=%s" (elt sub 1))
                   (if (and note (not (string= note "")))
                       (format "&title=%s"
                               (url-hexify-string note))
                     ""))
           "▶️")
          (format "%d" (1+ i))
          (or note ""))
         (learn-lang-azure-subed-record-parse (elt sub 4))
         (list
          (elt sub 3)))))
    subtitles)))

Some code for doing this stuff is in sachac/learn-lang on Codeberg.


Small steps towards using OpenAI-compatible text-to-speech services with speechd-el or emacspeak

| emacs

Speech synthesis has come a long way since I first tried out Emacspeak in 2002. Kokoro TTS and Piper offer more natural-sounding voices now, although the initial delay in loading the models and generating speech means that they aren't quite ready to completely replace espeak, which is faster but more robotic. I've been using the Kokoro FastAPI through my own functions for working with various speech systems. I wanted to see if I could get Kokoro and other OpenAI-compatible text-to-speech services to work with either speechd-el or Emacspeak, just in case I could take advantage of the rich functionality either provides for speech-synthesized Emacs use. speechd-el is easier to layer on top of an existing Emacs if you only want occasional speech, while Emacspeak voice-enables many packages to an extent beyond simply speaking what's on the screen.

Speech synthesis is particularly helpful when I'm learning French because I can use it as a reference for what a paragraph or sentence should sound like. It's not perfect. Sometimes it uses liaisons that my tutor and Google Translate don't use. But it's a decent enough starting point. I also used it before to read out IRC mentions and compile notifications so that I could hear them even if I was paying attention to a different activity.

Here's a demonstration of speechd reading out the following lines using the code I've just uploaded to https://codeberg.org/sachac/speechd-ai:

  • The quick brown fox jumps over the lazy dog.
  • Now let's set the language to French so we can read the next line.
  • Bonjour, je m'appelle Emacs.

Screencast showing speechd-el

There's about a 2-second delay between the command and the start of the audio for the sentence.

Note that speechd-speak-read-sentence fails in some cases where (forward-sentence 1) isn't the same place as (backward-sentence 1) (forward-sentence 1), which can happen when you're in an Org Mode list. I've submitted a patch upstream.

Aside from that, speechd-speak-set-language, speechd-speak-read-paragraph and speechd-speak-read-region are also useful commands. I think the latency makes this best-suited for reading paragraphs, or for shadowing sentences for language learning.

I'm still trying to figure out how to get speechd-speak to work as smoothly as I'd like. I think I've got it set up so that the server falls back to espeak for short texts so that it can handle words or characters better, and uses the specified server for longer ones. I'd like to get to the point where it can handle all the things that speechd usually does, like saying lines as I navigate through them or giving me feedback as I'm typing. Maybe it can use espeak for fast feedback character by character and word by word, and then use Kokoro TTS for the full sentence when I finish. Then it will be possible to use it to type things without looking at the screen.
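
The fallback idea can be sketched without any speech server at all: pick an engine based on text length, and build the JSON body that OpenAI-style servers accept as a POST to /v1/audio/speech. This is not the code in the speechd-ai repository. The function names, the 20-character threshold, and the "kokoro" model and "af_heart" voice values are assumptions to adjust for whatever server is actually running.

```elisp
;; Sketch only: the names, threshold, model, and voice are assumptions.
(require 'json)
(require 'subr-x)

(defvar my-demo-tts-short-text-length 20
  "Texts shorter than this many characters go to the fast local engine.")

(defun my-demo-tts-engine (text)
  "Return `espeak' for short TEXT and `openai' for anything longer."
  (if (< (length (string-trim text)) my-demo-tts-short-text-length)
      'espeak
    'openai))

(defun my-demo-tts-request-body (text &optional voice)
  "Return the JSON body for asking an OpenAI-compatible server to speak TEXT.
VOICE defaults to an assumed voice name."
  (json-encode
   `((model . "kokoro")
     (input . ,text)
     (voice . ,(or voice "af_heart"))
     (response_format . "mp3"))))

(my-demo-tts-engine "a")                             ; => espeak
(my-demo-tts-engine "Bonjour, je m'appelle Emacs.")  ; => openai
(my-demo-tts-request-body "Bonjour, je m'appelle Emacs.")
```

Keeping the engine choice in its own function makes it easy to experiment with other rules later, such as always using espeak for single characters while typing.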

After putting this together, I still find myself leaning towards my own functions because they make it easy to save the generated speech output to a file, which is handy for saving reference audio that I can play on my phone and for making replays almost instant. That could also be useful for pre-generating the next paragraph to make it flow more smoothly. Still, it was interesting making something that is compatible with existing protocols and libraries.

Posting it in case anyone else wants to use it as a starting point. The repository also contains the starting point for an Emacspeak-compatible speech server. See speechd-ai/README.org for more details.

https://codeberg.org/sachac/speechd-ai


Emacs Lisp and NodeJS: Getting the bolded words from a section of a Google Document

| french, js, emacs
  • Cleaned up links from Google
  • Simplified getting a section or finding the bolded text by using the Org Mode format instead.

During the sessions with my French tutor, I share a Google document so that we can mark the words where I need to practice my pronunciation some more or tweak the wording. Using Ctrl+B to make a word bold is an easy way to make it jump out.

I used to copy these changes into my Org Mode notes manually, but today I thought I'd try automating some of it.

First, I need a script to download the HTML for a specified Google document. This is probably easier to do with the NodeJS library rather than with oauth2.el and url-retrieve-synchronously because of various authentication things.

require('dotenv').config();
const { google } = require('googleapis');

async function download(fileId) {
  const auth = new google.auth.GoogleAuth({
    scopes: ['https://www.googleapis.com/auth/drive.readonly'],
  });
  const drive = google.drive({ version: 'v3', auth });
  const htmlRes = await drive.files.export({
    fileId: fileId,
    mimeType: 'text/html'
  });
  return htmlRes.data;
}

async function main() {
  console.log(await download(process.argv.length > 2 ? process.argv[2] : process.env['DOC_ID']));
}

main();

Then I can wrap a little bit of Emacs Lisp around it.

(defvar my-google-doc-download-command
  (list "nodejs" (expand-file-name "~/bin/download-google-doc-html.cjs")))

(defun my-google-doc-html (doc-id)
  (when (string-match "https://docs\\.google\\.com/document/d/\\(.+?\\)/" doc-id)
    (setq doc-id (match-string 1 doc-id)))
  (with-temp-buffer
    (apply #'call-process (car my-google-doc-download-command)
           nil t nil (append (cdr my-google-doc-download-command) (list doc-id)))
    (buffer-string)))

(defun my-google-doc-clean-html (html)
  "Remove links on spaces, replace Google links."
  (let ((dom (with-temp-buffer
               (insert html)
               (libxml-parse-html-region))))
    (dom-search
     dom
     (lambda (o)
       (when (eq (dom-tag o) 'a)
         (when (and (dom-attr o 'href)
                    (string-match "https://\\(www\\.\\)?google\\.com/url\\?q=" (dom-attr o 'href)))
           (let* ((parsed (url-path-and-query
                           (url-generic-parse-url (dom-attr o 'href))))
                  (params (url-parse-query-string (cdr parsed))))
             (dom-set-attribute o 'href (car (assoc-default "q" params #'string=)))))
         (let ((text (string-trim (dom-text o))))
           (when (string= text "")
             (setf (car o) 'span))))
       (when (and
              (string-match "font-weight:700" (or (dom-attr o 'style) ""))
              (not (string-match "font-style:normal" (or (dom-attr o 'style) ""))))
         (setf (car o) 'strong))
       (when (dom-attr o 'style)
         (dom-remove-attribute o 'style))))
    ;; bold text is actually represented as font-weight:700 instead
    (with-temp-buffer
      (svg-print dom)
      (buffer-string))))

(defun my-google-doc-org (doc-id)
  "Return DOC-ID in Org Mode format."
  (pandoc-convert-stdio (my-google-doc-clean-html (my-google-doc-html doc-id)) "html" "org"))
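
Two of the string-handling steps above are easy to try on their own: pulling the document ID out of a sharing URL, and unwrapping a google.com/url redirect to get the real link. The sketch below reuses the same regexps and URL functions on made-up inputs. The function names and sample URLs are hypothetical.

```elisp
;; Sketch only: the function names and sample URLs are made up.
(require 'url-parse)
(require 'url-util)

(defun my-demo-google-doc-id (url-or-id)
  "Return the document ID in URL-OR-ID, or URL-OR-ID itself if it is not a URL."
  (if (string-match "https://docs\\.google\\.com/document/d/\\(.+?\\)/" url-or-id)
      (match-string 1 url-or-id)
    url-or-id))

(defun my-demo-unwrap-google-redirect (href)
  "Return the q= target of a google.com/url redirect HREF, else HREF."
  (if (string-match "https://\\(www\\.\\)?google\\.com/url\\?q=" href)
      (let* ((parsed (url-path-and-query (url-generic-parse-url href)))
             (params (url-parse-query-string (cdr parsed))))
        ;; Each entry in PARAMS is (KEY VALUE ...), so take the first value.
        (car (assoc-default "q" params #'string=)))
    href))

(my-demo-google-doc-id "https://docs.google.com/document/d/abc123/edit")
;; => "abc123"
(my-demo-unwrap-google-redirect
 "https://www.google.com/url?q=https://example.com/page&sa=D")
;; => "https://example.com/page"
```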

I have lots of sections in that document, including past journal entries, so I want to get a specific section by name.

(defun my-org-get-subtree-by-name (org-text heading-name)
  "Return ORG-TEXT subtree for HEADING-NAME."
  (with-temp-buffer
    (insert org-text)
    (org-mode)
    (goto-char (point-min))
    (let ((org-trust-scanner-tags t))
      (car (delq nil
                 (org-map-entries
                  (lambda ()
                    (when (string= (org-entry-get (point) "ITEM") heading-name)
                      (buffer-substring (point) (org-end-of-subtree))))))))))

Now I can get the bolded words from a section of my notes, with just a sentence for context. I use pandoc to convert it to Org Mode syntax.

(defvar my-lang-words-for-review-context-function 'sentence-at-point)
(defvar my-lang-tutor-notes-url nil)
(defun my-lang-tutor-notes (section-name)
  (my-org-get-subtree-by-name
   (my-google-doc-org my-lang-tutor-notes-url)
   section-name))

(defun my-lang-words-for-review (section)
  "List the bolded words for review in SECTION."
  (let* ((section (my-lang-tutor-notes section))
         results)
    (with-temp-buffer
      (insert section)
      (org-mode)
      (goto-char (point-min))
      (org-map-entries
       (lambda ()
         (org-end-of-meta-data t)
         (while (re-search-forward "\\*[^* ].*?\\*" nil t)
           (cl-pushnew
            (replace-regexp-in-string
             ;; Collapse spaces, newlines, and non-breaking spaces (U+00A0).
             "[ \n\u00A0]+" " "
             (funcall my-lang-words-for-review-context-function))
            results
            :test 'string=)))))
    (nreverse results)))
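
To check what the bold regexp does and does not match, here is a small string-based sketch using the same pattern. The function name is hypothetical and the asterisks in the sample sentence are placed arbitrarily for the demonstration.

```elisp
;; Sketch only: the function name and the placement of the asterisks in the
;; sample text are made up.
(defun my-demo-bold-spans (text)
  "Return the *bold* spans in the Org-formatted string TEXT."
  (let ((pos 0)
        spans)
    (while (string-match "\\*[^* ].*?\\*" text pos)
      (push (match-string 0 text) spans)
      (setq pos (match-end 0)))
    (nreverse spans)))

(my-demo-bold-spans
 "* Notes\nJe l'ai aussi *utilisée* pour faire des *recherches*.")
;; => ("*utilisée*" "*recherches*")
```

The heading star on the first line is skipped because the character after it is a space, which is what keeps Org headings from being mistaken for bold text.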

For example, when I run it on my notes on artificial intelligence, this is the list of bolded words and the sentences that contain them.

(my-lang-words-for-review "Sur l'intelligence artificielle")
  • Je l'ai aussi utilisée pour faire des recherches.
  • Je peux consacrer une petite partie de mon budget à des essais, mais je ne veux pas travailler davantage pour rentabiliser une dépense plus importante.
  • Je n'ai pas le temps de concentration nécessaire pour justifier l'investissement dans mon propre matériel, et sinon, les progrès sont trop rapides pour m'engager dans une configuration spécifique.
  • J'ai une conscience aiguë des limites cognitives ou physiques à cause des difficultés de santé de ma mère et de ma sœur, et de mes expériences avec mes limitations à cause du fait que je suis la personne principalement en charge de ma fille.
  • Je lis très vite, mais je n'ai pas assez de patience pour les longs contenus vidéo ou audio.
  • Je n'aime pas les textes qui contiennent beaucoup de remplissage.
  • Beaucoup de gens ont une réaction forte contre l'IA pour plusieurs raisons qui incluent le battage médiatique excessif dont elle fait l'objet, son utilisation à mauvais escient, et l'inondation de banalité qu'elle produit.
  • Je réécris souvent la majorité du logiciel à l'exception d'un ou deux morceaux parce que ce code ne me convient pas.
  • Je ne veux pas l'utiliser pour les correctifs que je veux soumettre à d'autres projets parce que le code ne me semble pas correct et je ne veux pas gaspiller le temps d'autres bénévoles.
  • J'aime pouvoir lui donner trois dépôts git et des instructions pour générer un logiciel à partir d'un dépôt pour un autre via le troisième dépôt.
  • Mais je ne veux pas le publier avant de réécrire et tout comprendre.
  • Sans l'IA, je pourrais peut-être apprendre plus lentement avec l'aide d'Internet, qui a beaucoup de ressources comme Vitrine linguistique (https://vitrinelinguistique.oqlf.gouv.qc.ca/).
  • Je veux profiter davantage, apprendre davantage avec l'aide de vraies personnes, complétée par l'aide de l'IA.
  • J'adore les sous-titres simultanés, mais je n'ai pas toujours trouvé une méthode ou un système qui me convienne.

I can then go into the WhisperX transcription JSON file and replay those parts for closer review.

I can also tweak the context function to give me less information. For example, to limit it to the containing phrase, I can do this:

(defun my-split-string-keep-delimiters (string delimiter)
  (when string
    (let (results pos)
      (with-temp-buffer
        (insert string)
        (goto-char (point-min))
        (setq pos (point-min))
        (while (re-search-forward delimiter nil t)
          (push (buffer-substring pos (match-beginning 0)) results)
          (setq pos (match-beginning 0)))
        (push (buffer-substring pos (point-max)) results)
        (nreverse results)))))

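(ert-deftest my-split-string-keep-delimiters ()
  (should
   (equal (my-split-string-keep-delimiters
           "Beaucoup de gens ont une réaction forte contre l'IA pour plusieurs raisons qui *incluent* le battage médiatique excessif dont elle fait l'objet, son utilisation à mauvais escient, et *l'inondation de banalité* qu'elle produit."
           ", \\| que \\| qui \\| qu'ils? \\| qu'elles? \\| qu'on ")
          ;; Each delimiter stays at the start of the piece that follows it.
          '("Beaucoup de gens ont une réaction forte contre l'IA pour plusieurs raisons"
            " qui *incluent* le battage médiatique excessif dont elle fait l'objet"
            ", son utilisation à mauvais escient"
            ", et *l'inondation de banalité*"
            " qu'elle produit."))))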

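(defun my-lang-words-for-review-phrase-context (&optional s)
  "Return only the phrases of S that contain *emphasized* words.
S defaults to the sentence at point."
  ;; Normalize non-breaking spaces so that the delimiters below can match.
  (setq s (replace-regexp-in-string "\u00A0" " " (or s (sentence-at-point))))
  (string-join
   (seq-filter (lambda (s) (string-match "\\*" s))
               (my-split-string-keep-delimiters s ", \\| parce que \\| que \\| qui \\| qu'ils? \\| qu'elles? \\| qu'on \\| pour "))
   " ... "))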

(ert-deftest my-lang-words-for-review-phrase-context ()
  (should
   (equal (my-lang-words-for-review-phrase-context
           "Je peux consacrer une petite partie de mon *budget* à des essais, mais je ne veux pas travailler davantage pour rentabiliser une dépense plus importante.")
          "Je peux consacrer une petite partie de mon *budget* à des essais")))

(let ((my-lang-words-for-review-context-function 'my-lang-words-for-review-phrase-context))
  (my-lang-words-for-review "Sur l'intelligence artificielle"))

  • pour faire des recherches.
  • Je peux consacrer une petite partie de mon budget à des essais
  • , et sinon
  • J'ai une conscience aiguë des limites cognitives ou physiques à cause des difficultés de santé de ma mère et de ma sœur
  • pour les longs contenus vidéo ou audio.
  • Je n'aime pas les textes qui contiennent beaucoup de remplissage.
  • qui incluent le battage médiatique excessif dont elle fait l'objet … , et l'inondation de banalité
  • Je réécris souvent la majorité du logiciel à l'exception d'un ou deux morceaux
  • pour les correctifs … parce que le code ne me semble pas correct et je ne veux pas gaspiller le temps d'autres bénévoles.
  • pour un autre via le troisième dépôt.
  • Mais je ne veux pas le publier avant de réécrire et tout comprendre.
  • , je pourrais peut-être apprendre plus lentement avec l'aide d'Internet
  • , apprendre davantage avec l'aide de vraies personnes, complétée par l'aide de l'IA.
  • qui me convienne.

Now that I have a function for retrieving the HTML or Org Mode for a section, I can use that to wdiff against my current text to more easily spot wording changes.

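(defun my-lang-tutor-notes-wdiff-org ()
  "Compare the current subtree's text with the tutor notes for the same heading, word by word."
  (interactive)
  (let ((section (org-entry-get (point) "ITEM")))
    (my-wdiff-strings
     ;; Normalize non-breaking spaces so they don't show up as differences.
     (replace-regexp-in-string
      "\u00A0" " "
      (my-org-subtree-text-without-blocks))
     (replace-regexp-in-string
      "\u00A0" " "
      (my-lang-tutor-notes section)))))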

Screenshot:

2026-03-12_11-28-24.png
Figure 1: wdiff
This is part of my Emacs configuration.

2026-03-09 Emacs news

| emacs, emacs-news

If you use kubernetes-el, don't update for now, and you might want to check your installation if you updated it recently. The repo was compromised a few days ago.

I've occasionally wanted to tangle a single Org Mode source block to multiple places, so I'm glad to hear that ob-tangle has just added support for multiple targets. Niche, but could be handy. I'm also curious about using clime to write command-line tools in Emacs Lisp that handle argument parsing and all the usual stuff.

If you're looking for something to write about, why not try this month's Emacs Carnival theme of mistakes and misconceptions?

Enjoy!

Links from reddit.com/r/emacs, r/orgmode, r/spacemacs, Mastodon #emacs, Bluesky #emacs, Hacker News, lobste.rs, programming.dev, lemmy.world, lemmy.ml, planet.emacslife.com, YouTube, the Emacs NEWS file, Emacs Calendar, and emacs-devel. Thanks to Andrés Ramírez for emacs-devel links. Do you have an Emacs-related link or announcement? Please e-mail me at [email protected]. Thank you!


Expanding yasnippets by voice in Emacs and other applications

| emacs, audio, speech-recognition

Yasnippet is a template system for Emacs. I want to use it by voice. I'd like to be able to say things like "Okay, define interactive function" and have that expand to a matching snippet in Emacs or other applications. Here's a quick demonstration of expanding simple snippets:

Screencast of expanding snippets by voice in Emacs and in other applications

Transcript
  • 00:00 So I've defined some yasnippets with names that I can say. Here, for example, in this menu, you can see I've got "define interactive function" and "with a buffer that I'll display." And in fundamental mode, I have some other things too. Let's give it a try.
  • 00:19 I press my shortcut. "Okay, define an interactive function." You can see that this is a yasnippet. Tab navigation still works.
  • 00:33 I can say, "OK, with a buffer that I'll display," and it expands that also.
  • 00:45 I can expand snippets in other applications as well, thanks to a global keyboard shortcut.
  • 00:50 Here, for example, I can say, "OK, my email." It inserts my email address.
  • 01:02 Yasnippet definitions can also execute Emacs Lisp. So I can say, "OK, date today," and have that evaluated to the actual date.
  • 01:21 So that's an example of using voice to expand snippets.

This is handled by the following code:

(defun my-whisper-maybe-expand-snippet (text)
  "Add to `whisper-insert-text-at-point'."
  (if (and text
           (string-match
            "^ok\\(?:ay\\)?[,\\.]? \\(.+\\)" text))
    (let* ((name
            (downcase
             (string-trim
              (replace-regexp-in-string "[,\\.]" "" (match-string 1 text)))))
           (matching
            (seq-find (lambda (o)
                        (subed-word-data-compare-normalized-string-distance
                         name
                         (downcase (yas--template-name o))))
                      (yas--all-templates (yas--get-snippet-tables)))))
      (if matching
          (progn
            (if (frame-focus-state)
                (progn
                  (yas-expand-snippet matching)
                  nil)
              ;; In another application
              (with-temp-buffer
                (yas-minor-mode)
                (yas-expand-snippet matching)
                (buffer-string))))
        text))
    text))

This code relies on my fork of whisper.el, which lets me specify a list of functions for whisper-insert-text-at-point. (I haven't asked for upstream review yet because I'm still testing things, and I don't know if it actually works for anyone else yet.) It does approximate matching on the snippet name using a function from subed-word-data.el which just uses string-distance. I could probably duplicate the function in my config, but then I'd have to update it in two places if I come up with more ideas.
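
To make the approximate-matching idea more concrete, here is a minimal sketch of a normalized comparison built on the built-in string-distance. The function names and the 0.25 threshold are assumptions for illustration; the actual function in subed-word-data.el may normalize or threshold differently.

```elisp
;; Hypothetical helpers for illustration; not the code from subed-word-data.el.
(defun my-sketch-normalized-string-distance (a b)
  "Return the edit distance between A and B, scaled to the range 0.0-1.0.
0.0 means the strings are identical; 1.0 means nothing lines up."
  (let ((longest (max (length a) (length b))))
    (if (zerop longest)
        0.0
      (/ (float (string-distance a b)) longest))))

(defun my-sketch-close-enough-p (a b &optional threshold)
  "Return non-nil if A and B differ by at most THRESHOLD (default 0.25)."
  (<= (my-sketch-normalized-string-distance a b) (or threshold 0.25)))

;; The recognized text has an extra "an", but it is still close to the snippet name.
(my-sketch-close-enough-p "define an interactive function"
                          "define interactive function")
```

Scaling by the longer string's length means that a small slip in a long snippet name is forgiven, while the same number of edits in a short name like "my email" is not.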

The code for inserting into other applications is defined in my-whisper-maybe-type, which is very simple:

(defun my-whisper-maybe-type (text)
  "If Emacs is not the focused app, simulate typing TEXT.
Add this function to `whisper-insert-text-at-point'."
  (when text
    (if (frame-focus-state)
        text
      (make-process :name "xdotool" :command
                    (list "xdotool" "type"
                          text))
      nil)))

Someday I'd like to provide alternative names for snippets. I also want to make it easy to fill in snippet fields by voice. I'd love to be able to answer minibuffer questions from yas-choose-value, yas-completing-read, and other functions by voice too. Could be fun!

This is part of my Emacs configuration.

Emacs Lisp: defvar-keymap hints for which-key

| emacs

Emacs has far too many keyboard shortcuts for me to remember, so I use which-key to show me a menu if I pause for too long and which-key-posframe to put it somewhere close to my cursor.

(use-package which-key :init (which-key-mode 1))
(use-package which-key-posframe :init (which-key-posframe-mode 1))

I've used which-key-replacement-alist to rewrite the function names and re-sort the order to make them a little easier to scan, but that doesn't cover the case where you've defined an anonymous function ((lambda ...)) for those quick one-off commands. It just displays "function".

Pedro A. Aranda Gutiérrez wanted to share this tip about defining hints by using cons. Here's his example:

(defun insert-surround (opening &optional closing)
  "Insert OPENING and CLOSING and place the cursor before CLOSING.

Default CLOSING is \"}\"."
  (insert opening)
  (save-excursion
    (insert (or closing "}"))))

(defvar-keymap tex-format-map
  :doc "My keymap for text formatting"
  "-"  (cons "under" (lambda() (interactive) (insert-surround "\\underline{")))
  "b"  (cons "bold"  (lambda() (interactive) (insert-surround "\\textbf{")))
  "e"  (cons "emph"  (lambda() (interactive) (insert-surround "\\emph{")))
  "i"  (cons "ital"  (lambda() (interactive) (insert-surround "\\textit{")))
  "m"  (cons "math"  (lambda() (interactive) (insert-surround "$" "$")))
  "s"  (cons "sans"  (lambda() (interactive) (insert-surround "\\textsf{")))
  "t"  (cons "tty"   (lambda() (interactive) (insert-surround "\\texttt{")))
  "v"  (cons "Verb"  (lambda() (interactive) (insert-surround "\\Verb{")))
  "S"  (cons "small" (lambda() (interactive) (insert-surround "\\small{"))))
(fset 'tex-format-map tex-format-map)

Let's try it out:

(with-eval-after-load 'tex-mode
  (keymap-set tex-mode-map "C-c t" 'tex-format-map))

2026-03-02_14-45-51.png
Figure 1: Screenshot with hints

This works for named functions as well. Here's how I've updated my config:

(defvar-keymap my-french-map
  "l" (cons "🔍 lookup" #'my-french-lexique-complete-word)
  "w" (cons "📚 wordref" #'my-french-wordreference-lookup)
  "c" (cons "✏️ conj" #'my-french-conjugate)
  "f" (cons "🇫🇷 fr" #'my-french-consult-en-fr)
  "s" (cons "🗨️ say" #'my-french-say-word-at-point)
  "t" (cons "🇬🇧 en" #'my-french-translate-dwim))
(fset 'my-french-map my-french-map)

(with-eval-after-load 'org
  (keymap-set org-mode-map "C-," 'my-french-map)
  (keymap-set org-mode-map "C-c u" 'my-french-map))

2026-03-02_13-57-23.png
Figure 2: Before: Without the cons, which-key uses the full function name

2026-03-02_14-42-42.png
Figure 3: After: Might be easier to skim?

In case you're adding to an existing keymap, you can use keymap-set with cons.

(keymap-set my-french-map "S" (cons "sentence" #'my-french-say-sentence-at-point))
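
As far as I understand it, the (cons LABEL COMMAND) form has the same shape as a simple menu item, so the label is only used for display and key lookup still resolves to the command. Here's a small sketch you can evaluate to check that. my-demo-greet and my-demo-hint-map are made-up names for illustration.

```elisp
;; Hypothetical command and keymap, just for this check.
(defun my-demo-greet ()
  "Display a greeting."
  (interactive)
  (message "Bonjour !"))

(defvar-keymap my-demo-hint-map
  :doc "Demo keymap with a which-key label."
  "g" (cons "greet" #'my-demo-greet))

;; Bindings added later can carry a label too.
(keymap-set my-demo-hint-map "G" (cons "greet again" #'my-demo-greet))

;; Looking up either key gives back the command, not the (LABEL . COMMAND) pair.
(list (keymap-lookup my-demo-hint-map "g")
      (keymap-lookup my-demo-hint-map "G"))
```

That's why you can add labels to an existing keymap without changing how the keys behave.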

This is also different from the :hints that show up in the minibuffer when you have a repeating map. Those are defined like this:

(defvar-keymap my-french-map
  :repeat (:hints ((my-french-lexique-complete-word . "lookup")
                   (my-french-consult-en-fr . "fr")
                   (my-french-translate-dwim . "en")
                   (my-french-say-word-at-point . "say")))
  "l" (cons "🔍 lookup" #'my-french-lexique-complete-word)
  "w" (cons "📚 wordref" #'my-french-wordreference-lookup)
  "c" (cons "✏️ conj" #'my-french-conjugate)
  "f" (cons "🇫🇷 fr" #'my-french-consult-en-fr)
  "s" (cons "🗨️ say" #'my-french-say-word-at-point)
  "t" (cons "🇬🇧 en" #'my-french-translate-dwim))

and those appear in the minibuffer like this:

2026-03-02_13-59-42.png
Figure 4: Minibuffer repeat hints

Menus in Emacs are also keymaps, but the labels work differently. These ones are defined with easymenu.

(with-eval-after-load 'org
  (define-key-after org-mode-map
    [menu-bar french-menu]
    (cons "French"
          (easy-menu-create-menu
           "French"
           '(["🕮Grammalecte" my-flycheck-grammalecte-setup t]
             ["✓Gptel" my-lang-gptel-flycheck-setup t]
             ["🎤Subed-record" my-french-prepare-subed-record t])))
    'org))

Using your own hints is like leaving little breadcrumbs for yourself.

Thanks to Pedro for the tip!
