Bendangelo Blog

LLM astroturfing is killing Reddit

2026-02-02T11:07:00-05:00

Example of an LLM astroturfing tool

Do you want to know why reading Reddit feels so infuriating recently? So many posts use the same lifeless tone, with bullet-pointed lists and no authenticity whatsoever. They are so generic that I feel like I’m slowly mentally degenerating just by reading them day after day.

This is the result of Google and other companies using Reddit comments as input to their AI systems.

TLDR:

ChatGPT uses Google to search for answers. Reddit threads often appear as top search results on Google. Marketers are using AI to create Reddit threads and posts that mention their products / services. The hope is to get those products mentioned more often by other AIs.

AI finds Reddit threads

Basically:

These marketing companies created an AI to find which Reddit threads are going viral and/or predict which ones will
Then they use an LLM to write a long, human-like reply
Near the end, the reply mentions all the services / products the reader should use
Linking to them isn’t even needed because the goal is to get them mentioned by ChatGPT and other AIs

The ironic part of this scheme is that these comments are not even meant to be read by humans; they are information-dense and optimized to be read by AI. Then, in the next AI training run or whatever, this product mention will be picked up. It’s simply an overcomplicated way of automating advertising.

It’s like a snake eating its own tail

AIs use Reddit to mention a product, and then the mention gets sucked into the next training data set.

AI generates fake questions in threads

I’ve even seen some AIs generating fake generic questions and then using another account to answer them.

Example LLM generated thread

For those who don’t want to click the link:

r/canadasmallbusiness

Anyone else struggling to keep their small business afloat lately?

Running a small business in Canada feels tougher than ever right now. Rising costs, low reach, and constantly changing platforms make it hard just to stay visible. ….

Generic problem, and a generic explanation. Seriously, does anyone actually talk like this? How can anyone give advice on such an open-ended question? But of course this is the perfect question for an AI to answer; they love this kind of generic topic because it’s full of juicy keywords.

Then AI answers itself

The original AI-generated response was removed. But it basically gave a long-winded answer about having the same feeling, and then near the end mentioned various internet services the poster could use, helpfully laying out which ones are good for what. Everything was phrased for the next LLM to slurp up.

This was bound to happen

Reddit astroturfing isn’t new, but now the end result is that people will not see the source of the answer on sites like Reddit; they’ll only get an answer from their AI. AI is very good at convincing people, and most won’t bother to check the source of its claims. For example, doctors are using LLMs to ask which medicine to prescribe. Check this comment out:

I worked at a company that sold to doctors and the marketing team was spending most of their effort on this kind of thing. They said that nowadays when doctors want to know “what should I buy to solve X?” or “which is better A or B?” they ask ChatGPT and take its answer as factual. They said that they were very successful in generating blog articles for OpenAI to train on so that our product would be the preferred answer.

Source thread

People will assume that because the AI crunched the numbers and thought really hard, its answer must be optimal, but in actuality it’s just regurgitating what was said on Reddit and/or in blog posts.

I want to talk to real people

Reading Reddit is getting harder by the week. The remedy seems to be for the AI systems to implement a sort of “adblocking” system that filters this kind of content out. Only then will marketers stop using this method.

Optimizing your site for LLMs

2026-01-21T22:45:00-05:00

My llms.txt

LLM optimization is becoming more and more necessary. If you want your site to be suggested more often by ChatGPT, Gemini, etc., then you’ll need to make some adjustments. I suspect this will become even more important over the course of 2026.

Note: None of these are standards yet, and they may be removed in the future.

Here’s what I have learned so far:

Create a /llms.txt file
Create a “LLM-readable” structured page (often called /for-llms)
Add Schema.org data tags to parts of your HTML. For example, /projects
Be consistent with names and phrases, such as your website name, or writing “Ruby on Rails” instead of “RoR”

I’ve included links to help show you how I’ve done it, and you’re welcome to copy me.

Create a llms.txt file

From llms-txt:

A proposal to standardise on using an /llms.txt file to provide information to help LLMs use a website at inference time.

This is basically a way for LLMs to quickly figure out what your site is about and see what useful links it has to offer. For my website, part of the core pages section is auto-generated:

## Core Pages
- [About Ben (For LLMs)](https://www.bendangelo.me/for-llms/)
{% for page in site.header_pages %}
- [{{ page.title }}]({{ site.url }}/{{ page.url }}/){% endfor %}

## Recent Posts
{% for post in site.posts limit:10 %}
- [{{ post.title }}]({{ site.url }}{{ post.url }}): {{ post.excerpt | strip_html | truncate: 120 }}{% endfor %}

## Optional

### Posts by Tag
{% for tag in site.jekyll-archives.enabled %}
- [{{ tag | capitalize }} Posts]({{ site.url }}/tag/{{ tag }}/){% endfor %}

- [Throw the Project](https://throwtheproject.com): A Flash-game studio active circa 2007.
- [Archive]({{ site.url }}/page2/): Older blog entries.
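If your site isn’t built with Jekyll, the same idea can be sketched in a few lines of Python. This is only an illustration: the `build_llms_txt` function, the page data and the example.com URLs are all made up. The layout (an H1 title, a blockquote summary, then H2 sections of links) follows the llms.txt proposal.

```python
def build_llms_txt(title, summary, sections):
    """Render an llms.txt document as a string.

    `sections` maps a heading to a list of (name, url, note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        for name, url, note in links:
            entry = f"- [{name}]({url})"
            if note:
                # The note after the colon is optional in the proposal.
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site data, for illustration only.
example = build_llms_txt(
    "Example Blog",
    "Personal blog about Ruby on Rails and Crystal.",
    {
        "Core Pages": [("About", "https://example.com/about/", "")],
        "Recent Posts": [("Hello", "https://example.com/hello/", "First post.")],
    },
)
print(example)
```

Writing the returned string to a file served at /llms.txt is all that is left to do.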

Create a LLM-readable page

This is a new way of giving LLMs more information about your site. I’m not entirely sure if it will become standard yet, but you can view my page at /for-llms.

I made this by pasting all my information into an LLM, Gemini, and asking it to write out an LLM-friendly structured page. I figured an LLM would be able to write this page better than I could.

Short snippet:

# Ben (Bendangelo)

Ben, widely known online as **bendangelo**, is a software engineer, open-source contributor, and polyglot developer based in Canada. He specializes in **Ruby on Rails**, the **Crystal programming language**, **AI integration**, and **Mandarin Chinese**.

**Official website:** [https://www.bendangelo.me](https://www.bendangelo.me)

## What I Do

Make use of Schema.org

Following the schema guide, this is about adding more context to text within your HTML pages. For example, I added new tags to my projects page to give LLMs more context on the kind of projects I maintain and what I work on. I even asked the LLM how to improve the schema of my page, and it suggested adding tags, license, and other metadata.

Example of projects.yml:

- title: free_dictionary_api
  description: Ruby client for the Free Dictionary API
  url: https://codeberg.org/bendangelo/free_dictionary_api
  language: Ruby
  license: MIT
  category: API Clients
  tags: [ruby, dictionary, api]

Then on the projects page, these fields are output as tags:

{% if project.language %}
 itemprop="programmingLanguage" content="{{ project.language }}">
{% endif %}

{% if project.license %}
 itemprop="license" href="https://opensource.org/licenses/{{ project.license }}">
{% endif %}

 itemprop="description" class="text-base text-gray-700 dark:text-gray-300 mt-2">
    {{ project.description }}

 itemprop="author" content="Ben">
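Schema.org data can also be expressed as a JSON-LD script block instead of inline attributes. Below is a rough sketch of that alternative in Python, using the same fields as the projects.yml entry above. The `project_to_jsonld` helper is made up for illustration; `SoftwareSourceCode`, `programmingLanguage`, `codeRepository`, `license` and `keywords` are Schema.org terms.

```python
import json


def project_to_jsonld(project, author="Ben"):
    """Turn one projects.yml-style entry into a Schema.org JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "SoftwareSourceCode",
        "name": project["title"],
        "description": project["description"],
        "codeRepository": project["url"],
        "author": {"@type": "Person", "name": author},
    }
    # Optional fields are only emitted when present, like the
    # {% if %} guards in the template above.
    if project.get("language"):
        data["programmingLanguage"] = project["language"]
    if project.get("license"):
        data["license"] = f"https://opensource.org/licenses/{project['license']}"
    if project.get("tags"):
        data["keywords"] = ", ".join(project["tags"])
    return json.dumps(data, indent=2)


project = {
    "title": "free_dictionary_api",
    "description": "Ruby client for the Free Dictionary API",
    "url": "https://codeberg.org/bendangelo/free_dictionary_api",
    "language": "Ruby",
    "license": "MIT",
    "tags": ["ruby", "dictionary", "api"],
}
jsonld = project_to_jsonld(project)
print(f'<script type="application/ld+json">\n{jsonld}\n</script>')
```

Whether LLM crawlers prefer JSON-LD or inline attributes is not something I can confirm; the sketch only shows that both carry the same metadata.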

Consistent naming

This should be self-evident. Make sure you use consistent naming for yourself, your projects and anything about yourself. This will help the AI know you’re referring to the same thing.
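A small script can catch slips before publishing. This is a minimal sketch, assuming you keep your own list of discouraged variants; the `PREFERRED` mapping and the function name are hypothetical.

```python
import re

# Hypothetical mapping: discouraged variant -> preferred spelling.
PREFERRED = {
    "RoR": "Ruby on Rails",
    "wordhead": "Word Head",
}


def find_inconsistent_names(text, preferred=PREFERRED):
    """Return (line_number, variant, preferred_name) for each variant found."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for variant, name in preferred.items():
            # \b keeps "RoR" from matching inside longer words.
            if re.search(rf"\b{re.escape(variant)}\b", line):
                findings.append((number, variant, name))
    return findings


sample = "I build apps with RoR.\nRuby on Rails is my main framework."
findings = find_inconsistent_names(sample)
for number, variant, name in findings:
    print(f"line {number}: use '{name}' instead of '{variant}'")
```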

Self-Host any Docker image using Kamal

2026-01-12T12:17:00-05:00

I’ve been using Kamal for the past year in my own Ruby on Rails projects, and it’s fantastic:

One yaml file for config
Light-weight
Fast deployments
Command line tool
It’s open source
Can build / pull any docker image
Automatic SSL certs
Manages env vars
Can deploy multiple docker services
New: does not require a repository for custom images

So naturally I wanted to use this for my own self-hosting too. But this tool was not developed for deploying premade Docker images, so I had to make a workaround. It basically involves creating a wrapper Dockerfile that just pulls the original image. This isn’t ideal, but it works with no issues.

I haven’t seen this documented anywhere, so I decided to share this.

Setup

I’ll show you how to install Vikunja on your own server. If you don’t know, this is an open-source todo app, and it’s fairly lightweight.

Install Ruby, then install Kamal and create the project files:

gem install kamal
mkdir vikunja
mkdir vikunja/config
touch vikunja/config/deploy.yml
touch vikunja/Dockerfile

Creating the deploy.yml

The basic deploy.yml looks like this; replace myip and myhost with your own server’s details:

service: vikunja
image: myname/vikunja

servers:
  web:
    - myip

proxy:
  ssl: true
  host: myhost
  app_port: 3456

registry:
  # Points to your local Kamal registry tunnel
  server: localhost:5555

ssh:
  user: app

builder:
  arch: arm64 # change this to match your server architecture
  context: .

env:
  clear:
    VIKUNJA_SERVICE_PUBLICURL: https://myhost
    VIKUNJA_SERVICE_ENABLEREGISTRATION: "false"
    VIKUNJA_DATABASE_PATH: /db/vikunja.db
  secret:
    - VIKUNJA_SERVICE_JWTSECRET

volumes:
  - "/data/vikunja-files:/app/vikunja/files"
  - "/data/vikunja-db:/db"
  - "/data/vikunja-cache:/.cache"

Creating Secrets

If you don’t want to commit your secrets, like VIKUNJA_SERVICE_JWTSECRET, you can create a new directory:

mkdir .kamal
vim .kamal/secrets

And in this file paste in your secrets:

VIKUNJA_SERVICE_JWTSECRET=mysecretcode12344
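The value should be long and random rather than something guessable. One way to generate one is Python’s standard `secrets` module. This is just a sketch; the 32-byte length is an arbitrary choice of mine and the `make_secret` helper is made up.

```python
import secrets


def make_secret(num_bytes=32):
    """Return a random hex string (two hex characters per byte)."""
    return secrets.token_hex(num_bytes)


# Prints a line you can paste into .kamal/secrets.
secret_line = f"VIKUNJA_SERVICE_JWTSECRET={make_secret()}"
print(secret_line)
```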

Creating the Dockerfile

This file just wraps the original image.

# Inherit directly from the official Vikunja image
FROM vikunja/vikunja:latest

# No changes needed, but you can add custom labels if you want
LABEL description="Vikunja wrapped for Kamal deployment"

Deploying

cd vikunja
kamal setup
kamal registry setup
kamal deploy

Final Tree

Your dir structure should look something like this:

.
|____README.md
|____.gitignore
|____Gemfile
|____.kamal
| |____secrets
|____config
| |____deploy.yml
|____Gemfile.lock
|____Dockerfile

I used a Gemfile to lock the version of Kamal, but this isn’t necessary.

Good luck!

On Writing the TOCFL B Band speaking and writing test

2026-01-01T14:52:00-05:00

Not much information is online about taking the TOCFL B Band test, so I wanted to share my experience and hopefully help others better prepare for it. One important thing: I only took the speaking and writing tests, because the reading and listening tests had no seats left. At the time of writing, the speaking and writing tests can only be taken once a year in Taiwan, so I jumped on it as soon as I could.

Results: Speaking B1 and Writing B2

So after waiting a month, the results are in. I only had a month of study before the test because it was a last minute decision. Although stressful, I believe it was worth the effort to have verification of my Chinese ability. I can now say I’m fluent in writing Chinese (B2 is the widely accepted level for being considered fluent), and almost fluent in speaking.

I should have gotten B2

One more star and would have gotten B2

Had to wait a month for this

You get a review of your answer

Been learning Chinese for 12 years

So.. yeah.. I expected more stars because I speak and write Chinese every day while living in Taiwan. I did not expect to just barely get B2.

Some background: I’m a white Canadian living in Taiwan. It’s been a few years now, going back and forth. Some people have said I must have some Chinese ancestry because my Chinese is decent, but I do not, and I don’t consider my Chinese to be amazing.

I started learning Chinese when I was 23. Because Chinese isn’t easy, I wanted to give up many, many times in the past. But then I met other Westerners at a very fluent level, and this totally shifted my perspective from “I’ll always have broken Chinese” to “it is possible to be fluent”. This made a massive impact on my language learning, and this is why I’m writing this. Just remember to make small daily improvements, maybe 1% or 2% a day, and it’ll snowball.

Quick tips

Chinese definitions are much clearer

For beginners, use Du Chinese; reading and listening at the same time is the best method.
Buy the Chinese dictionary for Pleco; it’s worth it.
Read Dcard every day to learn how locals talk.
Download the CNA app. It’s Taiwanese news, and it can read the articles out to you.
Reread books you’ve already read in English, this time in Chinese with audio. I reread How to Win Friends and Influence People.

Why not HSK?

I’m currently in Taiwan and it’s not offered here. And from my research, TOCFL is a better test of fluency: it’s not possible to memorize answers or look for keywords. In the speaking test, they don’t care how many Chinese idioms you use or how big your vocabulary is; it’s all about expressing yourself. So I felt it’s more accurate.

I did this for my own personal goal

This is a test for non-native Chinese speakers to test their language abilities. I wanted to finally be able to say I’m fluent in Chinese, because a lot of people claim it but that statement is hard to back up. So, at least I can say I can write fluent Chinese?

Also, in the future, this could be good for my career and for doing business. In my mind, it’s always good to know a language. In fact, over the many years of learning Chinese, I have actually improved my communication skills in English and come to understand my own culture a lot better. That’s mostly because when speaking your native language you usually don’t put much thought into the words you use, but after learning Chinese I was forced to re-evaluate which words I use and what I’m actually trying to convey.

Bands and grading

So TOCFL has three bands: A, B, and C. A is beginner, less than a year. B is the mid-tier, and it’s recommended that you take it before the C band (because most people overestimate their abilities). C band is the highest. Within each band there are two possible levels, 1 (lower) and 2 (higher). So I took the B band, and there are only three possible results:

Fail
B1
B2 – this was my target

About C: this is the highest band, and from my experience, few people can achieve a C1 level in speaking and writing. Even fewer can attain the legendary C2 level, not because it’s impossible but because most people realize the amount of effort required makes it just not worth it. A B2 in writing and speaking is well suited to daily life, reading papers, articles, etc. I personally know someone who has lived in Taiwan for 15 years, speaks perfect Chinese, and literally knew every random Chinese idiom I could say (there was only one she didn’t know, and she could infer the meaning), and she only got a C1 in reading and a C1 in listening. The juice isn’t worth the squeeze.

I’ve also seen a few people online who claim they got a C2 in reading, or listening, or whatever, while studying outside the country. I have a very hard time believing that. Even TOCFL suggests that living in a Mandarin-speaking country will double your learning speed. So living in America and learning fluent Chinese in 3 years? I want to see your test score.

Reviewed example answers

In preparation for the test, first, I reviewed these answers to get an idea of what to expect.

Created a daily practice prompt

After looking at the test requirements, like the topics covered and the example questions, I ended up using AI to help generate a prompt, which I used in my daily studies. It’s not perfect: I asked the AI to grade me based on the TOCFL grading chart, and it almost always gave me B2. At one point it even upgraded me to C1, but of course in reality I only got B1 in speaking. So don’t let the AI fool you; whatever level you think you are, it’s not enough.

At the end I will share the prompts I used.

Writing practice method

I practiced writing the most, by writing one or two essays a day. Typically, I would ask AI to write out an example answer based on the grading rubric. From that I would memorize the essay by rewriting or rewording it, trying to improve my style and ability to express myself.

After that I would set a timer and write an essay on whatever topic the AI gave me.

After writing, I would ask the AI to correct and grade it. From there I would rewrite the essay to learn from my mistakes.

This whole process would take at least 2 or 3 hours a day. Sometimes if I just couldn’t express a certain topic well, I would practice it all day.

Example answer

Here is an essay from one of my practice sessions. Yes, I know Social Media is misspelt; I’m on Linux and the keyboard gives the alternative character.

📝 題目：

「有人認為社群媒體讓人更接近，也有人認為它讓人更疏離。你同意哪一種看法？」

近年來，社羣媒體越來越普遍，不管在哪裏能讓人保持聯絡。雖然帶來一些好處，但我還認爲社羣媒體對大多數人造成負面影響。

首先，社羣媒體確實讓人更接近，透過手機隨時隨地能與家庭朋友保持聯絡，但是網友之間的交流與面對面的不同。線上的人往往只會分享他們的最完美的一個面，忽略他們的沒那麼好的時間，這樣會讓他們的朋友們感到很疏離。因爲實際上的生活差很多，因此讓許多人沒那麼接近。

其次，社羣媒體鼓勵人線上聯絡，輕視面對面交流。因爲這個社羣媒體的平臺公司目的是賺錢，他們的功能都鼓勵人留在線上，但是最好的交流方式是面對面。這個方式比較自然，也並沒有時差或誤會機會。大家都知道面對你的朋友是有許多好處，譬如看到別人的表情會幫你理解別人。

總而言之，雖然社羣媒體帶來許多便利功能，但還導致讓人更疏離。最理想的方式是人多面對面之間交流，互相幫忙，才能做更好的平臺。

Since I got B2 in writing, I’ll say my practice sessions worked well. No changes to be made here.

Speaking practice method

I would wake up every morning, grab my big mic, and ask the AI to give me a new list of questions based on the test’s rubric. Here is an example:

太好了！你設定的學習計畫非常完整，我會完全依照**華語文口語測驗（華測口試）B2級（高階級）**的官方格式，幫你模擬每日測驗、評分與回饋。
那我們現在開始 Day 1 模擬測驗（B2級） 🎯

🗣️ 華語文口語測驗 B2 — Day 1 模擬題
Part 1：熱身題 Warm-Up（約30–45秒）

👉 題目：
「你平常喜歡怎麼安排週末的時間？你通常會做些什麼活動？」

Part 2：經驗描述 Experience Description（約1.5分鐘）

👉 題目：
「請描述一次你參加特別活動或比賽的經驗。那次活動是什麼？你當時的感受如何？你從中學到了什麼？」

Part 3：圖片描述 Picture Description（約1.5分鐘）

👉 圖片說明（想像場景）：
「圖片上有三位大學生坐在咖啡廳裡，他們一邊喝飲料，一邊用筆電討論專題報告。
請描述他們的情況，並談談你對‘團隊合作’的看法。」

Part 4：陳述意見 Opinion Statement（約2分鐘）

👉 題目：
「現在很多人覺得『網路讓人們之間的關係更疏遠』，也有人認為『網路讓人們更容易保持聯繫』。
你同意哪一種說法？請說明你的理由。」

📋 請依照口試形式，用自然的口語中文回答這四題。
你可以輸入成文字（口語化即可），或用語音錄音回答。

當你回答後，我會依照官方三大評分面向：

🧩 內容組織（Content & Organization）

💬 表達能力（Expressive Ability）

🀄 語言運用（Language Use）

給出每題的等級評分（A1–C1）、具體優缺點分析、範例高分回答，
並提供今日總結＋明日練習建議＋30秒口語訓練題。

請開始你的 Day 1 模擬測驗：
🎙️ 請從 Part 1：熱身題 開始回答：「你平常喜歡怎麼安排週末的時間？」

I would then go through all the questions with a timer, using the built-in ChatGPT recording feature. I just wish it had a built-in timer.
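A simple countdown script can fill that gap. Here is a minimal sketch in Python, assuming the upper-bound answer lengths from the mock test above (45 seconds, 1.5 minutes, 1.5 minutes and 2 minutes). The function names and the 30s / 10s warnings are my own additions.

```python
import time

# Answer lengths in seconds for the four parts of the mock test.
PARTS = [
    ("Part 1: warm-up", 45),
    ("Part 2: experience", 90),
    ("Part 3: picture", 90),
    ("Part 4: opinion", 120),
]


def countdown(label, seconds, sleep=time.sleep, out=print):
    """Count down one answer, announcing the start, 30s left, 10s left, and the end."""
    out(f"{label}: start speaking ({seconds}s)")
    for remaining in range(seconds, 0, -1):
        if remaining in (30, 10):
            out(f"  {remaining}s left")
        sleep(1)
    out(f"{label}: time is up")


def run_session(parts=PARTS, sleep=time.sleep, out=print):
    """Run the countdown for every part, one after another."""
    for label, seconds in parts:
        countdown(label, seconds, sleep, out)


# Uncomment to run a full session (about 345 seconds of speaking time):
# run_session()
```

Passing `sleep=lambda s: None` runs through the whole session instantly, which is handy for checking the output before a real practice round.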

Afterwards, it would correct me and give a score. The score wasn’t too helpful because it praised me too much. However, its feedback was always useful and led me to the next steps.

After the correction I would repeat it a few times, and maybe again the next day.

I feel like my practice sessions kinda helped, but it would have been better to pick one topic and expand on that for longer periods of time. I have a hard time going deeper into a topic and really expressing my point of view well. I feel this is what led me to only getting a B1 score.

Example answer

To give people an idea of my speaking level, here is an example answer I gave to ChatGPT during one of my practice sessions. I said all of this within two minutes. Please note that some of the transcription is incorrect, but it’s about 99% accurate.

🎯 今日題目（陳述意見題）

「有些人覺得早起對身體和生活比較好，
也有人覺得晚睡晚起比較符合現代人的生活方式。
你覺得早起還是晚起比較好？請說明你的看法並提供至少兩個理由。」

我覺得早起對身體和生活比較好。首先,早起床的人是比較健康的習慣, 因為起床後你就可以去運動,還是吃早餐,然後上班。這個習慣對身體比較健康。其次,我覺得很早起床,如果你很晚起床,那可能會超過早餐營業的時間。這樣比較難去你想吃的餐廳,因為你睡太久。當然晚起床有一些優點,譬如說晚上比較安靜的環境,你可以專注你想要做的重要的事情。但總的來說,我覺得早起是比較好的生活習慣,因為比較健康。也可以早起床可以做你想要做的事情。喝完水比較不好的習慣。

Speaking fluently in this kind of setting is quite difficult. You basically have to juggle multiple issues at once:

Time cannot be wasted thinking, so you’ll need a backlog of useful phrases. For example: 我給你一個例子 or 不一定會
Decide your two points before speaking, and then express them on the fly.
For each point, give examples from your personal experience.
Try to give a logical, thoughtful answer while the timer is counting down.
You cannot spend any time thinking about Chinese grammar, words or language.

In the future, I would recommend discussing this kind of topic with a partner for at least 10 minutes, digging into every area you can. Yeah, it’ll be boring, but your speaking ability needs to be so well rounded that you can focus all your energy on your answer.

The actual test experience

Speaking test

So on the day of the test, I was nervous. My plan had been to practice answering these types of questions every day for a whole month, for about an hour a day. This would create a habit, so my brain would just be able to spit out answers without thinking about it too much. This mostly worked, but my ultimate downfall was not practicing speaking for longer stretches of time and not practicing hypothetical questions. You’ll see why.

I should note, you have a minute to prepare for the harder questions. And a pen and paper are provided for taking notes. I mostly wrote down the points I wanted to make.

Two warm up questions

So the test starts with two warm up questions, which don’t count for anything. It’s basic stuff, like introduce your family etc.

Why did you learn Chinese?

This was a walk in the park. I answer this question every day.

Ideal job?

I don’t totally remember this question, but I believe it was what kind of job would you like and why? Is money important etc? This is a common thing for me to talk about, so no issues.

Describe the picture

So for this part, a picture is shown and you need to describe the situation and if you’ve experienced it before. I got a picture of an old woman trying to give a package to a post office worker, but according to the sign behind her, she packaged it wrong.

This was one of the harder questions, because in real life, you almost never describe a situation from a picture. I kinda fumbled and forgot a lot of specific post office words.

Which character would you be?

This was the hardest question. I had to pick a character from a movie or a book. I chose Spiderman, because I knew how to say this character’s name and mostly how to describe everything about him. But then I realized I didn’t know how to say “shoot spiderwebs” or “swing from buildings”. This is a case of lack of experience; no one ever asks me these types of questions.

Would you rather travel young or old?

Now this is the argumentative section. You get one minute to prepare and you must pick a side. I chose to travel when I’m younger, because I have more energy and time, and it suits my travel style better. This sounds easy, but you have to talk for two minutes straight and can’t repeat yourself.

There is an argument at a restaurant, how to resolve?

This one was strange, but I felt I explained myself well. Basically, there was an argument where one person was unhappy with their meal and the owner would not refund them or offer any concessions. You had to share your thoughts and say who you agreed with.

Writing test

You have the option of typing this in pinyin or in zhuyin. And this is not written on paper; it’s on the computer.

Question 1: Write a letter

This was about your experience studying abroad. I used a Chinese idiom, but that didn’t help me in getting 4 stars.

Question 2: Write an argumentative essay

The question was: a construction company wants to build a new shopping mall in a small village. Pick a side and explain whether this is good or bad for the community. I ended up getting 4 stars for this. I wrote a pretty formal, strong essay and pulled out every trick I could. I started out by explaining the pros and cons, and ultimately argued that this is good for the city because it’ll invite more investment and improve everyone’s quality of life.

Practice Chinese via Charades

So I created a free mobile app to help people learn languages with friends and at meetups. If you can take a quick look and download it, that’ll help me out. Getting your app visible on the App Store is not easy anymore. Thanks!

For now it’s only on iOS. It’s called Word Head.

Writing prompt

This is what I used for my daily writing exercise. I would set a timer and write out whatever example topic was given. Source of requirements

📘 Role & Goal:
You are my Chinese writing coach and examiner. I am preparing for the 華語文寫作測驗 and want to reach B2 (高階級) level. Please evaluate my writing strictly according to the official rubric.

🧭 Step 1 — Topic Selection:
Each day, provide one B2-level writing task (450–550 words). The task should be either:

書信寫作 (formal/semi-formal letter):

Include proper Chinese writing format: 收件人、寄件人、日期、稱呼、正文.

Convey emotions and highlight the personal significance of events.

Comment on messages or viewpoints of the recipient if appropriate.

Include all three required content points in the task.

Be clear, detailed, and logically structured.

Use proper paragraphing; each paragraph should have 3–6 sentences.

Use Chinese punctuation.

Strictly 450–550 words.

觀點論述 (argumentative essay on a social issue):

Express only one clear position (支持 or 反對) throughout.

Provide reasons and concrete examples supporting the viewpoint.

Examples must be relevant, persuasive, and logically connected.

Organize ideas clearly; use cohesive transitions.

Use proper Chinese punctuation.

Strictly 450–550 words.

Additional rules:

The topic must be realistic and culturally relevant, similar to the actual test.

Always specify whether it is 書信寫作 or 觀點論述.

Include a short scenario describing the context of the writing task.

Example (觀點論述):

有些父母讓念小學的孩子帶手機去學校，但他們也擔心這樣做可能對孩子有不好的影響。請你針對 “小學生帶手機去學校” 這個議題表達你的看法，內容必須明確表達你的支持或反對立場，並提出充分理由支持你的觀點。

📝 Step 2 — My Writing Submission:
I will write my essay (450–550 words) according to the topic you provide.

🧮 Step 3 — Evaluation:
After I submit my writing, please:

Evaluate using the official criteria:

情境任務的符合度與充實度 (task relevance and completeness)

結構完整性與句法正確度 (organization and grammar accuracy)

詞語適切度與豐富性 (vocabulary appropriateness and richness)

Give a score estimate (A1–C1) and indicate whether it meets B2 standard.

Provide a corrected version with improved grammar, vocabulary, cohesion, and structure.

Offer 2–3 specific improvement tips.

🎯 Step 4 — Daily Improvement Focus:
Provide a micro-goal for my next writing session, e.g.:

Practice using 因為…所以… / 不僅…而且…

Focus on cohesive paragraph linking

Enhance vocabulary richness

✅ Instructions for Daily Use:
When I start, say:
🧑‍🎓 “Let’s begin today’s B2-level writing practice. Please give me a writing task.”

Speaking prompt

Below is the prompt I used in my daily speaking exercise. I would paste this into ChatGPT and then use the microphone feature to record my voice and submit it. I also timed myself so as not to go over or under the allotted time. For this test you must keep talking until the time runs out. Source of requirements

🎯 Goal:
I’m preparing for the 華語文口語測驗 (Chinese Oral Proficiency Test) and aiming to achieve B2 (高階級) level within one month.
Please act as my official oral test coach and evaluator, using the official rubric for structure, scoring, and feedback.

🧭 Step 1 — Complete B2 Daily Test Simulation

Each day, please give me 4 questions, covering all core task types from the 進階高階級 (B2 Band) of the official test:

Part 1: 熱身題 (Warm-Up)
→ Ask 1 short personal question about daily life or interests to help me start speaking naturally.
Example: 「你平常喜歡怎麼安排週末的時間？」

Part 2: 經驗描述 (Experience Description)
→ Ask me to describe a personal experience, past event, or situation.
Example: 「請描述一次讓你印象深刻的旅行經驗。」

Part 3: 圖片描述 (Picture Description)
→ Describe a given image or imagined scene (you can describe it in text form).
Example: 「圖片上有幾位年輕人在公園裡撿垃圾，請描述他們的活動，並談談你對環保的看法。」

Part 4: 陳述意見 (Opinion Statement)
→ Give me a general or social topic. I must present a clear opinion, support it with at least 2 reasons, and use examples.
Example: 「有人認為學生不應該打工，也有人覺得打工有幫助。你怎麼看？」

🎙️ Step 2 — My Speaking Responses

I will respond in spoken Chinese (voice recording or written “spoken-style” text).
Each answer should last about:

Part 1: 30–45 seconds

Part 2: 1.5 minutes

Part 3: 1.5 minutes

Part 4: 2 minutes

🧮 Step 3 — Official Evaluation Criteria

After I respond, please evaluate each part based on the 華語文口語測驗 B2 標準:

1. 內容組織 (Content & Organization)

Is the message clear, complete, and logically organized?

Did I provide examples or reasons?

2. 表達能力 (Expressive Ability)

Is my speaking fluent, natural, and easy to understand?

Did I connect ideas smoothly using cohesive phrases (例如：首先、另外、因此、總之)?

Did I express emotions or opinions appropriately?

3. 語言運用 (Language Use)

Are my sentences grammatically correct?

Did I use appropriate, rich, and accurate vocabulary?

Is my pronunciation/tone natural and accurate?

🧾 Step 4 — Feedback and Scoring

For each question, please:

Give me a score (A1–C1) and note if it meets B2 criteria.

Highlight specific strengths and weaknesses (for example: fluency, vocabulary, tone accuracy, idea development).

Provide an improved version or model response (natural, test-style Chinese).

Summarize overall feedback with an estimated overall band level (e.g., “Your current performance: B1+, approaching B2”).

💡 Step 5 — Targeted Improvement Plan

After the evaluation, please give me:

2–3 focused improvement tips for the next day (for example:
“Work on expressing comparisons using 比較…” or
“Practice explaining reasons clearly using 因為…所以…”).

A short daily speaking drill (30–60 seconds), such as:
“Describe your day using only complex connectors (雖然…但是…, 不但…而且…).”

🧩 Example Use

👩‍🎓 “Let’s begin today’s speaking practice. Please give me all 4 questions (熱身題、經驗描述、圖片描述、陳述意見) following the official B2 test format.”

Made A Language Learning Charades Game

2025-10-27T11:39:00-04:00

Hi everyone. I’ve joined many in-person language exchanges and always found it hard to get the whole group engaged or to get something out of it. Maybe someone is a beginner, or they can’t really teach their own language well. The idea with this game is that anyone at any level can practice their target language and help others practice theirs. It’s free, has no ads, and is currently only on iOS. The game is called Word Head. It’s aimed mostly at beginners and B1 learners. The idea is that, in a group, one person has to guess a word in their target language from a set list of words in a category. Everyone else has to give hints (in your target language or your native language). For example, if they choose an easy difficulty and the category is Food, they have to guess from a list of 20 foods. Afterwards you can swap the language, and the other players can practice with the English words.

Languages:

English
Español
Français
Deutsch
Italiano
Português
Arabic
中文繁體
中文简体
日本語
한국어

Can choose from many categories

It’s called Word Head, and for now it’s only on iOS: https://apps.apple.com/app/id6753888185

I’m also looking for feedback:

I’m considering removing the hard difficulty level and replacing it with Easy / Medium + 10 harder words.
I’m also considering adding a hint function that shows the hint-givers, in their target language, how they could describe the word. For example, if the word is “hot”, the suggested hint might be “not cold”.

Using whichlang.cr for language detection

2025-04-13T22:16:00-04:00

whichlang.cr provides Crystal bindings for Rust’s whichlang language detection library. It detects the language of a given text.

Supported languages

Arabic (Ara)
Chinese (Cmn)
German (Deu)
English (Eng)
French (Fra)
Hindi (Hin)
Italian (Ita)
Japanese (Jpn)
Korean (Kor)
Dutch (Nld)
Portuguese (Por)
Russian (Rus)
Spanish (Spa)
Swedish (Swe)
Turkish (Tur)
Vietnamese (Vie)
Unknown (for undetected text)

Requirements

Linux or Mac
Rust 1.56 or later

Installation

Add this to your shard.yml:

dependencies:  
  whichlang:  
    github: bendangelo/whichlang.cr  

Then run:

shards install

Usage

require "whichlang"  

input = "this is the best day ever"  
Whichlang.detect(input) # => Whichlang::Lang::Eng  

# Returns nil for empty or undetectable text  
input = " "  
Whichlang.detect?(input) # => nil  

That’s it. You now have language detection in your Crystal app.

Using marisa.cr for Efficient String Storage in Crystal

2025-04-11T23:27:00-04:00

The marisa.cr Crystal shard gives you access to the powerful Marisa Trie data structure, perfect for storing and searching strings efficiently. Let’s look at how to use it.

First, install the shard by adding it to your shard.yml:

dependencies:
  marisa:
    git: https://codeberg.org/bendangelo/marisa.cr.git

Basic Usage

Create a trie and add some strings:

require "marisa"

trie = Marisa::Trie.new
trie.add("snow")
trie.add("snow cone")
trie << "ice cream" # same as add

You can search for strings:

trie.add("ice")
trie.search("ice").keys
# => ["ice", "ice cream"]

Check if a string exists:

trie.include?("snow") # => true

Working with Weights

Add strings with weights (useful for prioritization):

trie.add("ice", 1_f32)
trie.get_weight("ice") # => 1.0e-45_f32

Bulk Operations

Add multiple strings at once:

trie.add_many(["icicle", "snowball"])

Iterate through all keys:

trie.each do |key|
  puts key
end

Saving and Loading

Save your trie to disk:

trie.save("winter.trie")

Load it later:

trie = Marisa::Trie.new
trie.load("winter.trie")

Specialized Tries

For binary data:

bytes_trie = Marisa::BytesTrie.new("one" => "1", "two" => "2")
bytes_trie["one"] # => "1"

For integer values:

int_trie = Marisa::IntTrie.new("one" => 1, "two" => 2)
int_trie["one"] # => 1
int_trie.sum("one") # => 4 (sums all matching entries)

Advanced Options

Customize your trie:

trie = Marisa::Trie.new(
  ["test"],
  [1.0_f32],
  binary: true,
  num_tries: 10,
  cache_size: :large,
  order: :weight
)

The marisa.cr shard is a great choice when you need compact, efficient string storage with fast lookup capabilities. Give it a try for your next autocomplete or search feature!

Why Can’t I Find My App On The Google Play Store?

2025-04-01T11:11:00-04:00

So, after 6 months of hard work building my first Flutter mobile app, I released it on the Google Play store (and the Apple store) and it ain’t showing up in search. So what gives?

Searching for my app but it’s nowhere to be found

The app is called English Fox (it also has a Chinese name 英狐狸), it’s an English-Chinese dictionary to help people learn English. You can find the direct link here and see for yourself. Searching by its full name still gives no results. I can accept it if searching for “English Dictionary” does not display it in any of the top spots (after all I just released it), but I would expect it to appear when entering the app’s name.

This burns because I paid money to be on the Google Play store (paid the sign up fee) so I would expect more visibility. I’ve had friends and others tell me they couldn’t find the app on the store, so this obviously is hurting my marketing efforts quite a lot.

Possible explanations

Let’s dig into some reasons according to my searches:

If you are a developer and have just released your app, it can happen because Google Play needs some time to index your app to add it to search results. Try to search your app later (in 4-5 hours). E.g., for App Store it takes around 24 hours.

So it’s been about two weeks or more on my end. I’ve double checked the keywords, pushed updates, people have been downloading it. What gives?

More installs may help speed up indexation (including indexing for your own title), so running a few paid campaigns around launch. Having good user retention for queries (again including your own title) will help to maintain those rankings.

People have been downloading and using the app. To be honest not a lot of people, but isn’t this a chicken-and-egg problem? How can you prioritize indexation based on how many people are using the app if no one can find it? I would expect the app to show up somehow. Or is this an attempt to get more money by pushing devs to paid advertising?

You may want to include your title in additional areas to build stronger relevance overall. A few additional areas to consider: App ID, Short Description, and Long Description.

This is a good tip but doesn’t explain why searching the exact title of my app still doesn’t return it as a first result.

Some more info here on Stack Overflow.

The good old days

I remember launching Flash games around 2007 on the web and getting hundreds of thousands of plays within the first week. All you had to do was get on the front page of a popular website and you were golden. If you didn’t get so lucky, you could still make it by posting it around various websites and it would slowly build visits (of course you’d send it around a bit). People playing the game would then find your website (because you added links inside the Flash game), and then you could get their emails and build a community. It was so easy back then.

Now you need to be approved by these app stores and they determine how much visibility your app gets, and then the person needs to install your app. Also let’s be honest, it was less friction to play a Flash game back then: all it took was one click on a link and you were playing the game. Nowadays you need to visit the app store, search for it, download the app, wait, wait, then load it up etc. It’s not as streamlined as before.

Anyway if anyone knows what’s going on please let me know.

Ruby Gems For Pulling Dictionary Words

2025-01-09T18:10:00-05:00

For the past few months I’ve been working heavily on an English Dictionary database, to help people learn English. It’s called English Fox, more on this next time.

Generate Definitions via LLM

Anyway I needed a way to pull definitions for a lot of words. I tried using an LLM to generate definitions but it missed some parts of speech or gave bad results. Totally unreliable, unless you use a massive model (70B+).

For example, take this prompt:

List all the words of speech for the word "back",
include the part of speech and a short definition.
In yaml format.

From ChatGPT:

back:
  - part_of_speech: noun
    definition: The rear part of the body or something that is opposite the front.
  - part_of_speech: adjective
    definition: Related to the rear or past.
  - part_of_speech: adverb
    definition: Toward the rear or in the past.
  - part_of_speech: verb
    definition: To support or to move backwards.
  - part_of_speech: preposition
    definition: Behind or at the rear of something.

Hey, this looks pretty good. Why not run this for all 200,000 words?

A few issues:

That’s expensive
It will 100% miss some definitions

Let’s take the definitions from Free Dictionary API.

It looks pretty good, but…

...
{
    "definition": "A support or resource in reserve.",
    "synonyms": [
      
    ],
    "antonyms": [
      
    ]
},

Yeah it missed this important meaning. And about 10 more.
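To spot gaps like this across many words, it helps to list which parts of speech and definitions a response actually contains. Here is a minimal Ruby sketch, assuming the response is an array of entries, each with a "meanings" list carrying "partOfSpeech" and "definitions" keys. The sample JSON is hand-written for illustration (not a real API response) and the helper name definitions_by_pos is hypothetical.

```ruby
require "json"

# Hand-written sample in the assumed response shape (not real API output).
SAMPLE = <<~JSON
  [
    {
      "word": "back",
      "meanings": [
        {
          "partOfSpeech": "noun",
          "definitions": [
            { "definition": "The rear of the body.", "synonyms": [], "antonyms": [] },
            { "definition": "A support or resource in reserve.", "synonyms": [], "antonyms": [] }
          ]
        },
        {
          "partOfSpeech": "verb",
          "definitions": [
            { "definition": "To go in the reverse direction.", "synonyms": [], "antonyms": [] }
          ]
        }
      ]
    }
  ]
JSON

# Hypothetical helper: groups definition strings by part of speech.
def definitions_by_pos(json)
  result = Hash.new { |hash, key| hash[key] = [] }
  JSON.parse(json).each do |entry|
    entry.fetch("meanings", []).each do |meaning|
      pos = meaning["partOfSpeech"]
      meaning.fetch("definitions", []).each do |definition|
        result[pos] << definition["definition"]
      end
    end
  end
  result
end

grouped = definitions_by_pos(SAMPLE)
grouped.each { |pos, defs| puts "#{pos}: #{defs.size} definition(s)" }

# Parts of speech the ChatGPT answer listed that this sample lacks.
llm_parts = %w[noun adjective adverb verb preposition]
missing = llm_parts - grouped.keys
puts "missing from sample: #{missing.join(', ')}"
```

Comparing the two lists per word makes it easy to see where one source has a sense the other lacks.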

Ruby Gems

So I arrived at the expected solution. Get a large word list and slowly scrape the words and sort the data from there. I ended up using this list and a few others. GitHub has a few lists, this one by GNU works as well.
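Combining several word lists mostly comes down to normalizing and de-duplicating. A minimal Ruby sketch, assuming each list has already been read into an array of strings. The helper merge_word_lists is hypothetical, and lowercasing everything is a simplification that would flatten proper nouns.

```ruby
# Hypothetical helper: merges any number of word lists into one sorted,
# de-duplicated list. Lowercasing is a simplification (it flattens proper nouns).
def merge_word_lists(*lists)
  lists.flatten
       .map { |word| word.strip.downcase }
       .reject(&:empty?)
       .uniq
       .sort
end

# Made-up sample lists with duplicates, stray whitespace and a blank entry.
list_a = ["back", "Snow", "ice "]
list_b = ["snow", "", "cone", "back"]

puts merge_word_lists(list_a, list_b).inspect
```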

The next task is to pull the definitions. Each word has many definitions, and each definition has a part of speech, meaning and other important data. Most of this can be scraped but some of it is generated by an LLM – for example the synonyms.

Here is the staging model for definitions to help you visualize:

  create_table "staged_definitions", force: :cascade do |t|
    t.bigint "staged_word_id"
    t.string "part_of_speech", default: "0", null: false
    t.string "meaning", null: false
    t.integer "source", default: 0, null: false
    t.string "group"
    t.string "synonyms", default: [], array: true
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
    t.string "antonyms", default: [], array: true
    t.integer "position"
    t.integer "state", default: 0, null: false
    t.string "full_form"
    t.index ["staged_word_id"], name: "index_staged_definitions_on_staged_word_id"
    t.index ["state"], name: "index_staged_definitions_on_state"
  end

I ended up creating some Ruby gems to share with the community for pulling word definitions from free APIs.

Note: One of them is Cambridge, which isn’t free and I don’t use anymore. I wouldn’t recommend scraping from them, use at your own risk.

In the end I realized almost all the original definitions came from Wiktionary, so I ended up only using that. All the APIs give slightly different data, so pick the one that best suits the project.

Enjoy. More on English Fox in my next post.

Simple Explanation of the Pinyin Sounds

2024-09-06T21:52:00-04:00

After learning Mandarin Chinese for over 10 years, I feel I should write a guide on how to pronounce Mandarin and so give something back to the language learning community. This guide goes over all the pronunciations for Taiwanese Mandarin specifically, but it’s about 90% the same as in mainland China. This is everything I wish I knew when I started learning.

Tones

Let’s start with the hardest part – but don’t think it’s impossible to learn, it will just take more time. You’ve probably heard that Mandarin has four tones. See the picture below:

There is also a 5th neutral tone

Each character in Chinese has a tone, and sometimes multiple tones, but this isn’t the right way to think about it. When speaking Chinese, your meaning involves two components: the pronunciation and the tone. These two things together create a word and therefore meaning. This is the most important thing to remember.

This is completely different from English, where your pronunciation is the meaning (usually) and your tone gives feeling.

For example, if I say: today was a great day or today was a greaaaaat day. The meaning is the same but the tone is different, which emphasizes my feeling. Chinese does not work this way. Tone and pronunciation are intrinsic to a word: they always go together and can’t be changed.

English also has tones. Another example: All aboard! or all aboard? The question mark changes the meaning of the sentence, and when reading the question out loud you’ll do an upward inflection – this is a tone. This is natural and you don’t even think about it. Chinese is a lot like this, but there are four tones and each word has its own special tone.

So when learning Chinese, think in terms of tone and pronunciation together. The best way to do this is by hearing and repeating, or else you’ll never pronounce anything right.

I’ll also take this opportunity to explain a consequence of tones being meaning: Mandarin speakers do not convey their feelings through tones like we do. This is why Mandarin speakers speaking English often have a kind of deadpan tone in their voice; it’s not natural for them to convey feeling with tones. Instead, to convey feeling in Chinese you would use a particle, e.g. 吧，啊，耶，啦.

Consequently, this means there is no exaggeration and sarcasm in the language because the tone is part of your meaning, not your feeling. Yes, you can still do these things with your friends etc but in Chinese people always speak accurately. Exaggeration and sarcasm both rely 90% on tone so they generally don’t work unless very obvious.

A commonly confused word is 為. If you look it up you’ll get multiple definitions, because this word has two meanings depending on the tone. Remember the tone is the meaning, so when you hear it or read it you need to know the tone being used. If you’re ever confused, just ask someone to pronounce it for you. A big issue with Chinese is that you can’t tell how to pronounce a word, or which tone it has, just by looking at it.

So 為 has two meanings, wèi (fourth-tone) and wéi (second-tone). Remember the tone is part of the meaning. So wèi means for and wéi means to be. This is just one example but they are everywhere, for example: 更 gèng (more) or 更 gēng (to change or replace). This explains why 更新 means to renovate and not more new. This is also why 更衣室 means change clothes room, not more clothes room.
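If you keep your own vocabulary notes in code, one way to respect “tone is part of the meaning” is to key each meaning by character and toned pinyin together, never by the character alone. A minimal Ruby sketch, using only the readings from the examples above; the READINGS table and the meaning_of helper are hypothetical names.

```ruby
# Readings taken from the examples above; a real dictionary would hold many more.
READINGS = {
  "為" => { "wèi" => "for", "wéi" => "to be" },
  "更" => { "gèng" => "more", "gēng" => "to change or replace" }
}.freeze

# Hypothetical helper: looks a meaning up by character AND toned pinyin.
# Returns nil when the character or that reading is unknown.
def meaning_of(character, pinyin)
  READINGS.fetch(character, {})[pinyin]
end

puts meaning_of("為", "wèi")
puts meaning_of("更", "gēng")
```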

One more example for people in Taiwan. A common way to save your receipt is to 載具, the dictionary will write this as zài jù but it’s not that zài (to carry), it’s actually the third-tone zǎi (to record).

Note: This is what makes Chinese hard to learn online, because a lot of flashcards or pinyin have the wrong tone, causing confusion. Keep this in mind.

Bottom line: remember the tone and pronunciation always go hand in hand.

First Tone (Deadpan Tone)

This is the deadpan tone, act like you have no emotion and go higher in volume. But you’re not singing, I see a lot of people try to sing this. It’s just a normal voice without any variation in tone. Like a zombie.

Just remember: no emotion.

Second Tone (Question Tone)

This is the question tone. It can feel strange because you feel like you’re asking a question but you’re not. Start at a normal tone and then quickly go up. Don’t go too slowly.

Just remember: you’re asking a question.

Third Tone (Annoyed Tone)

This one is probably the trickiest because it’s hard to hear. But this is the tone a teenager would make when they are annoyed. Imagine your mom is calling you and you’ll say What?! or YeaaaHHH.

For this you start as low as you can until you reach your limit, and because you’re so low you kinda naturally go up in tone. As you pronounce just keep letting the tone go up.

Just remember: start low, end high.

Fourth Tone (Angry Tone)

This one seems easy but people will kinda hold back because they feel the tone is too angry and that’s not good. Remember tones are meaning not feeling. Pronounce this like you’re angry and go straight down.

Just remember: you’re angry.

Fifth Tone (Quick Tone)

This is a character that is pronounced quicker than the others and doesn’t have a distinct tone. This is typically used for common daily words and is always the last character in the word, ie 媽媽 māma. I should note in Taiwan this is rarely used, it’s only seen for family names.

Just remember: quick pronunciation.

Tone Changing

Ok, sometimes because tones sound weird or unnatural the tone gets changed. This is kinda like how in English you won’t say “a elephant” or “can’t”. “An elephant” is said because two vowels together don’t sound good, and “cannot” is used because “can’t” is hard to hear clearly. Languages are built on listening first and then written down after the fact, which is why these things happen.

一 (yī)

Normally this is the first tone (and always is when counting, ie 第一個), but if it’s paired with anything else it will swap to a second tone (question tone) or a fourth tone (angry tone). This is because it sounds nicer. Here is an article that goes into more depth.

The key take away here is don’t say yīyàng, instead say yíyàng.

不 (bù)

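Same idea as above: normally this is the fourth tone, but before another fourth tone it swaps to a second tone (question tone).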

The key take away here is don’t say bùyào, instead say búyào.

Double Third Tone (Annoyed Tone)

If two third tones appear beside each other, the first third tone becomes a second tone (question tone). This is because it’s just easier to pronounce.

Example from the link above: 很 (hěn) 好 (hǎo): (hén hǎo) means very good.
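The three tone changes above can be written down as a small rule table. Here is a minimal Ruby sketch, assuming each syllable comes with its dictionary tone as a number (1 to 4, 5 for neutral). The Syllable struct and the spoken_tones helper are hypothetical names. The 一 rule is spelled out as the standard one (second tone before a fourth tone, fourth tone before tones 1 to 3), the 第 check is a crude stand-in for “when counting”, and only pairs of third tones are handled, not longer chains.

```ruby
# Each syllable: the character, its pinyin without tone mark, dictionary tone.
Syllable = Struct.new(:char, :pinyin, :tone)

# Returns the tones as actually spoken, one number per syllable.
def spoken_tones(syllables)
  syllables.each_with_index.map do |syl, i|
    following = syllables[i + 1]
    previous = i.zero? ? nil : syllables[i - 1]

    if following.nil?
      syl.tone # the last syllable keeps its tone
    elsif syl.char == "一" && previous&.char != "第"
      case following.tone
      when 4 then 2       # 一 before a fourth tone -> second tone
      when 1, 2, 3 then 4 # 一 before tones 1-3 -> fourth tone
      else syl.tone
      end
    elsif syl.char == "不" && following.tone == 4
      2                   # 不 before a fourth tone -> second tone
    elsif syl.tone == 3 && following.tone == 3
      2                   # third + third -> second + third
    else
      syl.tone
    end
  end
end

words = {
  "一樣" => [Syllable.new("一", "yi", 1), Syllable.new("樣", "yang", 4)],
  "不要" => [Syllable.new("不", "bu", 4), Syllable.new("要", "yao", 4)],
  "很好" => [Syllable.new("很", "hen", 3), Syllable.new("好", "hao", 3)]
}

words.each do |word, syllables|
  puts "#{word}: #{spoken_tones(syllables).join(' ')}"
end
```

The table matches the examples: yíyàng, búyào and hén hǎo all get a second tone on the first syllable.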

Practicing

For practicing I recommend getting a mic that lets you record as you speak, and trying to mimic other people. This is called shadowing.

That’s all for tones! Everything after is much easier.

Pronunciation

Let’s start making words.

All the pronunciations in Chinese

To make a word in Chinese you will take one consonant from the left and one vowel (or more) from the right. It’s not as bad as it looks.

If you want to hear the sounds I recommend this site.

Emphasize Each Character

Chinese speakers think and read one character at a time. It’s not like English where you can kinda mumble whole words together. People need to hear your tone, you’re obviously learning, so speak slowly and clearly. I’ve seen plenty of native speakers speak really slowly and there is nothing wrong with that.

Like we learned before, tones are meaning – they can’t give feeling with tone, so people speaking Chinese will typically emphasize by speaking characters slower. This really drills home their feeling. It’s just a different way of communicating.

Just remember: spend like 200ms on each character and emphasize the tone.

Initials

The top three rows of initials (b, p, m, f, d, t, n, l, g, k, h) are similar enough to English. I’ll dive into the more difficult ones.

J / Q / X

For these put your tongue under your bottom teeth. J is like saying jeez but a lot stronger. Q is similar but it’s like chee. X is like shee.

Just remember: Tongue below bottom teeth and have a strong ee sound.

Zh / Ch / Sh / R

Tongue position is on the roof of your mouth.

Zh actually has a d sound in front, it’s more like dzher.

Ch is like cher.

Sh is like Sure.

R is the ending of sure, so it sounds like re.

Z / C / S

Z sounds like dz.

C sounds like ts (like in cats).

S sounds like (si).

Endings

A sounds like f-a-ther.

O sounds like owa. But heavier oh sound.

E sounds like u in put plus u in duh.

Second E sounds like a rapper, yeeee.

AI sounds like aye aye captain but heavier.

EI sounds like ehhh but heavier.

AO sounds like when you hit your knee. Aaaooouuu.

OU sounds like ooooh in English but heavier.

AN / EN / ANG / ENG

Most people can’t hear the difference between these but there is one. The an / en sounds are at the front of the mouth, think “gun”. The ng is used to show the sound comes from the back of your throat, think “lung”. Say these words a few times and you’ll see the sound is made differently.

Core Endings

These endings are used in many words. The way pinyin is written doesn’t show the actual pronunciation. When you see a compound word like “chuang”, it is actually “ch-wu-ang”. Learn the core pronunciations and then just combine them together. Another is “qiong”, which is actually “q-yi-yong”. Zuan is actually “dz-wu-an”. Quan is actually “qv-en”: ignore the an, it’s actually en. Tian is actually “t-yi-en”.

The only exception is the -ong ending, like zhong is actually “dzh-own”, it’s not like “yong”.
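The “learn the core sounds, then combine them” idea can be sketched as splitting a written syllable into initial, middle glide and ending. Below is a minimal Ruby sketch, assuming toneless pinyin input. The helper split_pinyin is a hypothetical name, and it only separates the written pieces: it does not produce the pronunciation respellings used above (wu, yi, en), and syllables that start with y or w are not handled.

```ruby
# Two-letter initials come first so "zh" wins over "z".
INITIALS = %w[zh ch sh b p m f d t n l g k h j q x r z c s].freeze

# Hypothetical helper: splits a toneless pinyin syllable into
# [initial, medial, rest]. The medial is an i or u glide that sits
# before another vowel (a, e or o); otherwise it is an empty string.
def split_pinyin(syllable)
  initial = INITIALS.find { |candidate| syllable.start_with?(candidate) } || ""
  final = syllable.delete_prefix(initial)

  medial = ""
  if final.length > 1 && %w[i u].include?(final[0]) && "aeo".include?(final[1])
    medial = final[0]
    final = final[1..-1]
  end

  # After j, q and x the written "u" stands for ü (the "v" sound).
  medial = "ü" if medial == "u" && %w[j q x].include?(initial)

  [initial, medial, final]
end

%w[chuang qiong zuan quan tian zhong].each do |syllable|
  puts "#{syllable}: #{split_pinyin(syllable).inspect}"
end
```

Once a syllable is split this way, each piece can be practiced on its own and then said together quickly.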

Yi

Flat tongue, sounds like ee. Whenever you see an i or yi, always think “ee”. For example: Qiang is actually qee-ang, Bi is bee, yin is yeen (like in teen).

U / WU

This is always pronounced as “oo”, like boo or when you see someone get hurt, you’ll say “ooooo”. It’s really deep and long. Whenever you see a wu or u it’s always pronounced as oo. For example: zhuan is dzh-oo-an, weng is oo-eng, zu is dz-oo.

Yu / Yv

This sound doesn’t exist in English. Start with a flat tongue saying eee, then round your lips. You’ll make the “v” sound, which is written as a u with two dots on top (ü).

Yuan

This is probably the hardest sound. Ignore the “an”, it’s always “en”. It’s “yv-en”, start with yv and then quickly end it with en. You should see a transition happen.

Yun

This is a little tricky but start with “yv” and then end it with an “een” sound like “teen”.

Taiwanese Pronunciation

Taiwanese people don’t curl the tongue, so shi and si sound the same. Also yin / ying sound the same.