Task creation fails with invalid UTF-8 from Truncate splitting multi-byte runes #22375

@blinkagent

Bug

Creating a task fails with the following error when the task prompt contains multi-byte UTF-8 characters (e.g. an em dash (U+2014) or curly quotes):

Internal error creating task.
pq: invalid byte sequence for encoding "UTF8": 0x97

Root Cause

Go's json.Decoder never yields invalid UTF-8 in a decoded string (invalid bytes are replaced with U+FFFD), so the prompt from the HTTP request is always valid on arrival. The issue is in strutil.Truncate (coderd/util/strings/strings.go), which operates on byte indices rather than runes:

if len(s) <= n {  // len(s) is byte length, not rune count
    return s
}
...
_, _ = sb.WriteString(s[:maxLen])  // byte slice, not rune-safe

When a multi-byte UTF-8 character (e.g. em dash = 3 bytes E2 80 94) straddles the truncation boundary, it gets split, producing an invalid byte sequence that PostgreSQL rejects on insert.

This is called from generateFromPrompt() in coderd/taskname/taskname.go:

  • Task name generation: strutil.Truncate(prompt, 27, strutil.TruncateWithFullWords)
  • Display name generation: strutil.Truncate(prompt, 64, strutil.TruncateWithFullWords, strutil.TruncateWithEllipsis)

There's also a similar byte-slicing issue in generateFallback(): name[:min(len(name), 27)]

Additionally, Truncate calls strings.LastIndexFunc(s[:maxLen], unicode.IsSpace), which byte-slices the string before searching for a word boundary.

Expected Behavior

Truncation should be rune-aware and never produce invalid UTF-8.

Suggested Fix

Make Truncate operate on rune boundaries instead of byte boundaries. For example, use utf8.RuneCountInString for length checks and iterate runes instead of byte-slicing. Also fix generateFallback to avoid raw byte slicing.

Existing tests only use ASCII strings, so adding multi-byte character test cases would prevent regressions.


Created on behalf of @bjornrobertsson
