kernel: added attention bench for profiling before optimization by guocuimi · Pull Request #360 · vectorch-ai/ScaleLLM

guocuimi · 2025-01-04T01:26:12Z


## attention_bench_sm80

### [0] NVIDIA GeForce RTX 4090

| batch_size | q_len | kv_len | n_heads | n_kv_heads | head_dim | HBWPeak | LoadEff | StoreEff | L1HitRate | L2HitRate | Samples | Samples |  CPU Time  |  Noise   | GPU Time  | Noise  | Samples | Batch GPU |
|------------|-------|--------|---------|------------|----------|---------|---------|----------|-----------|-----------|---------|---------|------------|----------|-----------|--------|---------|-----------|
|          1 |    64 |     64 |       2 |          2 |       64 |   7.35% |   0.00% |  100.00% |     0.00% |    76.67% |      2x |  54653x | 107.071 us | 1856.94% |  7.545 us | 10.23% |  95639x |  5.228 us |
|          1 |    64 |    128 |       2 |          2 |       64 |  11.91% |   0.00% |  100.00% |     0.00% |    68.22% |      2x |  40768x | 111.090 us | 1197.68% | 12.265 us |  7.78% |  49399x | 10.122 us |
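As a rough way to read the GPU Time column, the two rows above can be turned into an achieved-throughput estimate. This is only a sketch: the FLOP count assumes plain dense attention (two matmuls, `Q @ K^T` and `P @ V`, with no causal-mask savings and softmax ignored), which the PR does not state, and the helper names `attention_flops` and `achieved_tflops` are hypothetical.

```python
# Sketch: estimate achieved TFLOP/s from the benchmark rows above.
# Assumption (not from the PR): dense attention costs
#   4 * batch * n_heads * q_len * kv_len * head_dim FLOPs
# i.e. two matmuls at 2 * q_len * kv_len * head_dim FLOPs per head each.
# n_kv_heads does not enter the FLOP count; it only affects K/V memory traffic.


def attention_flops(batch_size, q_len, kv_len, n_heads, head_dim):
    """FLOPs for Q @ K^T plus P @ V, ignoring softmax and masking."""
    return 4 * batch_size * n_heads * q_len * kv_len * head_dim


def achieved_tflops(flops, gpu_time_us):
    """Convert a FLOP count and a GPU time in microseconds to TFLOP/s."""
    return flops / (gpu_time_us * 1e-6) / 1e12


# (batch_size, q_len, kv_len, n_heads, head_dim, GPU time in us), copied from the table.
ROWS = [
    (1, 64, 64, 2, 64, 7.545),
    (1, 64, 128, 2, 64, 12.265),
]

if __name__ == "__main__":
    for b, q, kv, h, d, t_us in ROWS:
        flops = attention_flops(b, q, kv, h, d)
        print(f"kv_len={kv}: {flops} FLOPs, {achieved_tflops(flops, t_us):.3f} TFLOP/s")
```

With shapes this small, the estimate mostly shows that the kernel is far from compute-bound, so it is better used to compare runs against each other than as an absolute efficiency figure.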

guocuimi added 3 commits January 3, 2025 17:26

kernel: added attention bench for profiling before optimization

c7e0fbc

update

b677997

update type

3678905

guocuimi merged commit b78389c into main Jan 4, 2025

guocuimi merged 3 commits into main from attn_bench
