
Optimizing Index Computations in SWG for Improved Timing #1509

Open
preusser wants to merge 6 commits into dev from feature/swg_opti
@preusser
Contributor

No description provided.

@preusser preusser requested a review from STFleming January 20, 2026 13:43
Collaborator

@STFleming STFleming left a comment

Looks good to me and tests are passing on my end.

@auphelia auphelia requested a review from fpjentzsch January 21, 2026 11:02
@auphelia auphelia marked this pull request as ready for review January 21, 2026 11:02
Collaborator

@fpjentzsch fpjentzsch left a comment

LGTM, all tests passing on our end.

I did a small spot check on 16 exemplary SWG configurations and the average Fmax actually decreased from 194 MHz to 192 MHz. Maybe I didn't cover any cases where these changes would have a positive impact on timing.

Were there any specific edge cases that caused you trouble before?

@fpjentzsch
Collaborator

@preusser Just out of curiosity I did another experiment with 224 different SWG configurations (using OOC synth with 200 MHz target clock):

@pytest.mark.parametrize("idt", [DataType["INT2"], DataType["UINT4"]])
@pytest.mark.parametrize("k", [[3, 3], [5, 5]])
@pytest.mark.parametrize("ifm_dim", [[16, 16]])
@pytest.mark.parametrize("ifm_ch", [8, 32])
@pytest.mark.parametrize("stride", [[1, 1], [2, 2]])
@pytest.mark.parametrize("dilation", [[1, 1], [2, 2]])
@pytest.mark.parametrize("simd", [1, 4, 8, 32])
@pytest.mark.parametrize("dw", [0, 1])
@pytest.mark.parametrize("parallel_window", [0])
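A note on the count: the full Cartesian product of the parameter lists above has 256 entries, not 224. One reading that matches the stated number is that combinations where `simd` does not divide `ifm_ch` are skipped. That filter is an assumption and is not stated in this comment. A minimal sketch to check the arithmetic, with datatypes written as plain strings rather than `DataType` objects:

```python
from itertools import product

# Parameter lists copied from the parametrize decorators above.
grid = {
    "idt": ["INT2", "UINT4"],
    "k": [(3, 3), (5, 5)],
    "ifm_dim": [(16, 16)],
    "ifm_ch": [8, 32],
    "stride": [(1, 1), (2, 2)],
    "dilation": [(1, 1), (2, 2)],
    "simd": [1, 4, 8, 32],
    "dw": [0, 1],
    "parallel_window": [0],
}

# Full Cartesian product of all parameter values.
all_configs = [dict(zip(grid, values)) for values in product(*grid.values())]

# ASSUMPTION (not stated in the comment): configurations where simd does
# not evenly divide ifm_ch are skipped.
valid_configs = [c for c in all_configs if c["ifm_ch"] % c["simd"] == 0]

print(len(all_configs), len(valid_configs))  # prints: 256 224
```

Under that assumption, the 32 dropped configurations are exactly those with `ifm_ch=8` and `simd=32`.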

[Image: fmax_comparison]

Respective configurations with min/max Fmax:

[Before MIN] fmax=141.18 MHz
idt=UINT4, k=[5, 5], ifm_dim=[16, 16], ifm_ch=32, stride=[1, 1], dilation=[2, 2], simd=1, dw=0, parallel_window=0,
[Before MAX] fmax=196.89 MHz
idt=UINT4, k=[3, 3], ifm_dim=[16, 16], ifm_ch=8, stride=[1, 1], dilation=[1, 1], simd=8, dw=0, parallel_window=0,
[After MIN] fmax=139.14 MHz
idt=UINT4, k=[5, 5], ifm_dim=[16, 16], ifm_ch=32, stride=[2, 2], dilation=[2, 2], simd=1, dw=1, parallel_window=0,
[After MAX] fmax=198.85 MHz
idt=INT2, k=[3, 3], ifm_dim=[16, 16], ifm_ch=8, stride=[1, 1], dilation=[1, 1], simd=8, dw=1, parallel_window=0,
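
For a quick read of the extremes listed above, the shift in minimum and maximum Fmax can be computed directly. This is only a sketch over the four reported numbers. Note that the before/after minimum and maximum come from different configurations, so these are not per-configuration deltas.

```python
# Values copied from the min/max listing above (MHz).
before = {"min": 141.18, "max": 196.89}
after = {"min": 139.14, "max": 198.85}

# Change in the extreme values; rounded to the precision of the inputs.
delta = {key: round(after[key] - before[key], 2) for key in before}

print(delta)  # prints: {'min': -2.04, 'max': 1.96}
```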
