This repository was archived by the owner on Feb 25, 2025. It is now read-only.
[Impeller] improve morphology performance #37918
Closed
jonahwilliams wants to merge 1 commit into flutter:main
jonahwilliams
commented
Nov 27, 2022
}

/// Flip coordinates if `y_coord_scale` < 0.0.
vec2 IPRemapCoords(vec2 coords, float y_coord_scale) {
Contributor
Author
Not sure if this is actually that useful. It decreases arithmetic unit usage but increases varying or load/store unit usage.
Contributor
Author
This has a larger effect on shaders like Gaussian blur that sample N times per fragment.
Contributor
Author
We're going to hold off on this until we have better support for shader variants in the engine code.
Improvements to morphology (dilate/erode) performance based on Arm Mali guidelines. This doesn't noticeably improve performance on iOS, but on a Pixel 6 it improves full-screen filter performance from 40-50 FPS to ~70 FPS.
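For readers unfamiliar with the filter, the following is a minimal CPU reference sketch of one 1D morphology pass, written in C++ rather than as a shader. It is not the code changed by this PR. `MorphType` and `MorphologyPass` are hypothetical names, and clamping the window at the edges is an assumption; in the shader, edge behavior would depend on how the texture is sampled.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical enum for illustration.
enum class MorphType { kDilate, kErode };

// One 1D morphology pass: each output element is the maximum (dilate) or
// minimum (erode) of the input over the window [i - radius, i + radius].
// The window is clamped to the valid index range (an assumption).
std::vector<float> MorphologyPass(const std::vector<float>& input, int radius,
                                  MorphType type) {
  std::vector<float> output(input.size());
  const int n = static_cast<int>(input.size());
  for (int i = 0; i < n; ++i) {
    const int lo = std::max(0, i - radius);
    const int hi = std::min(n - 1, i + radius);
    float result = input[lo];
    for (int j = lo + 1; j <= hi; ++j) {
      result = (type == MorphType::kDilate) ? std::max(result, input[j])
                                            : std::min(result, input[j]);
    }
    output[i] = result;
  }
  return output;
}
```

Each output element reads `2 * radius + 1` inputs, so any per-sample work inside the inner loop is multiplied by the window size.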
Summary of Changes
malioc results:
**[/Users/jonahwilliams/engine/src/out/android_debug_arm64/gen/flutter/impeller/entity/gles/morphology_filter.frag.gles]**

  [Mali-T880]
  Main shader
  ===========
  Work registers: 3 (75% used at 100% occupancy)
  Uniform registers: 1 (4% used)
  Stack spilling: false
                                 A     LS      T  Bound
- Total instruction cycles:   6.33   1.00   1.00  A
+ Total instruction cycles:   3.33   1.00   1.00  A
- Shortest path cycles:       1.65   1.00   0.00  A
+ Shortest path cycles:       1.00   1.00   0.00  A, LS
  Longest path cycles:         N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, T = Texture
  Shader properties
  =================
  Has uniform computation: false

  [Mali-T860]
  Main shader
  ===========
  Work registers: 3 (75% used at 100% occupancy)
  Uniform registers: 1 (4% used)
  Stack spilling: false
                                 A     LS      T  Bound
- Total instruction cycles:   9.50   1.00   1.00  A
+ Total instruction cycles:   5.00   1.00   1.00  A
- Shortest path cycles:       2.50   1.00   0.00  A
+ Shortest path cycles:       1.50   1.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, T = Texture
  Shader properties
  =================
  Has uniform computation: false

  [Mali-T830]
  Main shader
  ===========
- Work registers: 4 (100% used at 100% occupancy)
+ Work registers: 3 (75% used at 100% occupancy)
- Uniform registers: 1 (5% used)
+ Uniform registers: 1 (4% used)
  Stack spilling: false
                                 A     LS      T  Bound
- Total instruction cycles:   9.00   1.00   1.00  A
+ Total instruction cycles:   5.00   1.00   1.00  A
- Shortest path cycles:       1.62   1.00   0.00  A
+ Shortest path cycles:       1.25   1.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, T = Texture
  Shader properties
  =================
  Has uniform computation: false

  [Mali-T820]
  Main shader
  ===========
- Work registers: 4 (100% used at 100% occupancy)
+ Work registers: 3 (75% used at 100% occupancy)
- Uniform registers: 1 (5% used)
+ Uniform registers: 1 (4% used)
  Stack spilling: false
                                 A     LS      T  Bound
- Total instruction cycles:  18.00   1.00   1.00  A
+ Total instruction cycles:  10.00   1.00   1.00  A
- Shortest path cycles:       3.25   1.00   0.00  A
+ Shortest path cycles:       2.50   1.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, T = Texture
  Shader properties
  =================
  Has uniform computation: false

  [Mali-T760]
  Main shader
  ===========
  Work registers: 3 (75% used at 100% occupancy)
  Uniform registers: 1 (4% used)
  Stack spilling: false
                                 A     LS      T  Bound
- Total instruction cycles:   9.50   1.00   1.00  A
+ Total instruction cycles:   5.00   1.00   1.00  A
- Shortest path cycles:       2.50   1.00   0.00  A
+ Shortest path cycles:       1.50   1.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, T = Texture
  Shader properties
  =================
  Has uniform computation: false

  [Mali-T720]
  Main shader
  ===========
- Work registers: 4 (100% used at 100% occupancy)
+ Work registers: 3 (75% used at 100% occupancy)
- Uniform registers: 1 (5% used)
+ Uniform registers: 1 (4% used)
  Stack spilling: false
                                 A     LS      T  Bound
- Total instruction cycles:  18.00   1.00   1.00  A
+ Total instruction cycles:  10.00   1.00   1.00  A
- Shortest path cycles:       3.25   1.00   0.00  A
+ Shortest path cycles:       2.50   1.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, T = Texture

  [Mali-G78AE]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.45   0.00   0.12   0.25  A
+ Total instruction cycles:   0.27   0.00   0.25   0.25  A
  Shortest path cycles:       0.06   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Mali-G78]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.45   0.00   0.12   0.25  A
+ Total instruction cycles:   0.27   0.00   0.25   0.25  A
  Shortest path cycles:       0.06   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Mali-G77]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.45   0.00   0.12   0.25  A
+ Total instruction cycles:   0.27   0.00   0.25   0.25  A
  Shortest path cycles:       0.06   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Mali-G76]
  Main shader
  ===========
- Work registers: 21 (65% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 16 (25% used)
+ Uniform registers: 2 (3% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 81%
                                 A     LS      V      T  Bound
- Total instruction cycles:   1.08   0.00   0.12   0.50  A
+ Total instruction cycles:   0.83   0.00   0.25   0.50  A
  Shortest path cycles:       0.29   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
- Has uniform computation: true
+ Has uniform computation: false
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Mali-G72]
  Main shader
  ===========
- Work registers: 21 (65% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 16 (25% used)
+ Uniform registers: 2 (3% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 81%
                                 A     LS      V      T  Bound
- Total instruction cycles:   2.17   0.00   0.25   1.00  A
+ Total instruction cycles:   1.67   0.00   0.50   1.00  A
  Shortest path cycles:       0.58   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
- Has uniform computation: true
+ Has uniform computation: false
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Mali-G715]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 66%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.23   0.00   0.03   0.12  A
+ Total instruction cycles:   0.15   0.00   0.06   0.12  A
  Shortest path cycles:       0.03   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Mali-G710]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.26   0.00   0.06   0.12  A
+ Total instruction cycles:   0.20   0.00   0.12   0.12  A
  Shortest path cycles:       0.03   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Mali-G71]
  Main shader
  ===========
- Work registers: 19 (59% used at 100% occupancy)
+ Work registers: 20 (62% used at 100% occupancy)
- Uniform registers: 20 (31% used)
+ Uniform registers: 2 (3% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 81%
                                 A     LS      V      T  Bound
- Total instruction cycles:   2.00   0.00   0.25   1.00  A
+ Total instruction cycles:   1.58   0.00   0.50   1.00  A
  Shortest path cycles:       0.58   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
- Has uniform computation: true
+ Has uniform computation: false
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Mali-G68]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.45   0.00   0.12   0.25  A
+ Total instruction cycles:   0.27   0.00   0.25   0.25  A
  Shortest path cycles:       0.06   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Mali-G615]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 66%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.23   0.00   0.03   0.12  A
+ Total instruction cycles:   0.15   0.00   0.06   0.12  A
  Shortest path cycles:       0.03   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Mali-G610]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.26   0.00   0.06   0.12  A
+ Total instruction cycles:   0.20   0.00   0.12   0.12  A
  Shortest path cycles:       0.03   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Mali-G57]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.45   0.00   0.12   0.25  A
+ Total instruction cycles:   0.27   0.00   0.25   0.25  A
  Shortest path cycles:       0.06   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Mali-G52]
  Main shader
  ===========
- Work registers: 21 (65% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 16 (25% used)
+ Uniform registers: 2 (3% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 81%
                                 A     LS      V      T  Bound
- Total instruction cycles:   1.08   0.00   0.12   0.50  A
+ Total instruction cycles:   0.83   0.00   0.25   0.50  A
  Shortest path cycles:       0.29   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
- Has uniform computation: true
+ Has uniform computation: false
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Mali-G510]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.34   0.00   0.06   0.12  A
+ Total instruction cycles:   0.26   0.00   0.12   0.12  A
  Shortest path cycles:       0.04   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Mali-G51]
  Main shader
  ===========
- Work registers: 21 (65% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 16 (25% used)
+ Uniform registers: 2 (3% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 81%
                                 A     LS      V      T  Bound
- Total instruction cycles:   2.17   0.00   0.12   0.50  A
+ Total instruction cycles:   1.67   0.00   0.25   0.50  A
  Shortest path cycles:       0.58   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
- Has uniform computation: true
+ Has uniform computation: false
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Mali-G310]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 80%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.52   0.00   0.12   0.25  A
+ Total instruction cycles:   0.39   0.00   0.25   0.25  A
  Shortest path cycles:       0.06   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Mali-G31]
  Main shader
  ===========
- Work registers: 21 (65% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 16 (25% used)
+ Uniform registers: 2 (3% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 81%
                                 A     LS      V      T  Bound
- Total instruction cycles:   3.25   0.00   0.12   0.50  A
+ Total instruction cycles:   2.50   0.00   0.25   0.50  A
  Shortest path cycles:       0.88   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
- Has uniform computation: true
+ Has uniform computation: false
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false

  [Immortalis-G715]
  Main shader
  ===========
- Work registers: 20 (62% used at 100% occupancy)
+ Work registers: 19 (59% used at 100% occupancy)
- Uniform registers: 8 (12% used)
+ Uniform registers: 4 (6% used)
  Stack spilling: false
- 16-bit arithmetic: 100%
+ 16-bit arithmetic: 66%
                                 A     LS      V      T  Bound
- Total instruction cycles:   0.23   0.00   0.03   0.12  A
+ Total instruction cycles:   0.15   0.00   0.06   0.12  A
  Shortest path cycles:       0.03   0.00   0.00   0.00  A
  Longest path cycles:         N/A    N/A    N/A    N/A  N/A
  A = Arithmetic, LS = Load/Store, V = Varying, T = Texture
  Shader properties
  =================
  Has uniform computation: true
  Has side-effects: false
  Modifies coverage: false
  Uses late ZS test: false
  Uses late ZS update: false
  Reads color buffer: false