Allow to do gemv and ger buffer allocation on the stack #482

Merged: xianyi merged 1 commit into OpenMathLib:develop from jeromerobert:develop on Jan 1, 2015

Conversation

@jeromerobert
Contributor

ger and gemv call blas_memory_alloc/free, which in turn
call blas_lock. blas_lock creates thread contention when matrices
are small and the number of threads is high enough. We avoid
calling blas_memory_alloc by replacing it with stack allocation.
This can be enabled with:
make -DMAX_STACK_ALLOC=2048
The given size (in bytes) must be high enough to avoid thread contention
and small enough to avoid stack overflow.
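
For readers unfamiliar with the pattern, here is a minimal sketch of a stack buffer with a heap fallback. It is not the code from this commit: `fits_on_stack` and `scaled_sum` are hypothetical names, `malloc`/`free` stand in for blas_memory_alloc/free, and the kernel is a toy placeholder for gemv/ger. Only the MAX_STACK_ALLOC limit comes from the description above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumption: a compile-time limit in bytes, mirroring MAX_STACK_ALLOC
   from the description; 2048 is the example value given there. */
#ifndef MAX_STACK_ALLOC
#define MAX_STACK_ALLOC 2048
#endif

/* Hypothetical helper: returns 1 if a scratch buffer of n doubles
   fits within the stack limit, 0 otherwise. */
int fits_on_stack(size_t n) {
    return n <= MAX_STACK_ALLOC / sizeof(double);
}

/* Hypothetical toy kernel standing in for gemv/ger: writes alpha * x
   into a scratch buffer and returns the sum of the scaled values. */
double scaled_sum(const double *x, size_t n, double alpha) {
    /* Fixed-size scratch space on the stack; no lock is needed for it. */
    double stack_buffer[MAX_STACK_ALLOC / sizeof(double)];
    double *buffer = stack_buffer;
    int on_heap = !fits_on_stack(n);

    if (on_heap) {
        /* Too large for the stack: fall back to the heap.
           malloc stands in for the locking allocator here. */
        buffer = malloc(n * sizeof(double));
        if (buffer == NULL) {
            return 0.0; /* sketch only: no real error reporting */
        }
    }

    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        buffer[i] = alpha * x[i];
        sum += buffer[i];
    }

    if (on_heap) {
        free(buffer);
    }
    return sum;
}
```

Small inputs never reach the allocator, which is where the contention on blas_lock would otherwise occur. Larger inputs still take the heap path, so the limit only needs to cover the small-matrix case.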

Fix #478

xianyi added a commit that referenced this pull request Jan 1, 2015
Allow to do gemv and ger buffer allocation on the stack
xianyi merged commit 41aad04 into OpenMathLib:develop Jan 1, 2015
xianyi added a commit that referenced this pull request Apr 13, 2015
For gemv_t, directly use malloc to create the buffer.
xianyi added a commit that referenced this pull request Apr 13, 2015
jeromerobert added a commit to jeromerobert/OpenBLAS that referenced this pull request Apr 15, 2015
jeromerobert added a commit to jeromerobert/OpenBLAS that referenced this pull request Apr 15, 2015

Development

Successfully merging this pull request may close these issues.

Very slow when having many small matrices and many threads