What is _kmp_fork_barrier and how to see if there is load imbalance?

I'm using Intel VTune Amplifier to see how my parallel application scales.

It scales pretty well on my 4-core laptop (considering that there are portions of the algorithm that can't be parallelized):

However, when I test it on the Knights Landing (KNL), it scales horribly:

Note that I'm using only 64 cores on purpose.

Why is there so much idle time? And what is _kmp_fork_barrier? From reading about "Imbalance or Serial Spinning (OpenMP)", it seems to be related to load imbalance, but I'm already using schedule(dynamic,1) in all OpenMP regions.

How can I see if this is actually load imbalance? Otherwise, what could be a possible cause?
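One way to check this by hand, independently of VTune, is to time the loop body per thread and compare the busiest thread with the average. The following is only a minimal sketch: `ImbalanceReport` and `measure_imbalance` are hypothetical names, the loop bounds and body are placeholders for the real ones, and the per-iteration timing adds overhead of its own, so the numbers are indicative at best.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper; not part of VTune or of the original code.
struct ImbalanceReport {
    double wall;       // wall-clock seconds for the whole parallel region
    double max_busy;   // seconds spent inside loop bodies by the busiest thread
    double mean_busy;  // average of that time over the threads in the team
    double ratio;      // max_busy / mean_busy; close to 1.0 means balanced work
};

// Runs body(i, j) over a rows x cols iteration space with the same
// collapse(2) schedule(dynamic,1) clauses as in the question and records
// how long each thread spends inside the body.
template <typename Body>
ImbalanceReport measure_imbalance(int rows, int cols, Body body) {
    assert(rows >= 0 && cols >= 0);
    using clock = std::chrono::steady_clock;

    int max_threads = 1;
#ifdef _OPENMP
    max_threads = omp_get_max_threads();
#endif
    std::vector<double> busy(max_threads, 0.0);
    int team_size = 1;

    auto region_start = clock::now();
    #pragma omp parallel
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
        #pragma omp single
        team_size = omp_get_num_threads();
#endif
        double local = 0.0;  // private accumulator for this thread
        #pragma omp for collapse(2) schedule(dynamic,1) nowait
        for (int i = 0; i < rows; ++i) {
            for (int j = 0; j < cols; ++j) {
                auto t0 = clock::now();
                body(i, j);
                local += std::chrono::duration<double>(clock::now() - t0).count();
            }
        }
        busy[tid] = local;  // each thread writes only its own slot
    }
    double wall = std::chrono::duration<double>(clock::now() - region_start).count();

    double max_busy = 0.0, sum = 0.0;
    for (int t = 0; t < team_size; ++t) {
        max_busy = std::max(max_busy, busy[t]);
        sum += busy[t];
    }
    double mean_busy = sum / team_size;
    double ratio = mean_busy > 0.0 ? max_busy / mean_busy : 1.0;
    return {wall, max_busy, mean_busy, ratio};
}
```

A ratio well above 1.0 would point to uneven work per thread. If the ratio stays near 1.0 but `wall` is much larger than `max_busy`, the time is going somewhere other than the loop bodies (for example scheduling, fork/join, or the reduction), which would not be load imbalance in the usual sense.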

Note that I have three OpenMP parallel regions:

#pragma omp parallel for collapse(2) schedule(dynamic,1)

#pragma omp declare reduction(mergeFindAffineShapeArgs : std::vector<FindAffineShapeArgs> : omp_out.insert(omp_out.end(), omp_in.begin(), omp_in.end()))
#pragma omp parallel for collapse(2) schedule(dynamic,1) reduction(mergeFindAffineShapeArgs : findAffineShapeArgs)

#pragma omp declare reduction(mergeFindAffineShapeArgs : std::vector<FindAffineShapeArgs> : omp_out.insert(omp_out.end(), omp_in.begin(), omp_in.end()))
#pragma omp parallel for collapse(2) schedule(dynamic,1) reduction(mergeFindAffineShapeArgs : findAffineShapeArgs)

Could this be caused by the reduction? My understanding was that it is fairly efficient (using a divide-and-conquer merge approach).
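To test the reduction in isolation, a stripped-down reproduction could look like the sketch below. It uses `std::vector<int>` as a stand-in for `std::vector<FindAffineShapeArgs>`, and `merge_vec` and `collect_parallel` are hypothetical names. How the runtime combines the per-thread partial vectors is up to the OpenMP implementation, so this makes no assumption about a tree-style merge.

```cpp
#include <cassert>
#include <vector>

// Same shape as the user-declared reduction in the question, with int
// elements as a placeholder for FindAffineShapeArgs.
#pragma omp declare reduction(merge_vec : std::vector<int> : \
    omp_out.insert(omp_out.end(), omp_in.begin(), omp_in.end()))

// Collects one value for every (i, j) with i + j even. Each thread appends
// to its own private vector; the partial vectors are concatenated at the end
// of the region, so the element order is not deterministic.
std::vector<int> collect_parallel(int rows, int cols) {
    assert(rows >= 0 && cols >= 0);
    std::vector<int> out;
    #pragma omp parallel for collapse(2) schedule(dynamic,1) reduction(merge_vec : out)
    for (int i = 0; i < rows; ++i) {
        for (int j = 0; j < cols; ++j) {
            if ((i + j) % 2 == 0) {
                out.push_back(i * cols + j);
            }
        }
    }
    return out;
}
```

Timing this function for a realistic number of elements, with the same thread count as on the KNL, would give a rough idea of whether merging the vectors alone accounts for a noticeable share of the idle time.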

This is the bottom-up section:

See here how the most expensive functions are well parallelized (most of them):
