Turn off PROFILE option and remove WITH_PROFILE after FreeBSD 13?
Steve Kargl
sgk at troutmask.apl.washington.edu
Fri Jan 17 19:29:29 UTC 2020
On Fri, Jan 17, 2020 at 01:12:32PM -0500, Ed Maste wrote:
> On Fri, 17 Jan 2020 at 12:19, Steve Kargl
> <sgk at troutmask.apl.washington.edu> wrote:
> >
> > Why? Because adding -pg to the gfortran command line is sufficient
> > to get profiling information for long-running, numerically
> > intensive codes. 'gfortran -pg', of course, looks for libc_p.a
> > and libm_p.a.
>
> Have you tried sampling-based profiling (i.e., hwpmc)? I'm curious if
> it provides equal utility for you, or if there's some shortcoming.
Never needed to try hwpmc.
% gfortran9 -o z -pg fortran_file.f90
just works if libc_p.a and libm_p.a are present. There is a link-time
failure if the libraries are missing. Here's an example of using -pg
that found a bottleneck in my code (which I haven't profiled recently).
Each sample counts as 0.000123062 seconds.
    % cumulative     self                  self    total
 time    seconds  seconds        calls   s/call   s/call  name
46.80     275.68   275.68   1178817696     0.00     0.00  __lum_MOD_cludet_dble
11.55     343.73    68.05     19458348     0.00     0.00  __sjnm_MOD_csjn_dble
 7.09     385.47    41.73     19458348     0.00     0.00  __sphere_MOD_sphere_shell_formfcn
 5.97     420.63    35.16     97291740     0.00     0.00  __sjnm_MOD_sjn_dble
 3.84     443.24    22.61  23712564606     0.00     0.00  cabs (w_cabs.c:17 @ 4968f0)
The cludet_dble() routine is a bottleneck, and it makes heavy use of cabs().
It so happens that cludet_dble doesn't need cabs, and can instead look at
the squared magnitude. Replacing cabs(z) with creal(z)**2 + cimag(z)**2
gives
Each sample counts as 0.000123062 seconds.
    % cumulative     self                  self    total
 time    seconds  seconds        calls   s/call   s/call  name
53.93     232.70   232.70   1178817696     0.00     0.00  __lum_MOD_cludet_dble
15.84     301.02    68.32     19458348     0.00     0.00  __sjnm_MOD_csjn_dble
10.63     346.91    45.88     19458348     0.00     0.00  __sphere_MOD_sphere_shell_formfcn
 7.84     380.71    33.81     97291740     0.00     0.00  __sjnm_MOD_sjn_dble
Nominally, a decrease of 43 CPU seconds. Those 43 seconds accumulate quickly
when the code is executed a few thousand times for Monte Carlo simulations.
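
For anyone who wants the cabs() replacement spelled out, here is a minimal
sketch in C. It is not the actual cludet_dble code: the helper names abs2()
and pivot_index_abs2() are hypothetical, and I am assuming the magnitudes are
only compared against each other (as in a pivot search), which is the case
where the square root inside cabs() buys nothing.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper: |z|^2, computed without the square root
 * (and the overflow-safe scaling) that cabs() performs. */
static double abs2(double complex z)
{
    double re = creal(z);
    double im = cimag(z);
    return re * re + im * im;
}

/* Hypothetical pivot search: index of the entry with the largest
 * magnitude. Squaring is monotonic for non-negative values, so
 * comparing |z|^2 picks the same entry as comparing cabs(z),
 * barring overflow, underflow, or rounding-level ties. */
static size_t pivot_index_abs2(const double complex *col, size_t n)
{
    assert(n > 0);
    size_t best = 0;
    double best2 = abs2(col[0]);
    for (size_t i = 1; i < n; i++) {
        double m2 = abs2(col[i]);
        if (m2 > best2) {
            best2 = m2;
            best = i;
        }
    }
    return best;
}
```

The trade-off is range: cabs() guards against intermediate overflow and
underflow, while the plain sum of squares does not, so this only makes sense
when the values are known to be well scaled.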
Is there a trivially stupid way of using hwpmc that requires no changes
to fortran_file.f90?
PS: For those snickering about the word Fortran: go read the Fortran 2018
standard and educate yourselves. You want document 007 from
https://j3-fortran.org/doc/standing.
--
Steve