Cool project: lightweight and easy to drop into a codebase. I have a few ideas.
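1. Zone name storage – right now you store a raw char *. If it points to a stack/local string, the pointer dangles once that string goes out of scope, and reading it later can blow up. Better to use const char * (if names are always literals) or copy the name with strdup.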
2. Thread-safety – the global arrays (prof_zones, prof_zones_stack) aren’t protected, so in multithreaded workloads threads race on them and corrupt the data. You could make them thread_local so each thread keeps its own zones.
3. Cycles vs time – __rdtsc() isn’t always stable (TurboBoost, NUMA, CPU scaling). clock_gettime() is more portable. Might be nice to let the user pick via a macro.
4. Output – the printf("[%16s]...") is old-school. Sorting zones by total_secs or total_cycles would make hot spots pop out immediately.
5. Limits – PROF_MAX_NUM_ZONES = 256 is small for bigger projects. Could malloc and grow dynamically.
6. Overhead – the profiler doesn’t subtract its own overhead (push/pop, clock reads), so absolute numbers are inflated. Relative numbers are still valid, though.
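To make points 1, 4 and 5 concrete, here is a rough sketch of a growable zone table that copies names and prints zones hottest-first. I haven't matched it to your actual structs: zone_t, zone_table_t, zone_get, zone_report and zone_table_free are all names I made up for illustration, and the single total_cycles field stands in for whatever you accumulate.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical zone record; the project's real struct may differ. */
typedef struct {
    char    *name;         /* owned copy, so callers may pass temporary strings */
    uint64_t total_cycles; /* accumulated cost, in whatever unit the timer uses */
} zone_t;

/* Hypothetical growable table standing in for a fixed-size zone array. */
typedef struct {
    zone_t *items;
    size_t  count;
    size_t  capacity;
} zone_table_t;

/* Find a zone by name or append a new one. Returns NULL if allocation fails.
   The returned pointer is only valid until the next call that grows the table. */
static zone_t *zone_get(zone_table_t *t, const char *name) {
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->items[i].name, name) == 0)
            return &t->items[i];

    if (t->count == t->capacity) {
        /* Double the capacity instead of failing at a hard limit. */
        size_t new_cap = t->capacity ? t->capacity * 2 : 16;
        zone_t *grown = realloc(t->items, new_cap * sizeof(zone_t));
        if (!grown) return NULL;
        t->items = grown;
        t->capacity = new_cap;
    }

    /* Copy the name (same effect as POSIX strdup, but plain standard C). */
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);
    if (!copy) return NULL;
    memcpy(copy, name, len);

    zone_t *z = &t->items[t->count++];
    z->name = copy;
    z->total_cycles = 0;
    return z;
}

/* qsort comparator: largest total first. */
static int zone_cmp_desc(const void *a, const void *b) {
    const zone_t *za = a, *zb = b;
    return (za->total_cycles < zb->total_cycles) -
           (za->total_cycles > zb->total_cycles);
}

/* Sort the table in place and print it, hottest zone first. */
static void zone_report(zone_table_t *t) {
    if (t->count == 0) return;
    qsort(t->items, t->count, sizeof(zone_t), zone_cmp_desc);
    for (size_t i = 0; i < t->count; i++)
        printf("[%16s] %12llu cycles\n", t->items[i].name,
               (unsigned long long)t->items[i].total_cycles);
}

/* Release the copied names and the array, leaving an empty table. */
static void zone_table_free(zone_table_t *t) {
    for (size_t i = 0; i < t->count; i++)
        free(t->items[i].name);
    free(t->items);
    t->items = NULL;
    t->count = t->capacity = 0;
}
```

The linear name lookup is only there to keep the sketch short. Declaring the table thread_local would also cover point 2, at the cost of having to merge per-thread tables before reporting.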
Bonus idea: add a macro like
#define PROF_SCOPE(name) for (int _i = (prof_begin(name), 0); !_i; prof_end(), _i++)
so you can just write:
void foo() { PROF_SCOPE("foo") { /* code... */ } }
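Here is a small standalone sketch of how that macro behaves. The prof_begin and prof_end below are stubs I wrote just to exercise it (they only count nesting depth), not your real functions, and prof_depth and prof_ends are made-up counters.

```c
/* Hypothetical stand-ins for the profiler's begin/end calls. They only
   track nesting depth so the macro can be tried on its own. */
static int prof_depth = 0;
static int prof_ends  = 0;
static void prof_begin(const char *name) { (void)name; prof_depth++; }
static void prof_end(void) { prof_depth--; prof_ends++; }

/* Runs prof_begin once in the loop initializer, executes the body exactly
   once, then runs prof_end in the loop's increment step. */
#define PROF_SCOPE(name) \
    for (int _i = (prof_begin(name), 0); !_i; prof_end(), _i++)

static int foo(void) {
    int depth_inside = 0;
    /* The timed code must be the loop body, so it goes in braces directly
       after the macro. Writing PROF_SCOPE("foo"); would time nothing. */
    PROF_SCOPE("foo") {
        depth_inside = prof_depth; /* 1 while inside the scope */
    }
    return depth_inside;
}
```

One caveat with the for-loop trick: a break, return or goto inside the body leaves the loop without running the increment step, so prof_end() is skipped on that path.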
Thanks Gemini...