NVIDIA’s Tools Extension library (NVTX) is an easy way to add profiling information to your code if you use their NSight profiler. Beyond the simple example shown here, timers can be colored or set to a specific version, and there are language-specific functions for CUDA and OpenCL. See the reference below for more information.
The example here is extremely simple, but it shows how NVTX can be used to create a lightweight scoped timer. A scoped timer is useful because it lets you time a section of code without stopping the timer by hand: when the timer object goes out of scope, its destructor runs and stops the timer automatically.
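As a rough illustration of the scoped-timer idea, here is a minimal sketch built on the NVTX calls `nvtxRangePushA` and `nvtxRangePop`. The class name `ScopedNvtxRange`, the `USE_NVTX` macro, the stub fallback with its `g_stub_depth` counter, and the `sum_to` workload are all assumptions made for this sketch rather than part of NVTX. The stubs exist only so the file still builds on a machine without the NVTX headers.

```cpp
#ifdef USE_NVTX
#include <nvToolsExt.h>  // real NVTX header; newer toolkits ship it as <nvtx3/nvToolsExt.h>
#else
// Fallback stubs (an assumption of this sketch) so the code builds without NVTX.
// g_stub_depth is a hypothetical counter that mimics the nesting depth of open ranges.
static int g_stub_depth = 0;
inline int nvtxRangePushA(const char* /*message*/) { return g_stub_depth++; }
inline int nvtxRangePop() { return --g_stub_depth; }
#endif

// RAII wrapper: opens a named range in the constructor and closes it in the destructor.
class ScopedNvtxRange {
public:
    explicit ScopedNvtxRange(const char* name) { nvtxRangePushA(name); }
    ~ScopedNvtxRange() { nvtxRangePop(); }

    // Copying would pop the same range twice, so it is disabled.
    ScopedNvtxRange(const ScopedNvtxRange&) = delete;
    ScopedNvtxRange& operator=(const ScopedNvtxRange&) = delete;
};

// Hypothetical workload: the range covers the whole function body.
long sum_to(long n) {
    ScopedNvtxRange range("sum_to");
    long total = 0;
    for (long i = 1; i <= n; ++i) {
        total += i;
    }
    return total;  // `range` is destroyed here, which pops the NVTX range
}
```

Because the pop happens in the destructor, the range is closed on every exit path, including early returns and exceptions. To time only part of a function, wrap that part in its own `{ ... }` block and declare the `ScopedNvtxRange` at the top of the block.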
Once enabled in NSight, the timing data shows up automatically.