.. _utils_profiler:

.. index::
    single: Utils; profiler
    single: Utils; Profiling
    single: Utils; Memory leak tracking

================================
Profiler --- Runtime Profiling
================================

The daslang profiler is an **instrumenting** profiler implemented as a debug
agent in :doc:`../../stdlib/generated/profiler`. It has two modes:

* **Performance profiling** --- per-function wall-clock timing with optional
  per-function heap accounting. Emits a Chrome-compatible trace JSON for
  visualization in ``chrome://tracing`` or Perfetto.
* **Memory-leak tracking** --- records every live heap allocation with its
  captured daslang call stack, and on context destroy dumps the leaked
  allocations sorted by size (largest first).

Both modes install a ``DapiDebugAgent`` that hooks into the daslang
runtime; there is no separate profiler binary. The agent is started
automatically when you run a script with one of the ``--das-profiler*``
command-line flags, or explicitly via ``require daslib/profiler`` in the
script itself.

.. contents::
    :local:
    :depth: 2

Quick start --- performance profiler
=====================================

Run any daslang script with ``--das-profiler`` and, optionally, a log-file path::

    daslang --das-profiler --das-profiler-log-file /tmp/trace.json path/to/script.das

Open the resulting ``/tmp/trace.json`` in ``chrome://tracing`` or the
Perfetto UI to explore the call tree. A tree summary is also written to the
log (``LOG_INFO``) on context destroy, for example::

    main 1 71900ns
      builtin`push`4379756157886752001 ... 100 13000ns
      builtin`finalize`4179837999686245486 ... 1 700ns

Each row is ``<function> <call count> <total time>``, indented to
reflect the call tree.

Quick start --- memory-leak tracker
====================================

Add ``--das-profiler-leaks`` to the required ``--das-profiler`` prefix flag.
The leak tracker replaces the performance profiler; the two modes are
mutually exclusive::

    daslang --das-profiler --das-profiler-leaks path/to/script.das

On context destroy the tracker prints every live allocation with its captured
call stack, sorted by size (largest first)::

    === Memory leaks in context '' (3 allocations, 0x2b0 bytes) ===
    [leak] size=0x200 bytes
        at builtin`push`4379756157886752001 daslib/builtin.das:117
        at make_big_leak examples/leak_smoke.das:14
        at main examples/leak_smoke.das:37
    [leak] size=0x80 bytes
        at builtin`reserve`12130697888660093679 daslib/builtin.das:84
        at make_widget_leak examples/leak_smoke.das:24
        at main examples/leak_smoke.das:37
    [leak] size=0x30 bytes
        at make_widget_leak examples/leak_smoke.das:24
        at main examples/leak_smoke.das:37

Stack frames are in leaf-first order (``panic()`` convention). The
``file:line`` portion is standalone so the VSCode terminal turns it into a
clickable link that jumps to the source line.

If the program has multiple live contexts at shutdown (e.g. an audio thread
spawned by strudel), each one produces its own ``=== Memory leaks in
context '' ===`` block.

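
Because the report is line-oriented, it is easy to post-process outside
daslang. The following Python sketch is not part of the profiler. It assumes
only the ``[leak] size=0x... bytes`` and ``at ...`` line shapes shown in the
example above and ignores every other line; ``parse_leak_report`` is a
hypothetical helper name.

.. code-block:: python

    import re

    # Matches the "[leak] size=0x200 bytes" header of one leak record.
    LEAK_RE = re.compile(r"\[leak\] size=(0x[0-9a-fA-F]+) bytes")

    def parse_leak_report(text):
        """Return a list of (size_in_bytes, [frames, leaf first])."""
        leaks = []
        for raw in text.splitlines():
            line = raw.strip()
            match = LEAK_RE.match(line)
            if match:
                leaks.append((int(match.group(1), 16), []))
            elif line.startswith("at ") and leaks:
                leaks[-1][1].append(line[3:])   # drop the leading "at "
        return leaks

    report = """\
    === Memory leaks in context '' (3 allocations, 0x2b0 bytes) ===
    [leak] size=0x200 bytes
        at builtin`push`4379756157886752001 daslib/builtin.das:117
        at make_big_leak examples/leak_smoke.das:14
        at main examples/leak_smoke.das:37
    [leak] size=0x80 bytes
        at builtin`reserve`12130697888660093679 daslib/builtin.das:84
        at make_widget_leak examples/leak_smoke.das:24
        at main examples/leak_smoke.das:37
    [leak] size=0x30 bytes
        at make_widget_leak examples/leak_smoke.das:24
        at main examples/leak_smoke.das:37
    """

    leaks = parse_leak_report(report)
    total = sum(size for size, _ in leaks)
    print(len(leaks), hex(total))   # prints "3 0x2b0"

The summed sizes match the byte total in the report header, which makes a
cheap sanity check when the report is captured from a log.
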
Enabling the profiler
=====================

There are two supported ways to install the profiler agent.

Auto-install via ``--das-profiler``
-----------------------------------

Passing any ``--das-profiler*`` CLI flag to ``daslang`` implicitly
injects ``daslib/profiler.das`` into your program and runs the agent's
``[_macro] installing`` hook, which forks a debug-agent context and installs
the appropriate agent. You do not need to ``require`` it in the script.

Manual ``require``
------------------

Alternatively, add ``require daslib/profiler`` to your script. The
auto-installing macro runs at compile time for the ``profiler`` module and
installs the agent the same way. The CLI flags are still consulted at
runtime to choose the mode. Use ``require daslib/profiler_boost`` as well
if you want to call ``set_enable_profiler`` from script code
to gate collection around a region of interest (``disable_profiler()`` /
``enable_profiler()``).

Command-line flags
==================

.. list-table::
    :header-rows: 1
    :widths: 35 65

    * - Flag
      - Meaning
    * - ``--das-profiler``
      - Required prefix flag that auto-requires :doc:`../../stdlib/generated/profiler`.
    * - ``--das-profiler-log-file <path>``
      - Write the Chrome trace JSON (performance mode) or the leak report
        (leaks mode) to ``<path>``. If omitted, the performance tree summary
        and leak report go to ``to_log(LOG_INFO, ...)``.
    * - ``--das-profiler-manual``
      - Performance mode: start with collection **disabled**, so nothing is
        recorded until you call ``enable_profiler`` from script
        code (via ``daslib/profiler_boost``). Useful for profiling one hot
        region of a longer run.
    * - ``--das-profiler-memory``
      - Performance mode: also record per-function heap and string-heap
        accounting. The tree-summary report then shows ``heap=N string_heap=M``
        instead of timings, plus two "Top 10 offenders" tables. Implies
        ``--das-profiler-global`` unless overridden.
    * - ``--das-profiler-time-unit <unit>``
      - Performance mode: time unit for the tree summary. Default ``ns``. The
        Chrome trace JSON always uses microseconds (unchanged).
    * - ``--das-profiler-thread-local``
      - Performance mode: install one agent per thread (default when not
        tracking memory). Instrumentation events are dispatched only to the
        current thread's debug agent.
    * - ``--das-profiler-global``
      - Performance mode: install a single named agent observing all threads.
        Default when ``--das-profiler-memory`` is set.
    * - ``--das-profiler-leaks``
      - Install the memory-leak tracker instead of the performance profiler.
        Always a singleton named ``"memleaks"``; observes allocations from
        every live context.

Performance mode details
========================

The performance profiler wraps every daslang function body in an
instrumentation node (:cpp:class:`SimNodeDebug_InstrumentFunction` or its
thread-local variant). Each call fires ``onInstrumentFunction(entering, ...)``
on the agent, which records a timestamp (and optionally a heap snapshot)
into a per-context event buffer. On context destroy the events are folded
into a call tree and dumped.

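
The folding step can be modelled in a few lines. This Python sketch is an
illustration only, not the agent's code: the ``(entering, name, timestamp)``
event layout, the helper names, and the choice to merge repeated calls under
the same parent into one row are all assumptions.

.. code-block:: python

    def fold_events(events):
        """Fold (entering, name, timestamp_ns) events into a call tree.

        Each node is {"name", "count", "total_ns", "children"}; repeated calls
        to the same function from the same parent share one node.
        """
        root = {"name": "<root>", "count": 0, "total_ns": 0, "children": {}}
        stack = [(root, 0)]                     # (node, entry timestamp)
        for entering, name, ts in events:
            if entering:
                parent = stack[-1][0]
                node = parent["children"].setdefault(
                    name,
                    {"name": name, "count": 0, "total_ns": 0, "children": {}})
                stack.append((node, ts))
            else:
                node, started = stack.pop()
                node["count"] += 1
                node["total_ns"] += ts - started
        return root

    def format_tree(node, depth=0):
        """Yield one 'name count time' row per node, indented by depth."""
        for child in node["children"].values():
            yield f"{'  ' * depth}{child['name']} {child['count']} {child['total_ns']}ns"
            yield from format_tree(child, depth + 1)

    events = [
        (True, "main", 0),
        (True, "push", 100), (False, "push", 230),
        (True, "push", 300), (False, "push", 430),
        (False, "main", 1000),
    ]
    rows = list(format_tree(fold_events(events)))
    print("\n".join(rows))

Here the two ``push`` calls collapse into a single row with a count of 2 and
their times summed, under the ``main`` row.
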
Chrome trace JSON
-----------------

When ``--das-profiler-log-file`` is given, each event becomes a
Chrome-tracing *begin* (``"ph":"B"``) or *end* (``"ph":"E"``) entry in the
JSON array. Thread IDs are synthesized from context pointer addresses so
each daslang context shows as a separate track. Open the file in
``chrome://tracing`` (Chromium-based browsers) or Perfetto.

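
For orientation, the sketch below builds begin/end entries in the Chrome
Trace Event Format using Python. The ``name``, ``ph``, ``ts``, ``pid`` and
``tid`` keys come from that format; exactly which keys the daslang profiler
writes is an assumption here, as are the input tuple layout and the helper
name ``to_chrome_trace``.

.. code-block:: python

    import json

    def to_chrome_trace(events, pid=1):
        """Convert (entering, name, timestamp_us, context_id) tuples to entries."""
        return [
            {
                "name": name,
                "ph": "B" if entering else "E",   # begin / end of a call
                "ts": ts_us,                      # the format uses microseconds
                "pid": pid,
                "tid": context_id,                # one track per context
            }
            for entering, name, ts_us, context_id in events
        ]

    events = [
        (True, "main", 0.0, 1),
        (True, "hot_region", 12.5, 1),
        (False, "hot_region", 48.0, 1),
        (False, "main", 71.9, 1),
    ]
    trace = to_chrome_trace(events)
    print(json.dumps(trace, indent=2))

Each *begin* entry must be closed by a matching *end* entry on the same
``tid`` for the viewer to draw a complete slice.
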
Gating collection around a region
---------------------------------

For long-running programs you usually want to profile only a specific
phase. Combine ``--das-profiler-manual`` with ``daslib/profiler_boost``:

.. code-block:: das

    require daslib/profiler_boost

    // warm_up, hot_region and cool_down stand in for your own functions
    [export]
    def main() {
        warm_up()
        enable_profiler(this_context())     // start recording
        hot_region()
        disable_profiler(this_context())    // stop recording
        cool_down()
    }

With ``--das-profiler-manual`` the profiler starts in the disabled state,
so only ``hot_region`` is recorded. Without ``--das-profiler-manual`` the
profiler starts enabled, so calling ``disable_profiler`` and then
``enable_profiler`` switches collection off and back on (handy for
*excluding* a region).

Per-function heap accounting
----------------------------

With ``--das-profiler-memory`` the tree summary shows each function's
inclusive and *own* (inclusive minus children) heap and string-heap
allocation totals. Two top-10 tables follow, ranking functions by own heap
and own string-heap byte totals. This mode implies ``--das-profiler-global``
because the accounting is aggregated across contexts.

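
The relationship between inclusive and own totals can be sketched as
follows. This is a Python illustration under assumed data shapes (a nested
dict with hypothetical ``heap`` and ``children`` keys), not the profiler's
own data structure. It ranks call-tree nodes rather than merging the same
function across call sites.

.. code-block:: python

    def own_bytes(node):
        """Own bytes = inclusive bytes minus inclusive bytes of direct children."""
        return node["heap"] - sum(child["heap"] for child in node["children"])

    def flatten(node):
        """Yield every node in the tree, parents before children."""
        yield node
        for child in node["children"]:
            yield from flatten(child)

    def top_offenders(root, limit=10):
        """Rank nodes by own heap bytes, largest first."""
        ranked = sorted(flatten(root), key=own_bytes, reverse=True)
        return [(n["name"], own_bytes(n)) for n in ranked[:limit]]

    tree = {
        "name": "main", "heap": 1000, "children": [
            {"name": "load", "heap": 750, "children": [
                {"name": "parse", "heap": 600, "children": []},
            ]},
            {"name": "render", "heap": 200, "children": []},
        ],
    }
    print(top_offenders(tree))

In this example ``main`` has the largest inclusive total but the smallest own
total, because nearly all of its bytes are allocated by its callees.
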
Memory-leak mode details
========================

The leak tracker subscribes to four runtime callbacks:

* ``onAllocate(ctx, ptr, size, at)`` --- record a new allocation, keyed by
  ``intptr(ptr)``; store the current per-context shadow call stack.
* ``onReallocate(ctx, old, oldSize, new, newSize, at)`` --- erase ``old``,
  insert ``new`` with ``newSize`` (the realloc site becomes the new home
  of the block).
* ``onFree(ctx, ptr, at)`` --- erase the record.
* ``onInstrumentFunction(ctx, fn, entering, _)`` --- maintain a shadow
  ``array`` stack per instrumented context. No event buffer,
  no timing --- push on entry, pop on exit. Used at allocation time to
  snapshot the stack with a single array clone.

On context destroy the tracker emits the report shown in the quick start.
The report is routed to the log file if ``--das-profiler-log-file`` is set,
otherwise to ``to_log(LOG_INFO, ...)``.

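
The bookkeeping behind these callbacks can be modelled compactly. The Python
class below is a toy model written for this page, not the daslang agent:
method names, argument lists, and the use of plain dictionaries keyed by a
context identifier are assumptions made for illustration.

.. code-block:: python

    class LeakTracker:
        """Toy model of the leak-tracking bookkeeping; not the daslang agent."""

        def __init__(self):
            self.stacks = {}   # context id -> shadow call stack (root first)
            self.live = {}     # context id -> {pointer: (size, stack snapshot)}

        def on_instrument_function(self, ctx, fn, entering):
            stack = self.stacks.setdefault(ctx, [])
            if entering:
                stack.append(fn)    # push on entry
            else:
                stack.pop()         # pop on exit

        def on_allocate(self, ctx, ptr, size):
            snapshot = list(self.stacks.get(ctx, []))   # clone the shadow stack
            self.live.setdefault(ctx, {})[ptr] = (size, snapshot)

        def on_reallocate(self, ctx, old, new, new_size):
            self.live.get(ctx, {}).pop(old, None)       # erase the old record
            self.on_allocate(ctx, new, new_size)        # realloc site is the new home

        def on_free(self, ctx, ptr):
            self.live.get(ctx, {}).pop(ptr, None)

        def report(self, ctx):
            """Return leaked (size, leaf-first stack) pairs, largest first."""
            records = self.live.get(ctx, {}).values()
            ordered = sorted(records, key=lambda rec: rec[0], reverse=True)
            return [(size, stack[::-1]) for size, stack in ordered]

    tracker = LeakTracker()
    tracker.on_instrument_function("ctx", "main", True)
    tracker.on_instrument_function("ctx", "make_widget_leak", True)
    tracker.on_allocate("ctx", 0x1000, 0x30)
    tracker.on_allocate("ctx", 0x2000, 0x40)
    tracker.on_reallocate("ctx", 0x2000, 0x3000, 0x80)
    tracker.on_instrument_function("ctx", "make_widget_leak", False)
    tracker.on_allocate("ctx", 0x4000, 0x10)
    tracker.on_free("ctx", 0x4000)
    print(tracker.report("ctx"))

Only the two blocks that were never freed appear in the report, and the
reallocated block is listed with its new size.
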
Multi-context programs
----------------------

The leak agent is installed as a named singleton, so one agent covers
every live context (main thread, spawned threads, any sub-contexts such as
the strudel audio mixer or job-queue workers). Each context's allocations go
into their own bookkeeping table keyed by the context's address, and each
``onDestroyContext`` emits that context's block of the final report.

What is **not** tracked
-----------------------

* Allocations made before the agent finishes installing. The
  ``[_macro] installing`` hook runs during compile time, which is early
  enough for main-thread user code and any threads the script spawns
  afterwards, but not for the compiler's own macro/folding contexts ---
  those are out of scope by design.
* String-heap allocations (``onAllocateString`` / ``onFreeString``). The
  leak agent only hooks the raw heap. String-heap leaks still show up in
  the C++-side heap tracker if enabled.
* Allocations in contexts where ``instrumentAllocations`` has been
  manually disabled via ``instrument_context_allocations``
  after the agent enabled it.

Writing your own profiler agent
================================

The leak and performance agents both inherit from
``ProfilerBaseAgent``, which in turn inherits from
``DapiDebugAgent`` in :doc:`../../stdlib/generated/debugapi`.
You can write your own agent by subclassing either.

Subclassing the profiler base
-----------------------------

Use this when you want to piggy-back on the CLI-option parsing and the
per-code-allocator instrumentation dedup logic:

.. code-block:: das

    require daslib/profiler

    class MyAgent : ProfilerBaseAgent {
        def override onInstall(agent : DebugAgent?) : void {
            use_thread_local = false // or true, depending on your needs
        }
        def override onCreateContext(var ctx : Context) : void {
            if (!isProfileable(ctx)) {
                return
            }
            ensure_instrumented(ctx) // installs onInstrumentFunction hooks
            instrument_context_allocations(ctx, true) // enables alloc hooks
            // ... your per-context state setup
        }
        // ... your onAllocate / onInstrumentFunction / onDestroyContext
    }

Subclassing ``DapiDebugAgent`` directly
---------------------------------------

Use this when you don't need any of the profiler's scaffolding and just
want raw access to the debug-agent hooks. See
:doc:`../../stdlib/generated/debugapi` for the complete list of
overridable methods. The ``examples/debugapi/`` directory in the source
tree contains worked examples, including ``allocation_tracking.das``
which shows the minimal allocation-hook setup.

The key API calls your agent will use:

* ``install_new_debug_agent(agent, "category")`` --- install as a named
  singleton. Function-instrumentation hooks reach named agents only when
  the instrumentation uses the non-thread-local variant
  (``instrument_all_functions(ctx)``); allocation hooks reach them because
  ``Context::onAllocate`` dispatches via ``for_each_debug_agent``.
* ``install_new_thread_local_debug_agent(agent)`` --- install in the
  current thread's thread-local slot (one agent per thread). Receives
  events from every thread-local instrumentation variant on the same
  thread.
* ``instrument_all_functions(ctx)`` vs
  ``instrument_all_functions_thread_local(ctx)`` --- pick the variant
  matching how your agent is installed, otherwise the
  ``onInstrumentFunction`` callbacks never reach you. The profiler's
  ``ProfilerBaseAgent::ensure_instrumented`` picks correctly from the
  ``use_thread_local`` field.
* ``instrument_context_allocations(ctx, true)`` --- opt in to
  ``onAllocate`` / ``onReallocate`` / ``onFree`` callbacks for that
  context. Without this, no allocation hooks fire regardless of how the
  agent is installed.

Performance impact
==================

Every instrumented function call pays the cost of two dispatches through
the debug-agent adapter (entry + exit). Every heap allocation in an
instrumented context pays one more dispatch. For the performance mode this
is usually 5-15% overhead depending on call density; for leak mode it is
higher because the shadow stack is cloned into an
``AllocationRecord`` on each ``onAllocate``. Neither mode is suitable for
shipping builds --- they are debug/diagnostic tools.

AOT-compiled functions that bypass instrumentation entirely (e.g. native
C++ shims without daslang stubs) are invisible to the profiler. Pure
daslang functions always show up when their context is instrumented.

See also
========

* :doc:`../../stdlib/generated/profiler` --- generated API reference for
  the ``profiler`` module (classes, structs, helpers).
* :doc:`../../stdlib/generated/profiler_boost` --- cross-context enable
  / disable helpers used from user code.
* :doc:`../../stdlib/generated/debugapi` --- C++ ``DapiDebugAgent`` binding
  that the profiler subclasses.
* ``examples/debugapi/allocation_tracking.das`` --- minimal example of a
  custom allocation-tracking debug agent.