path: root/runtime/trace_profile.cc
Age | Commit message | Author
2025-03-18  Fix iterating over long running method buffers when flushing (Mythri Alle)
When iterating over the long running buffers we used unsigned indices, but the loop was written as if the index were signed: the exit check expected the index to reach -1, which never happens for an unsigned type. Fixed the check to not rely on -1 and switched to a signed index. Drive-by fix: use the correct type for the unique_ptr. Bug: 352518093 Test: Manual test Change-Id: I8461919d7c20e3d76fa9e10783f8dd1f7ef950e0
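The fix above addresses a common C++ pitfall. A minimal sketch of the pattern, assuming a hypothetical FlushEntries helper rather than the real trace_profile.cc code:

    #include <cstddef>
    #include <cstdint>
    #include <cstdio>

    // Sketch of the bug class, not the actual ART code: with an unsigned
    // index, a decrementing loop whose exit condition expects the index to
    // go negative (for example `i >= 0`, which is always true for unsigned
    // types) never terminates. A signed index keeps the exit condition
    // well defined.
    void FlushEntries(const uint64_t* buffer, size_t num_entries) {
      for (int64_t i = static_cast<int64_t>(num_entries) - 1; i >= 0; --i) {
        printf("entry %lld: %llu\n",
               static_cast<long long>(i),
               static_cast<unsigned long long>(buffer[i]));
      }
    }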
2025-03-18  Revert^4 "Use nanoseconds for v2 method tracing" (Mythri Alle)
This reverts commit 02a80d00ffd5eaa1985e18becd542579576720fc. Reason for revert: Fix dump_trace.cc to parse timestamps as 64-bit unsigned LEBs rather than 32-bit. These were always 64-bit fields, but the bug had no visible impact because the timestamp diff used to fit in 32 bits. With nanoseconds it no longer fits, exposing the bug. Change-Id: If979c0bf891ae5bd0fdf278915fcda8ed09bfa9f
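For reference, a minimal sketch of decoding an unsigned LEB128 value into a 64-bit integer; the helper name is hypothetical and this is not the dump_trace.cc implementation:

    #include <cstdint>

    // Decode an unsigned LEB128 value into a full 64-bit integer. Reading
    // the same bytes into a 32-bit result would silently drop the high
    // bits once the encoded value stops fitting in 32 bits.
    uint64_t DecodeUnsignedLeb128(const uint8_t** data) {
      uint64_t result = 0;
      int shift = 0;
      uint8_t byte;
      do {
        byte = *(*data)++;
        result |= static_cast<uint64_t>(byte & 0x7f) << shift;
        shift += 7;
      } while ((byte & 0x80) != 0);
      return result;
    }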
2025-03-07  Revert^3 "Use nanoseconds for v2 method tracing" (Mythri Alle)
This reverts commit 84c60905008131ff35b93b9f8b64923e267f92a6. Reason for revert: Reverting because of failures on qemu.arm64. https://ci.chromium.org/ui/p/art/builders/ci/qemu.arm.64/202/overview Change-Id: I96eeb4b675eb36850f64e4f3102e55f456747762
2025-03-06  Dump the event time in nanoseconds instead of timestamps (Mythri Alle)
Also fixes a bug where we were recording the time instead of the timestamp for methods currently on the stack when method tracing started. Bug: 352518093 Test: Manual testing Change-Id: I806559d88d3c7a839addef34d6ec86c3c923c325
2025-03-04  Don't include thread information in long running method traces (Mythri Alle)
We want to keep the long running method traces as small as possible, so don't include the mapping of thread id to thread name in the traces. If required, this information can be obtained from the stack traces that are included along with the ANR report. Bug: 352518093 Test: manual testing Change-Id: Id09e9e1a3df9f34d87731996c4149f79d5cd319a
2025-02-27  Change the thread / method info format for low overhead traces (Mythri Alle)
Since we use a binary format for the trace entries, we need a header to differentiate the trace events from the thread / method info. This CL changes it to use the same format as the regular method traces. Bug: 352518093 Test: Manual testing Change-Id: I27407f52cf3c1a06cb6bf6a8accd825b508005aa
2025-02-26  A few fixes when dumping long running traces (Mythri Alle)
This CL has a few fixes:
1. When recording on-stack methods at the start of the trace, the current buffer index wasn't set correctly. It was set to the buffer size, which effectively means there are no entries. Fixed by setting it to the correct index.
2. We have some placeholder entries which should be ignored when dumping the trace.
3. We should encode the event type as the first value. Based on the event type, we read the next value only for method entry events.
4. Encode the number of bytes written for each block of events, so parsing is easier.
5. Don't remove the event type when recording the timestamp_and_action. We need to clear the lsb to get the timestamp, but when writing the entry we encode both the timestamp and the event type as a single value (see the sketch after this list).
Bug: 352518093 Test: Manual testing Change-Id: Ibe72ad82daa37eb2f4760decbb207663836796c7
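A rough illustration of the timestamp_and_action packing in item 5, with made-up names; the real field layout in trace_profile.cc may differ:

    #include <cstdint>

    // The low bit carries the event type; clearing the low bit recovers
    // the timestamp. Assumes losing one bit of timestamp precision is
    // acceptable.
    constexpr uint64_t kMethodEntry = 0;
    constexpr uint64_t kMethodExit = 1;

    uint64_t PackTimestampAndAction(uint64_t timestamp_ns, uint64_t action) {
      return (timestamp_ns & ~1ULL) | action;
    }

    uint64_t UnpackTimestamp(uint64_t packed) { return packed & ~1ULL; }
    uint64_t UnpackAction(uint64_t packed) { return packed & 1ULL; }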
2025-02-20  Use a more concise binary format when dumping long running methods (Mythri Alle)
We used to use a human readable format to dump long running methods. To make the dumps more concise we are moving to a binary format that cuts the size to about a third: a trace that was about 60K earlier is now about 20K. The format is very similar to the method trace format: it uses LEB encoding and encodes the incremental difference from the previous entry for both method ids and timestamps. Bug: 352518093 Test: Manual testing with the feature enabled and dumping to a file and also to an ANR report. Change-Id: Ief407dc7dc6f335b0d5add2a1e0f439ad01e318e
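The delta-plus-LEB idea can be sketched roughly as follows; this mirrors only the general technique, not the exact field order or headers of the v2 trace format, and the helper names are made up:

    #include <cstdint>
    #include <vector>

    // Signed LEB128 for the method-id delta (it can be negative),
    // unsigned LEB128 for the monotonically increasing timestamp delta.
    // Small deltas encode into one or two bytes.
    void EncodeSignedLeb128(int64_t value, std::vector<uint8_t>* out) {
      bool more = true;
      while (more) {
        uint8_t byte = value & 0x7f;
        value >>= 7;  // arithmetic shift preserves the sign
        more = !((value == 0 && (byte & 0x40) == 0) ||
                 (value == -1 && (byte & 0x40) != 0));
        if (more) byte |= 0x80;
        out->push_back(byte);
      }
    }

    void EncodeUnsignedLeb128(uint64_t value, std::vector<uint8_t>* out) {
      do {
        uint8_t byte = value & 0x7f;
        value >>= 7;
        if (value != 0) byte |= 0x80;
        out->push_back(byte);
      } while (value != 0);
    }

    void AppendRecord(uint64_t method_id, uint64_t timestamp,
                      uint64_t* prev_method_id, uint64_t* prev_timestamp,
                      std::vector<uint8_t>* out) {
      EncodeSignedLeb128(static_cast<int64_t>(method_id - *prev_method_id), out);
      EncodeUnsignedLeb128(timestamp - *prev_timestamp, out);
      *prev_method_id = method_id;
      *prev_timestamp = timestamp;
    }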
2025-02-18  Fix lock level violations when dumping low overhead traces (Mythri Alle)
There were two lock level violations in the code that dumps low overhead traces:
1. We should not wait for checkpoints to finish while holding the trace_lock_. This CL fixes it by waiting for the checkpoints after releasing the trace_lock_. Since we release trace_lock_ while waiting, we need additional synchronization to make sure trace_data_ isn't deleted while checkpoints are still running. This is done using a ConditionVariable: if a trace dump is in progress, we wait on it before deleting the trace_data_ (see the sketch after this list).
2. We should not enter a ScopedObjectAccess (which acquires the mutator lock) while holding the bottom-level trace_data_lock_. We now take a snapshot of the traced_methods and traced_threads while holding the trace_data_lock_ and then dump thread and method information outside the lock. Dumping methods requires the mutator lock to get information about the method.
Bug: 352518093 Test: Manual testing with the feature enabled and kDebugLocking enabled. Change-Id: I42a61f4d43eb985a93bc7884c09f771a5fbcd8f2
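A simplified sketch of the synchronization in point 1, using standard-library primitives instead of ART's Mutex and ConditionVariable; the names and structure are illustrative only:

    #include <condition_variable>
    #include <mutex>

    struct TraceState {
      std::mutex trace_lock;                 // stands in for trace_lock_
      std::condition_variable dump_done_cv;
      bool dump_in_progress = false;
      void* trace_data = nullptr;            // stands in for trace_data_
    };

    // The thread that wants to delete trace_data_ waits until no dump
    // (checkpoint) is still using it.
    void WaitForDumpThenDelete(TraceState* state) {
      std::unique_lock<std::mutex> lock(state->trace_lock);
      state->dump_done_cv.wait(lock, [state] { return !state->dump_in_progress; });
      state->trace_data = nullptr;  // safe to release now
    }

    // Called by the dumping path once all checkpoints have finished.
    void FinishDump(TraceState* state) {
      {
        std::lock_guard<std::mutex> lock(state->trace_lock);
        state->dump_in_progress = false;
      }
      state->dump_done_cv.notify_all();
    }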
2025-02-18  Don't use suspend all scope when dumping low overhead traces (Mythri Alle)
We used to use a SuspendAll scope when dumping the low overhead traces for all methods. It is better to use checkpoints instead of a suspend-all scope: the data being dumped is reasonably large, and suspending all threads for that long could cause timeouts and ANRs. Bug: 352518093 Test: Manual testing Change-Id: I42af633f776a56e162bafaa14b25b881697b7362
2025-02-18  Update the implementation of low overhead tracing start (Mythri Alle)
This CL changes the implementation to:
1. Allow concurrent starts for long running methods. Long running method traces run for a specified duration, and we allow another trace to start while the current one is in progress. One use case for long running method traces is to provide better information for analysing ANR reports; since there can be overlapping events that we are interested in, it is important to allow concurrent starts. When a start is requested while a trace is in progress, we just update the end time of the trace to cover both requests (see the sketch after this list). The request that started later also sees the history from the earlier start.
2. No longer wait for the checkpoints to finish when starting the trace. It is not required to wait for them. The checkpoints are run in order, so even if a start-stop-start sequence happened and a thread hasn't run any of them, it will run them in order whenever it hits a suspend point. There are two things to note here:
   1. The trace might start at slightly different times depending on when the checkpoints are run, but that is okay; that is the earliest time we can start them anyway.
   2. The RunCheckpoint happens inside trace_lock_. RunCheckpoint runs the checkpoints of the suspended threads before returning, so it might take a little longer to finish, but that is also acceptable since trace_lock_ is only used when a thread is exiting and in other cases that are not performance critical.
Bug: 352518093 Test: Manual testing using the maps app Change-Id: Id4f415b9393f5f21aa7828a632bf023583a45c25
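A hypothetical illustration of the end-time extension in point 1; the names are invented and the real bookkeeping happens under trace_lock_:

    #include <algorithm>
    #include <cstdint>

    struct LongRunningTrace {
      bool active = false;
      uint64_t end_time_ns = 0;
    };

    // If a long running trace is already active, a new start request only
    // pushes the end time out far enough to cover both requests; the
    // later request also sees history from the earlier start.
    void RequestStart(LongRunningTrace* trace, uint64_t now_ns, uint64_t duration_ns) {
      uint64_t requested_end = now_ns + duration_ns;
      if (trace->active) {
        trace->end_time_ns = std::max(trace->end_time_ns, requested_end);
        return;
      }
      trace->active = true;
      trace->end_time_ns = requested_end;
    }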
2025-02-14  Initialize the buffer for long running method traces (Mythri Alle)
When allocating a buffer for long running method traces, add a placeholder exit event. This avoids an additional check in the method exit stubs to verify that the previous entry is valid. Bug: 352518093 Test: Manual testing with am profile Change-Id: Iba18cc9d4769cb9aebe12c389a5c017629cbdbf2
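A small illustration of the idea, with an invented sentinel value and an assumed placement; the real encoding and position of the placeholder differ:

    #include <cstdint>
    #include <vector>

    // Seeding a freshly allocated per-thread buffer with a placeholder
    // exit entry means the method-exit fast path never has to test
    // whether a "previous entry" exists before reading it.
    constexpr uint64_t kPlaceholderExitEvent = ~0ULL;  // made-up sentinel

    std::vector<uint64_t> AllocateTraceBuffer(size_t num_entries) {
      // Assumes num_entries > 0; entries are later filled from the end.
      std::vector<uint64_t> buffer(num_entries, 0);
      buffer.back() = kPlaceholderExitEvent;
      return buffer;
    }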
2025-02-11  Revert^2 "Use nanoseconds for v2 method tracing" (Mythri Alle)
This reverts commit e8ea3a17d66636e6cc6a395b967e22aac8761e67. Reason for revert: Upload after a potential fix for the failures. We used to use 32-bit entries for storing thread CPU time. With nanosecond precision this overflows in about 4 seconds, and on the gcstress configuration it was potentially overflowing, causing the failures. The fix now uses 64-bit entries for thread CPU time. Wall clock time was already using 64-bit entries. Change-Id: I819fe45e0a465e5a5b037248a4cd7107acb5130d
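The four-second figure follows directly from the counter width; a quick check:

    #include <cstdio>

    // A 32-bit counter of nanoseconds wraps after 2^32 ns, i.e. a little
    // over four seconds, while a 64-bit counter is effectively unbounded
    // for this use case.
    int main() {
      constexpr double kWrapSeconds = 4294967296.0 / 1e9;  // 2^32 ns in seconds
      printf("32-bit ns counter wraps after %.2f seconds\n", kWrapSeconds);
      return 0;
    }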
2025-01-24  Revert "Use nanoseconds for v2 method tracing" (Mythri Alle)
This reverts commit 1a9a8abf19e7641149fcaed3971bf0aa99f4dfb8. Reason for revert: causes flakes on art-jit-on-first-use-gcstress Change-Id: I38161f65e2fc584e5e6221f85982035fb66c13ae
2025-01-24  Use nanoseconds for v2 method tracing (Mythri Alle)
Update the precision of timestamps to nanoseconds and use nanosecond precision in method trace format v2. For format v1 we keep microsecond precision for backwards compatibility. Bug: 259258187 Test: art/test.py -t 2246 Change-Id: I099e21f5d143c5444193e7886b5281b32f87d5d3
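Illustrative only, with made-up names; the real version constants and clock plumbing in ART differ:

    #include <cstdint>

    // v2 traces keep nanosecond precision while v1 output is scaled down
    // to microseconds for backwards compatibility.
    uint64_t ScaleTimestampForFormat(uint64_t timestamp_ns, int trace_format_version) {
      return trace_format_version >= 2 ? timestamp_ns : timestamp_ns / 1000;
    }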
2025-01-23  Avoid using trace_data_lock_ in TraceStartCheckpoint (Mythri Alle)
We used to record the methods currently on a thread's stack into a string in trace_data_. This required a lock to synchronize across threads. This CL changes that by recording the information into the per-thread buffer, avoiding the need for locks. Since we record them in the per-thread buffer we have fewer free slots in the buffer, but we don't expect this to be a big problem. Bug: 352518093 Test: art/test.py Change-Id: Ia9960456a17cfabf5ff6680925f4243b8bb892fb
2025-01-23  Don't use SuspendAllScope for starting and dumping low overhead trace (Mythri Alle)
We do a significant amount of work when starting / dumping low overhead traces. It is not good to suspend all threads for such a long time. This CL changes the implementation to use checkpoints and wait for all the threads to run the checkpoint before proceeding. Bug: 352518093 Test: art/test.py Change-Id: I7464e89c1d9fb12c1e6aad6cc3ca2d3a35a27da7
2025-01-21  Add support for collecting long running methods (Mythri Alle)
This CL extends low overhead tracing to support collecting long running methods. The feature is disabled by default and is implemented behind the always_enable_profiling_code flag. Turning on the flag enables the support but doesn't start tracing; tracing has to be started via the StartTraceLongRunningMethods interface in VMDebug. If a trace is running, the collected data is included in the data generated by the SIGQUIT handlers, which helps when analyzing ANR reports. The trace can also be dumped using the API provided in VMDebug. Bug: 352518093 Test: art/test.py Change-Id: I407560c45cd31bfa06906bb2286c455eef7ceec8
2024-10-17  Add method and thread infos when dumping low-overhead traces (Mythri Alle)
We missed adding information about methods and threads when generating low-overhead traces. This CL collects that information and dumps it at the end of the trace. It also does a minor optimization when encoding method exits by encoding either 0 or 1, to reduce the number of bytes needed. Bug: 352518093 Test: Manual testing Change-Id: I62adcbc0738480f5666bb41089e3a5bc7d93d126
2024-10-14  A few fixes for low overhead method tracing (Mythri Alle)
There are a few fixes required for low overhead method tracing:
1. Release the allocated buffer when a thread is being destroyed.
2. Allocate a new buffer on thread creation if tracing is in progress.
3. Flush the buffer and close the file once iterating over all threads is done.
4. Iterate trace buffer entries from the end of the buffer. We store entries starting from the end, so flush them in the same order (see the sketch after this list).
Bug: 352518093 Test: Manual testing Change-Id: Icc7a58902a2bbc96efc98b8635ba1e04a62a372b
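A small sketch of the ordering in point 4; the names and the exact index convention are assumptions, not the trace_profile.cc code:

    #include <cstddef>
    #include <cstdint>
    #include <cstdio>

    // Assumed layout: entries are written starting at the last slot, so
    // curr_index marks the most recent entry and [curr_index, buffer_size)
    // holds valid data. Flushing walks from the end of the buffer down to
    // curr_index so the oldest entry is emitted first.
    void FlushBuffer(const uint64_t* buffer, size_t buffer_size, size_t curr_index) {
      for (size_t i = buffer_size; i > curr_index; --i) {
        printf("%llu\n", static_cast<unsigned long long>(buffer[i - 1]));
      }
    }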
2024-09-26  Release trace buffer allocated by low overhead tracing on thread exit (Mythri Alle)
If a thread is exiting, we should clean up the trace buffers that were allocated. For now, we just release the allocated buffer. In the future we may want to cache it so the data can still be dumped. Bug: 352518093 Test: Manual testing - this feature is disabled by default Change-Id: I302298491317ebe05b95ab36f565074d24bc2332
2024-09-11  Add an interface to collect on-demand ART traces (Mythri Alle)
This CL adds an implementation of an interface to collect dex method traces on request. This is different from regular method tracing: it doesn't provide precise traces and uses a circular buffer, so it only contains a fixed number of the latest entries, based on the size of the buffer. Bug: 352518093 Test: art/test.py Change-Id: If3e7935e1e99eabeef0dc448d2ba3541d0da650d
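A minimal ring-buffer sketch of the fixed-capacity behaviour described above; it illustrates the technique only and is not the per-thread buffer used by trace_profile.cc:

    #include <cstdint>
    #include <vector>

    // Once the buffer is full, new entries overwrite the oldest ones, so
    // only the latest `capacity` entries are retained. Assumes capacity > 0.
    class CircularTraceBuffer {
     public:
      explicit CircularTraceBuffer(size_t capacity)
          : entries_(capacity, 0), next_(0), count_(0) {}

      void Record(uint64_t entry) {
        entries_[next_] = entry;
        next_ = (next_ + 1) % entries_.size();
        if (count_ < entries_.size()) ++count_;
      }

      // Returns the retained entries, oldest first.
      std::vector<uint64_t> Snapshot() const {
        std::vector<uint64_t> out;
        size_t start = (next_ + entries_.size() - count_) % entries_.size();
        for (size_t i = 0; i < count_; ++i) {
          out.push_back(entries_[(start + i) % entries_.size()]);
        }
        return out;
      }

     private:
      std::vector<uint64_t> entries_;
      size_t next_;   // next slot to write
      size_t count_;  // number of valid entries
    };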