Wilhel

Debugging and Performance Analysis

Debugging Code#

Print Debugging and Logging#

Compared with ad hoc print statements, logging has several advantages:

  • You can write logs to files, sockets, or even remote servers instead of just standard output;
  • Logs can support severity levels (such as INFO, DEBUG, WARN, ERROR, etc.), allowing you to filter logs as needed;
  • For newly discovered issues, your logs may already contain enough information to help you locate the problem.
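As a minimal sketch of severity-based filtering, the following uses Python's standard logging module. The logger name "demo" and the log messages are made up for illustration, and an in-memory stream stands in for a file or socket.

```python
import io
import logging

# Hypothetical logger name; real programs usually pass __name__.
logger = logging.getLogger("demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the records out of the root logger

# An in-memory stream stands in for a file or socket here;
# logging.FileHandler or logging.handlers.SocketHandler attach the same way.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setLevel(logging.WARNING)  # only WARNING and above reach this handler
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
logger.addHandler(handler)

logger.debug("cache miss for key %s", "user:42")  # filtered out by the handler
logger.warning("retrying request (%d/3)", 1)
logger.error("giving up after 3 attempts")

print(stream.getvalue(), end="")
```

Lowering the handler level to logging.DEBUG would make the first message appear as well, without touching the call sites.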

Third-Party Logging Systems#

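  • On Unix systems, programs conventionally write their logs under /var/log, and dmesg reads the kernel log;
  • Linux systems that use systemd store logs under /var/log/journal, which journalctl displays;
  • macOS keeps /var/log/system.log, and log show displays the system logs;
  • logger writes entries to the system log:

    logger "Hello Logs"
    # On macOS
    log show --last 1m | grep Hello
    # On Linux
    journalctl --since "1m ago" | grep Hello

  • lnav is a log viewer that offers a better way to present and browse logs.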

Debugger#

ipdb is an enhanced version of pdb, the Python debugger. Both support the following commands:

  • l(ist) - Displays 11 lines around the current line or continues the previous display;
  • s(tep) - Executes the current line and stops at the first possible place;
  • n(ext) - Continues execution until the next line in the current function is reached or the function returns;
  • b(reak) - Sets a breakpoint (based on the passed parameters);
  • p(rint) - Evaluates the expression in the current context and prints the result. There is also a command pp that uses pprint for printing;
  • r(eturn) - Continues execution until the current function returns;
  • q(uit) - Exits the debugger.
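The commands above can be tried without an interactive terminal. The sketch below drives the standard pdb module with scripted input, so the session is reproducible. The function running_total and the command sequence are hypothetical examples, not part of any tutorial.

```python
import io
import pdb

def running_total(values):  # hypothetical function under inspection
    total = 0
    for v in values:
        total += v
    return total

# Scripted debugger input: step over three lines, print a variable, continue.
commands = io.StringIO("next\nnext\nnext\np total\ncontinue\n")
transcript = io.StringIO()

# readrc=False skips any personal .pdbrc so the session is predictable.
debugger = pdb.Pdb(stdin=commands, stdout=transcript, readrc=False)
result = debugger.runcall(running_total, [1, 2, 3])

print(transcript.getvalue())
print("result:", result)
```

In everyday use you would instead place breakpoint() in the code or run python -m pdb script.py and type the same commands at the (Pdb) prompt.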

For lower-level programming languages, you may want to look into gdb (and its improved version pwndbg) and lldb.

They are optimized for debugging C-like languages, allowing you to explore any process and its machine state: registers, stack, program counter, etc.

Specialized Tools#

To trace the system calls a program makes while it runs:

  • Linux: strace;
  • macOS and BSD: dtrace, which the dtruss wrapper exposes through an interface similar to strace.

    # On Linux
    sudo strace -e lstat ls -l > /dev/null
    # On macOS
    sudo dtruss -t lstat64_extended ls -l > /dev/null

For network packet analysis, use tcpdump or Wireshark.

Chrome/Firefox Developer Tools

  • Source code - View the HTML/CSS/JS source code of any site;
  • Live modification of HTML, CSS, JS code - Modify the content, style, and behavior of a website for testing (which is also why a screenshot of a web page is not reliable evidence);
  • JavaScript shell - Execute commands in the JS REPL;
  • Network - Analyze the timeline of requests;
  • Storage - View Cookies and local application storage.

Static Analysis#

Static analysis tools take a program's source code as input and reason about its correctness based on coding rules.

Python: pyflakes, mypy

Shell scripts: shellcheck

Code linters go further and perform style checks or security checks.

Vim: ale, syntastic

Python: pylint and pep8 for style checks, bandit for security checks

For developers of other languages, static analysis tools can refer to this list: Awesome Static Analysis (you might be interested in the Writing section). For linters, you can refer to this list: Awesome Linters.
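To illustrate the idea of reasoning about code without running it, here is a minimal sketch built on Python's standard ast module. It implements a single made-up rule (flag a top-level function whose name is reassigned later) and is not how pyflakes or mypy work internally. SOURCE and find_rebound_functions are hypothetical names.

```python
import ast

# Hypothetical source under inspection; it is parsed, never executed.
SOURCE = '''
def foo():
    return 42

for foo in range(3):
    print(foo)
'''

def find_rebound_functions(source):
    """Return (name, line) pairs where a top-level function name is reassigned."""
    defined = set()
    problems = []
    for statement in ast.parse(source).body:
        if isinstance(statement, ast.FunctionDef):
            defined.add(statement.name)
            continue
        # Look for assignments (Store context) to an already defined function name.
        for node in ast.walk(statement):
            if (isinstance(node, ast.Name)
                    and isinstance(node.ctx, ast.Store)
                    and node.id in defined):
                problems.append((node.id, node.lineno))
    return problems

print(find_rebound_functions(SOURCE))
```

Real tools apply many such rules and track scopes, imports, and types, but the principle is the same: the program text is the input.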

Performance Analysis#

Timing#

  • Real time - The wall-clock time elapsed from the start to the end of the program, including time taken by other processes and time spent blocked (e.g., waiting for I/O or the network);
  • User - The time spent by the CPU executing user code;
  • Sys - The time spent by the CPU executing system kernel code.
$ time curl https://missing.csail.mit.edu &> /dev/null
real    0m2.561s
user    0m0.015s
sys     0m0.012s
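The same distinction can be observed inside a Python program. In this sketch, time.perf_counter measures elapsed wall-clock time and time.process_time measures user plus system CPU time of the current process. The helper name measure is hypothetical.

```python
import time

def measure(func):
    """Return (wall_seconds, cpu_seconds) spent while running func()."""
    wall_start = time.perf_counter()   # wall-clock time, like "real"
    cpu_start = time.process_time()    # user + sys CPU time of this process
    func()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu

# Sleeping blocks the process without using the CPU, much like waiting on the network.
wall, cpu = measure(lambda: time.sleep(0.2))
print(f"real {wall:.3f}s  user+sys {cpu:.3f}s")
```

As with the curl example, the elapsed time is much larger than the CPU time because the process spends most of it waiting.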

Performance Analysis Tools (profilers)#

CPU#

There are two types of CPU performance analysis tools: tracing profilers and sampling profilers. Tracing profilers record every function call of the program, while sampling profilers periodically monitor (usually every millisecond) your program and record the program stack.
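As a small illustration of a tracing profiler, the sketch below uses cProfile from the Python standard library, which records every function call. The functions slow_square_sum and work are hypothetical workloads.

```python
import cProfile
import io
import pstats

def slow_square_sum(n):  # hypothetical workload
    return sum(i * i for i in range(n))

def work():
    return [slow_square_sum(20_000) for _ in range(5)]

profiler = cProfile.Profile()
profiler.runcall(work)  # every call made by work() is recorded

# Format the ten entries with the highest cumulative time into a string.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

Because every call is instrumented, tracing profilers add noticeable overhead. Sampling profilers trade some precision for a much smaller impact on the running program.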

Memory#

In languages like C or C++, a memory leak means your program never releases memory that it no longer needs. To track down memory-related bugs, we can use tools like Valgrind to identify leaks.

For languages like Python that have garbage collection, memory profilers are still useful because an object will not be collected as long as there are references still pointing to it.
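The effect of lingering references can be seen with tracemalloc from the Python standard library. This is a sketch under simple assumptions: cache and handle_request are hypothetical names, and the exact byte counts depend on the interpreter.

```python
import tracemalloc

cache = []  # a long-lived container that keeps references alive

def handle_request(payload_size):  # hypothetical function
    data = bytearray(payload_size)
    cache.append(data)  # the "leak": data stays reachable through cache
    return len(data)

tracemalloc.start()
for _ in range(100):
    handle_request(10_000)
held, _peak = tracemalloc.get_traced_memory()

cache.clear()  # dropping the references lets the memory be reclaimed
released, _peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"while referenced: {held} bytes, after clearing: {released} bytes")
```

tracemalloc.take_snapshot() can additionally attribute allocations to source lines, which helps locate the container that is holding on to objects.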

Event Analysis#

As with strace for debugging, when profiling you may want to ignore the specifics of the code you are running and treat it as a black box. The perf command abstracts away CPU differences; it does not report time or memory consumption but instead reports system events related to your program.

For example, perf can report poor cache locality, a large number of page faults, or livelocks. Here’s a brief introduction to common commands:

  • perf list - Lists events that can be tracked by perf;
  • perf stat COMMAND ARG1 ARG2 - Gets counts of different events related to a process or command;
  • perf record COMMAND ARG1 ARG2 - Records sampling information of command execution and stores statistics in perf.data;
  • perf report - Formats and prints data from perf.data.

Visualization#

When using profilers to analyze real programs, the output will contain a lot of information due to the complexity of the software. Humans are visual creatures and are not very good at reading large amounts of text. Therefore, many tools provide the ability to visualize profiler output.

For sampling profilers, a common way to display CPU analysis data is the flame graph, which shows the function call relationships on the Y-axis and the proportion of time spent on the X-axis. Flame graphs are also interactive, allowing you to drill down into a specific part of the program and view its stack trace.

Call graphs and control flow graphs can show the relationships between subroutines, treating functions as nodes and function calls as edges. When used together with profiler information (such as call counts, time spent, etc.), call graphs become very useful for analyzing program flow. In Python, you can use pycallgraph to generate these images.
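To make the "functions as nodes, calls as edges" idea concrete, here is a minimal sketch that counts call edges with sys.setprofile from the standard library. It is not how pycallgraph is implemented. The names record_call_graph, fib, and main are hypothetical.

```python
import sys
from collections import Counter

def record_call_graph(func, *args):
    """Run func and count (caller, callee) edges between Python functions."""
    edges = Counter()

    def profiler(frame, event, arg):
        # "call" fires for Python-level calls; calls into C functions are ignored.
        if event == "call" and frame.f_back is not None:
            caller = frame.f_back.f_code.co_name
            callee = frame.f_code.co_name
            edges[(caller, callee)] += 1

    sys.setprofile(profiler)
    try:
        func(*args)
    finally:
        sys.setprofile(None)
    return edges

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def main():
    return fib(5)

edges = record_call_graph(main)
for (caller, callee), count in sorted(edges.items()):
    print(f"{caller} -> {callee}: {count}")
```

Each printed line corresponds to one edge of the call graph, annotated with its call count. A graphing tool would draw the same data as nodes and arrows.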

Resource Monitoring#

  • General Monitoring - The most popular tool is htop, an improved version of top that displays various statistics for the currently running processes. htop has many options and keybindings; useful ones include <F6> to sort processes, t to show the tree hierarchy, and H to toggle threads. You might also want to check out glances, which has a similar implementation but a better user interface. If you need aggregate measurements across all processes, dstat is another handy tool that computes real-time metrics for many subsystems, such as I/O, networking, CPU utilization, and context switches;
  • I/O Operations - iotop can display real-time I/O usage information and conveniently check if a process is performing a lot of disk read/write operations;
  • Disk Usage - df displays metrics for each partition, while du (disk usage) displays the disk usage of each file in the current directory. The -h option makes both commands print sizes in a human-readable format; ncdu is a more interactive version of du that lets you navigate folders and delete files and folders as you go;
  • Memory Usage - free displays the total amount of free and used memory in the system. Memory is also displayed in tools like htop;
  • Open Files - lsof can list information about files opened by processes. This command is very useful when we need to see which process has opened a specific file;
  • Network Connections and Configuration - ss lets you monitor statistics for incoming and outgoing network packets as well as interface statistics. A common use case for ss is to find out which process is using a given port. To display routing, network devices, and interface information, you can use the ip command. Note that netstat and ifconfig have been deprecated in favor of these tools;
  • Network Usage - nethogs and iftop are excellent interactive command-line tools for monitoring network usage.

If you want to test these tools, you can use the stress command to artificially increase the load on the system.
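The disk usage figures reported by df and du can be approximated from Python with the standard library. This is a rough sketch: directory_size is a hypothetical helper that sums apparent file sizes, whereas du reports allocated disk blocks, so the numbers may differ.

```python
import os
import shutil
import tempfile

def directory_size(path):
    """Sum the apparent sizes of regular files under path (a rough `du -s`)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):  # do not follow symbolic links
                total += os.path.getsize(full)
    return total

# Build a small throwaway directory tree so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "a.bin"), "wb") as f:
        f.write(b"x" * 4096)
    os.mkdir(os.path.join(tmp, "sub"))
    with open(os.path.join(tmp, "sub", "b.bin"), "wb") as f:
        f.write(b"y" * 1024)

    size = directory_size(tmp)      # per-directory view, like du
    usage = shutil.disk_usage(tmp)  # per-filesystem view, like df

print(f"directory: {size} bytes")
print(f"filesystem: {usage.used} used of {usage.total} bytes, {usage.free} free")
```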

Dedicated Tools#

Sometimes, you just need to benchmark a black box program and evaluate software choices based on that. Command-line tools like hyperfine can help you quickly benchmark.
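For code that lives inside a Python program, a comparable quick benchmark can be sketched with the standard timeit module. Unlike hyperfine, which times whole commands, this measures functions in-process. The two functions compared here are hypothetical.

```python
import timeit

def concat_join(n=1000):  # hypothetical candidate A
    return "".join(str(i) for i in range(n))

def concat_plus(n=1000):  # hypothetical candidate B
    s = ""
    for i in range(n):
        s += str(i)
    return s

RUNS = 200
results = {}
for func in (concat_join, concat_plus):
    # Repeat the measurement and keep the best run to reduce noise.
    timings = timeit.repeat(func, number=RUNS, repeat=5)
    results[func.__name__] = min(timings) / RUNS

for name, seconds in results.items():
    print(f"{name}: {seconds * 1e6:.1f} microseconds per call")
```

Taking the minimum over several repetitions is a common way to reduce interference from other processes. The numbers will vary from machine to machine.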

As with debugging, browsers ship with excellent performance analysis tools for profiling page loads, letting us figure out where time is being spent (loading, rendering, scripting, etc.). More information is available in the Firefox and Chrome developer tools documentation.

After-Class Exercises#

Debugging#

  1. Use the journalctl command on Linux or the log show command on macOS to obtain the login information of the superuser and the commands executed in the last day. If you cannot find relevant information, you can execute some harmless commands, such as sudo ls, and then check again.

  2. Study this pdb practical tutorial and familiarize yourself with the relevant commands. For more in-depth information, you can refer to this tutorial.

  3. Install shellcheck and try to check the following script. What problems does this code have? Please fix the relevant issues. Install a linter plugin in your editor so that it can automatically display relevant warning messages.

    #!/bin/sh
    ## Example: a typical script with several problems
    for f in $(ls *.m3u)
    do
      grep -qi hq.*mp3 $f \
        && echo -e 'Playlist $f contains a HQ file in mp3 format'
    done
    
  4. (Advanced) Please read Reverse Debugging and try to create a working example (using rr or RevPDB).

Performance Analysis#

  1. Here are some implementations of sorting algorithms. Use cProfile and line_profiler to compare the performance of insertion sort and quicksort. Where is the bottleneck of each algorithm? Then use memory_profiler to check memory consumption; why is insertion sort better? Next, look at the in-place version of quicksort. Bonus: Use perf to look at the cycle counts and cache hits and misses of each algorithm.

  2. Here is some Python code for calculating Fibonacci numbers, which defines a function for calculating each number:

    #!/usr/bin/env python
    def fib0(): return 0
    
    def fib1(): return 1
    
    s = """def fib{}(): return fib{}() + fib{}()"""
    
    if __name__ == '__main__':
    
        for n in range(2, 10):
            exec(s.format(n, n-1, n-2))
        # from functools import lru_cache
        # for n in range(10):
        #     exec("fib{} = lru_cache(1)(fib{})".format(n, n))
        print(eval("fib9()"))
    

    Copy the code into a file to make it an executable program. First, install pycallgraph and graphviz (if you can execute dot, it means GraphViz is installed). Use pycallgraph graphviz -- ./fib.py to execute the code and view the pycallgraph.png file. How many times was fib0 called? We can optimize it using memoization. Uncomment the commented part and regenerate the image. How many times was each fibN function called this time?

  3. We often find that a port we want to listen on is already taken by another process. Let's learn how to discover that process's PID. First, execute python -m http.server 4444 to start a minimal web server listening on port 4444. In another terminal, execute lsof | grep LISTEN to print all listening processes and their ports. Find the PID of the web server and then use kill <PID> to stop it.

  4. Limiting process resources is also a very useful technique. Execute stress -c 3 and visualize CPU consumption using htop. Now, execute taskset --cpu-list 0,2 stress -c 3 and visualize. Is stress using 3 CPUs? Why not? Read man taskset to find the answer. Bonus: Use cgroups to achieve the same operation, limiting the memory usage of stress -m.

  5. (Advanced) Execute the curl ipinfo.io command or make an HTTP request to get information about your IP. Open Wireshark and capture the requests and responses initiated by curl. (Hint: You can use http to filter and only display HTTP packets)
