3/26/2024
I built a real-time network usage tracker that uses eBPF (Extended Berkeley Packet Filter) to measure send and receive bandwidth per process. It supports tracking per remote IP address, protocol filtering, and historical usage storage.
Why eBPF?
eBPF allows us to run sandboxed programs in a privileged context such as the operating system kernel. It is efficient because the program executes right where the network stack processes data, so packets never have to be copied to user space for analysis.
Features
- Real-time monitoring: Track network bandwidth usage per process in real time.
- Per-IP tracking: Monitor traffic going to or coming from specific remote IP addresses.
- Protocol filtering: Report TCP and UDP traffic statistics separately.
- Historical storage: Store usage data in an SQLite database for long-term reporting.
- Web UI & CLI: A simple Flask web interface for visualization, plus a command-line tool for quick stats directly in the terminal.
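A per-process bandwidth figure is typically derived by sampling cumulative byte counters at a fixed interval and dividing the difference by the elapsed time. Here is a minimal sketch of that arithmetic, assuming counters keyed by PID; the function names `bandwidth_rates` and `human_rate` are hypothetical and not taken from the project.

```python
def bandwidth_rates(prev, curr, interval_s):
    """Return {pid: (tx_bytes_per_s, rx_bytes_per_s)} from two counter snapshots.

    prev and curr map pid -> (total_tx_bytes, total_rx_bytes).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    rates = {}
    for pid, (tx, rx) in curr.items():
        # A PID missing from the previous snapshot is treated as starting at zero.
        prev_tx, prev_rx = prev.get(pid, (0, 0))
        # Clamp at zero in case the counters were reset between samples.
        rates[pid] = (max(tx - prev_tx, 0) / interval_s,
                      max(rx - prev_rx, 0) / interval_s)
    return rates


def human_rate(bytes_per_s):
    """Format a rate using binary units (B/s, KiB/s, MiB/s, GiB/s)."""
    value = float(bytes_per_s)
    for unit in ("B/s", "KiB/s", "MiB/s", "GiB/s"):
        if value < 1024 or unit == "GiB/s":
            return f"{value:.1f} {unit}"
        value /= 1024
```

A CLI could call `bandwidth_rates` once per refresh and print each value through `human_rate`.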
Architecture
The project is split into Kernel Space and User Space components:
Kernel Space (eBPF)
We attach eBPF probes directly to kernel network functions:
- tcp_sendmsg and tcp_recvmsg for TCP tracking.
- udp_sendmsg and udp_recvmsg for UDP tracking.
For each network operation, the eBPF program records the PID, process name, remote IP, protocol, bytes transferred, and a timestamp. These records are aggregated in eBPF maps inside the kernel.
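To make the bookkeeping concrete, here is a rough model of it in plain Python. This is only an illustration: the real probe would be written in C and loaded through BCC, and the names `FlowKey`, `FlowTable`, and `ipv4_to_str`, as well as the exact field layout, are my assumptions rather than the project's actual definitions.

```python
import socket
import sys
from collections import defaultdict
from typing import NamedTuple

PROTO_TCP = 6   # IANA protocol numbers
PROTO_UDP = 17


class FlowKey(NamedTuple):
    """Hypothetical map key: one entry per process, remote address, and protocol."""
    pid: int
    comm: str        # process name
    remote_ip: int   # IPv4 address as the raw u32 read from kernel memory
    proto: int


def ipv4_to_str(raw_u32):
    """Convert a network-order IPv4 address, read as a host u32, to dotted form."""
    return socket.inet_ntop(socket.AF_INET, raw_u32.to_bytes(4, sys.byteorder))


class FlowTable:
    """Stand-in for an eBPF hash map: key -> [tx_bytes, rx_bytes, last_seen_ns]."""

    def __init__(self):
        self._entries = defaultdict(lambda: [0, 0, 0])

    def record(self, key, nbytes, is_send, ts_ns):
        # Add to the send or receive counter and remember the latest timestamp.
        entry = self._entries[key]
        entry[0 if is_send else 1] += nbytes
        entry[2] = ts_ns

    def items(self):
        return self._entries.items()
```

Keying on the full tuple keeps per-IP and per-protocol detail available, and per-process totals can still be produced later by summing over keys that share a PID.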
User Space (Python/BCC)
A Python program using the BCC toolkit periodically reads these eBPF maps, aggregates the data by process, and stores it in SQLite. A Flask API then exposes this data to the visualization layer.
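The aggregate-and-store step could look roughly like the sketch below. It uses only the standard library and takes already-decoded flow tuples as input, so the BCC map read is left out. The `net_usage` table, its columns, and the function names are hypothetical, not the project's actual schema.

```python
import sqlite3
import time
from collections import defaultdict

SCHEMA = """
CREATE TABLE IF NOT EXISTS net_usage (
    ts        INTEGER NOT NULL,   -- sample time, Unix seconds
    pid       INTEGER NOT NULL,
    comm      TEXT    NOT NULL,
    proto     TEXT    NOT NULL,
    tx_bytes  INTEGER NOT NULL,
    rx_bytes  INTEGER NOT NULL
)
"""


def aggregate_by_process(flows):
    """Sum per-flow counters into per-(pid, comm, proto) totals.

    flows is an iterable of (pid, comm, remote_ip, proto, tx_bytes, rx_bytes).
    """
    totals = defaultdict(lambda: [0, 0])
    for pid, comm, _remote_ip, proto, tx, rx in flows:
        totals[(pid, comm, proto)][0] += tx
        totals[(pid, comm, proto)][1] += rx
    return totals


def store_sample(conn, flows, ts=None):
    """Aggregate one polling interval's flows and append them to the table."""
    ts = int(time.time()) if ts is None else ts
    rows = [(ts, pid, comm, proto, tx, rx)
            for (pid, comm, proto), (tx, rx) in aggregate_by_process(flows).items()]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO net_usage VALUES (?, ?, ?, ?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")  # a file path would be used for real history
conn.execute(SCHEMA)
```

An API endpoint or report can then query the history directly, for example with `SELECT comm, SUM(tx_bytes), SUM(rx_bytes) FROM net_usage GROUP BY comm`.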
Performance and Security Considerations
Because the heavy lifting (aggregation and filtering) happens in kernel space, only aggregated counters cross into user space rather than individual packets. This keeps the performance impact minimal, even on high-traffic systems.
From a security standpoint:
- The tracker requires root privileges to load the eBPF programs into the kernel.
- It only tracks traffic generated by local processes, not forwarded traffic.
- All historical data is kept completely local via SQLite.
You can check out the full source code and installation instructions on the GitHub Repository!