# StreamLens - Comprehensive Development Guide

## Project Overview

Build a Python-based network traffic analysis tool called "StreamLens" that analyzes both PCAP files and live network streams. The tool specializes in telemetry and avionics protocols, with statistical timing analysis, outlier detection, and sigma-based flow prioritization. It has a modular architecture with both a text-based user interface (TUI) and a PySide6 GUI with interactive matplotlib signal visualization.

## Core Requirements

### Primary Functionality
- **PCAP Analysis**: Load and analyze packet capture files with file validation and info extraction
- **Live Capture**: Real-time network traffic monitoring with threading support
- **Flow Analysis**: Group packets by source-destination IP pairs with comprehensive statistics
- **Advanced Statistical Analysis**: Configurable sigma thresholds, running averages, coefficient of variation
- **Sigma-Based Outlier Detection**: Identify packets with timing deviations using configurable thresholds (default 3σ)
- **Flow Prioritization**: Automatically sort flows by largest sigma deviation for efficient analysis
- **Interactive TUI**: Three-panel interface with real-time updates and navigation
- **Modern GUI Interface**: Professional PySide6-based GUI with embedded matplotlib plots
- **Modular Architecture**: Clean separation of concerns with analyzers, models, protocols, TUI, GUI, and utilities

### Advanced Features
- **Specialized Protocol Support**: Chapter 10 (IRIG106), PTP (IEEE 1588), IENA (Airbus)
- **Frame Type Classification**: Hierarchical breakdown by protocol and frame types with per-type statistics
- **Timeline Visualization**: ASCII-based timing deviation charts with dynamic scaling
- **Real-time Statistics**: Running averages and outlier detection during live capture
- **Comprehensive Reporting**: Detailed outlier reports with sigma deviation calculations
- **High Jitter Detection**: Coefficient of variation analysis for identifying problematic flows
- **Configurable Analysis**: Adjustable outlier thresholds and analysis parameters
- **Chapter 10 Signal Visualization**: Real-time matplotlib-based signal plotting with TMATS integration
- **Interactive Signal Analysis**: Press 'v' in TUI to generate signal files, or use GUI for embedded interactive plots
- **Threading-Safe Visualization**: Proper Qt integration for GUI, file output for TUI to prevent segmentation faults
- **Cross-Platform GUI**: PySide6-based interface with file dialogs, progress bars, and embedded matplotlib widgets

## Architecture Overview

### Core Components

#### 1. Data Structures (`dataclasses`)
```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class FrameTypeStats:
    frame_type: str
    count: int
    total_bytes: int
    timestamps: List[float]
    frame_numbers: List[int]
    inter_arrival_times: List[float]
    avg_inter_arrival: float
    std_inter_arrival: float
    outlier_frames: List[int]
    outlier_details: List[Tuple[int, float]]  # (frame number, inter-arrival time)


@dataclass
class FlowStats:
    src_ip: str
    dst_ip: str
    frame_count: int
    timestamps: List[float]
    frame_numbers: List[int]
    inter_arrival_times: List[float]
    avg_inter_arrival: float
    std_inter_arrival: float
    outlier_frames: List[int]
    outlier_details: List[Tuple[int, float]]  # (frame number, inter-arrival time)
    total_bytes: int
    protocols: Set[str]
    detected_protocol_types: Set[str]
    frame_types: Dict[str, FrameTypeStats]
```

#### 2. Main Analysis Engine (`EthernetAnalyzer`)
- **Packet Processing**: Single-packet and batch processing capabilities
- **Flow Management**: Track unique source-destination pairs
- **Protocol Detection**: Multi-layer protocol identification
- **Statistical Calculation**: Mean, standard deviation, and outlier detection using a configurable sigma threshold (default 3σ)
- **Live Capture**: Threading support for real-time analysis

#### 3. Enhanced Frame Dissection System

##### Base Dissector Interface
```python
class EnhancedFrameDissector:
    def dissect_frame(self, packet: Packet, frame_num: int) -> Dict[str, Any]:
        """Dissect one frame; the result includes a 'layers' mapping."""
        ...
```

##### Specialized Protocol Dissectors

**Chapter 10 Dissector** (`Chapter10Dissector`):
- IRIG106-17 telemetry standard support
- Sync pattern detection (0xEB25)
- Header parsing: channel ID, data types, timing counters
- Data format support: Analog, PCM, Ethernet Format 0
- Container format handling (embedded Ch10 in other protocols)
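
As a rough illustration of the sync-pattern detection and header parsing listed above, the sketch below scans a payload for 0xEB25 and unpacks the leading header fields. The function name `find_ch10_header` and the returned keys are hypothetical, not taken from the StreamLens code. The field layout assumes the standard 24-byte IRIG 106 Chapter 10 packet header in little-endian byte order, and header checksum validation is omitted.

```python
import struct
from typing import Dict, Optional

CH10_SYNC = 0xEB25       # sync pattern named in the dissector description
CH10_HEADER_LEN = 24     # assumed length of the primary packet header


def find_ch10_header(payload: bytes) -> Optional[Dict[str, int]]:
    """Locate the first sync pattern and unpack the header that follows it."""
    sync_bytes = struct.pack("<H", CH10_SYNC)  # little-endian on the wire
    offset = payload.find(sync_bytes)
    if offset == -1 or offset + CH10_HEADER_LEN > len(payload):
        return None

    # Sync (2), channel ID (2), packet length (4), data length (4)
    _sync, channel_id, packet_length, data_length = struct.unpack_from(
        "<HHII", payload, offset
    )
    # Data type version, sequence number, packet flags, data type (1 byte each)
    version, sequence, flags, data_type = struct.unpack_from(
        "<BBBB", payload, offset + 12
    )
    # 48-bit relative time counter
    rtc = int.from_bytes(payload[offset + 16:offset + 22], "little")

    return {
        "offset": offset,
        "channel_id": channel_id,
        "packet_length": packet_length,
        "data_length": data_length,
        "data_type_version": version,
        "sequence_number": sequence,
        "packet_flags": flags,
        "data_type": data_type,
        "relative_time_counter": rtc,
    }
```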

**PTP Dissector** (`PTPDissector`):
- IEEE 1588-2019 Precision Time Protocol
- Message types: Sync, Delay_Req, Announce, etc.
- Timestamp extraction and correction field parsing
- Grandmaster clock quality analysis

**IENA Dissector** (`IENADissector`):
- Airbus Improved Ethernet Network Architecture
- Packet types: P-type, D-type, N-type, M-type, Q-type
- Parameter ID extraction and delay analysis
- Variable-length message parsing

#### 4. Chapter 10 Packet System (`chapter10_packet.py`)

##### Core Packet Class
```python
# Method signatures only; bodies are elided
class Chapter10Packet:
    def __init__(self, packet, original_frame_num: Optional[int] = None): ...
    def _parse_ch10_header(self) -> Optional[Dict]: ...
    def get_data_payload(self) -> Optional[bytes]: ...
    def decode_data(self, tmats_scaling_dict, tmats_content) -> Optional[DecodedData]: ...
```

##### Data Decoders
- **AnalogDecoder**: Multi-channel analog data with scaling
- **PCMDecoder**: Pulse Code Modulation format
- **AnalogTimeFormatDecoder**: Time-synchronized analog data with TMATS integration

##### TMATS Integration
- Scaling parameter extraction from TMATS content
- Channel configuration parsing
- Engineering unit conversion
- Gain/offset/full-scale calculations

#### 5. Text User Interface (`TUIInterface`)

##### Three-Panel Layout
```
┌─ FLOWS (Left 60%) ───────┬─ DETAILS (Right 40%) ──┐
│ IP flows + frame         │ Flow info + table      │
│ type breakdowns          │ Frame type details     │
├──────────────────────────┴────────────────────────┤
│ TIMING VISUALIZATION (Bottom 30%)                 │
│ ASCII timeline with deviation plotting            │
└───────────────────────────────────────────────────┘
```

##### Key Interface Features
- **Flow List**: Hierarchical display of flows and frame types
- **Detail Panel**:
  - Flow summary (packets, bytes, protocols)
  - Frame type table: `Type | #Pkts | Bytes | Avg ΔT | 2σ Out`
  - Timing statistics and threshold calculations
- **Timeline Panel**: ASCII deviation chart with scaling
- **Navigation**: Arrow keys, view switching (main/dissection)

## Technical Implementation Details

### Statistical Analysis Algorithm
```python
import statistics


def calculate_statistics(self, threshold_sigma: float = 3.0):
    # threshold_sigma mirrors --outlier-threshold (default 3.0)
    for flow in self.flows.values():
        # stdev() needs at least two inter-arrival samples
        if len(flow.inter_arrival_times) < 2:
            continue

        # Summarize the inter-arrival times
        flow.avg_inter_arrival = statistics.mean(flow.inter_arrival_times)
        flow.std_inter_arrival = statistics.stdev(flow.inter_arrival_times)

        # Detect outliers (more than threshold_sigma deviations above the mean)
        threshold = flow.avg_inter_arrival + (threshold_sigma * flow.std_inter_arrival)
        for i, inter_time in enumerate(flow.inter_arrival_times):
            if inter_time > threshold:
                # inter_arrival_times[i] is the gap that ends at packet i + 1
                frame_number = flow.frame_numbers[i + 1]
                flow.outlier_frames.append(frame_number)
                flow.outlier_details.append((frame_number, inter_time))
```

### Protocol Classification Hierarchy
1. **Specialized Protocols**: Chapter 10, PTP, IENA (highest priority)
2. **Port-based Detection**: DNS (53), HTTP (80), NTP (123), etc.
3. **Transport Layer**: TCP, UDP, ICMP, IGMP
4. **Fallback**: Generic classification
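
The priority order above could be sketched as a single lookup function. The name `classify_protocol`, its arguments, the lookup tables, and the returned labels are all hypothetical; only the ordering and the three example ports come from the list above.

```python
from typing import Optional, Set

# Hypothetical lookup tables; extend as needed
SPECIALIZED = {"chapter10": "Chapter 10", "ptp": "PTP", "iena": "IENA"}
WELL_KNOWN_PORTS = {53: "DNS", 80: "HTTP", 123: "NTP"}
TRANSPORTS = ("TCP", "UDP", "ICMP", "IGMP")


def classify_protocol(
    detected_layers: Set[str],
    transport: Optional[str],
    src_port: Optional[int] = None,
    dst_port: Optional[int] = None,
) -> str:
    """Return the highest-priority label that applies to a packet."""
    # 1. Specialized protocols win over everything else
    for layer, label in SPECIALIZED.items():
        if layer in detected_layers:
            return label

    # 2. Port-based detection (destination port checked first)
    for port in (dst_port, src_port):
        if port in WELL_KNOWN_PORTS:
            return WELL_KNOWN_PORTS[port]

    # 3. Transport layer
    if transport in TRANSPORTS:
        return transport

    # 4. Fallback
    return "Other"
```

For example, a UDP packet to port 53 with no specialized layer detected is labeled `DNS`, while the same packet carrying a detected `ptp` layer is labeled `PTP`.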

### Frame Type Classification Logic
```python
def _classify_frame_type(self, packet, dissection):
    layers = dissection.get('layers', {})

    # Chapter 10 takes precedence
    if 'chapter10' in layers:
        if self._is_tmats_frame(packet, layers['chapter10']):
            return 'TMATS'
        else:
            return 'CH10-Data'

    # PTP message types
    if 'ptp' in layers:
        msg_type = layers['ptp'].get('message_type_name', 'Unknown')
        return f'PTP-{msg_type}'

    # IENA packet types
    if 'iena' in layers:
        packet_type = layers['iena'].get('packet_type_name', 'Unknown')
        return f'IENA-{packet_type}'

    # Fallback to transport/application protocols
    return self._classify_standard_protocols(packet)
```

### Timeline Visualization Algorithm
- **Deviation Calculation**: `deviation = inter_arrival_time - average_inter_arrival`
- **Scaling**: Vertical position based on deviation magnitude
- **Character Selection**:
  - `·` for normal timing (within 10% of average)
  - `•`/`○` for moderate deviations (within 50%)
  - `█`/`▄` for significant outliers
- **Time Mapping**: Horizontal position maps to actual timestamps
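
The character selection rule above can be written as a small function. This is a minimal sketch: the name `timeline_char` is hypothetical, and the guide does not say which glyph of each pair is used when, so the sketch assumes the first glyph marks late packets (positive deviation) and the second marks early ones.

```python
def timeline_char(inter_arrival: float, average: float) -> str:
    """Pick a timeline glyph from the deviation relative to the average."""
    if average <= 0:
        return "·"  # no meaningful average to compare against

    deviation = inter_arrival - average
    ratio = abs(deviation) / average

    if ratio <= 0.10:   # within 10% of the average: normal timing
        return "·"
    if ratio <= 0.50:   # within 50%: moderate deviation
        return "•" if deviation > 0 else "○"
    # Beyond 50%: significant outlier (late vs. early is an assumption)
    return "█" if deviation > 0 else "▄"
```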

## Dependencies and Requirements

### Core Libraries
- **scapy**: Packet capture and parsing (`pip install scapy`)
- **numpy**: Numerical computations (`pip install numpy`)
- **matplotlib**: Signal visualization and plotting (`pip install matplotlib`)
- **PySide6**: Modern Qt-based GUI framework (`pip install PySide6`)
- **tkinter**: GUI backend for matplotlib (usually included with Python)
- **curses**: Terminal UI framework (built-in on Unix systems)
- **statistics**: Statistical calculations (built-in)
- **struct**: Binary data parsing (built-in)
- **threading**: Live capture support (built-in)

### Modern Modular File Structure
```
streamlens/
├── streamlens.py                 # Main entry point
├── analyzer/                     # Core analysis package
│   ├── __init__.py               # Package initialization
│   ├── main.py                   # CLI handling and main application logic
│   ├── analysis/                 # Analysis engine modules
│   │   ├── __init__.py           # Analysis package init
│   │   ├── core.py               # EthernetAnalyzer main class
│   │   ├── flow_manager.py       # FlowManager for packet processing
│   │   └── statistics.py         # StatisticsEngine with sigma calculations
│   ├── models/                   # Data structures and models
│   │   ├── __init__.py           # Models package init
│   │   ├── flow_stats.py         # FlowStats and FrameTypeStats dataclasses
│   │   └── analysis_results.py   # AnalysisResult containers
│   ├── protocols/                # Protocol dissector system
│   │   ├── __init__.py           # Protocols package init
│   │   ├── base.py               # ProtocolDissector base classes
│   │   ├── chapter10.py          # Chapter10Dissector (IRIG106)
│   │   ├── ptp.py                # PTPDissector (IEEE 1588)
│   │   ├── iena.py               # IENADissector (Airbus)
│   │   └── standard.py           # StandardProtocolDissector
│   ├── gui/                      # Modern GUI Interface system (NEW!)
│   │   ├── __init__.py           # GUI package init
│   │   └── main_window.py        # StreamLensMainWindow with PySide6 and matplotlib
│   ├── tui/                      # Text User Interface system
│   │   ├── __init__.py           # TUI package init
│   │   ├── interface.py          # TUIInterface main controller
│   │   ├── navigation.py         # NavigationHandler for input handling
│   │   └── panels/               # UI panel components
│   │       ├── __init__.py       # Panels package init
│   │       ├── flow_list.py      # FlowListPanel for flow display
│   │       ├── detail_panel.py   # DetailPanel for flow details
│   │       └── timeline.py       # TimelinePanel for visualization
│   └── utils/                    # Utility modules
│       ├── __init__.py           # Utils package init
│       ├── pcap_loader.py        # PCAPLoader for file handling
│       ├── live_capture.py       # LiveCapture for network monitoring
│       └── signal_visualizer.py  # Chapter 10 signal visualization (thread-safe)
├── *.pcapng                      # Sample capture files for testing
├── README.md                     # User guide and quick start
└── ai.comprehensive_replay.md    # This comprehensive development guide
```

## Command Line Interface
```bash
# Launch modern GUI with interactive plots (RECOMMENDED)
python streamlens.py --gui --pcap file.pcap

# GUI mode only (then open file via File menu)
python streamlens.py --gui

# Analyze PCAP file with TUI (flows sorted by largest sigma outliers)
python streamlens.py --pcap file.pcap

# Console output mode with sigma deviation display
python streamlens.py --pcap file.pcap --no-tui

# Generate comprehensive outlier report
python streamlens.py --pcap file.pcap --report

# Get PCAP file information only
python streamlens.py --pcap file.pcap --info

# Live capture with real-time statistics
python streamlens.py --live --interface eth0

# Configure outlier threshold (default: 3.0 sigma)
python streamlens.py --pcap file.pcap --outlier-threshold 2.0

# With BPF filtering for targeted capture
python streamlens.py --live --filter "port 319 or port 320"
```
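
One way to accept the flags shown above is a plain `argparse` parser. This is a sketch, not the parser in `analyzer/main.py`: the function name `build_parser`, the help strings, and the choice of `store_true` for the switch-style flags are assumptions. The flag names and the 3.0 default come from the examples above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the command-line flags used in the examples."""
    parser = argparse.ArgumentParser(
        prog="streamlens.py",
        description="Analyze PCAP files or live traffic",
    )
    parser.add_argument("--pcap", metavar="FILE", help="PCAP file to analyze")
    parser.add_argument("--live", action="store_true", help="capture live traffic")
    parser.add_argument("--interface", help="interface for live capture, e.g. eth0")
    parser.add_argument("--filter", help="BPF filter expression for live capture")
    parser.add_argument("--gui", action="store_true", help="launch the GUI")
    parser.add_argument("--no-tui", action="store_true", help="console output only")
    parser.add_argument("--report", action="store_true", help="print an outlier report")
    parser.add_argument("--info", action="store_true", help="print PCAP file info only")
    parser.add_argument(
        "--outlier-threshold",
        type=float,
        default=3.0,
        help="outlier threshold in standard deviations",
    )
    return parser
```

Calling `build_parser().parse_args()` yields a namespace with attributes such as `pcap`, `no_tui`, and `outlier_threshold`.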

## Key Algorithms and Techniques

### 1. Inter-arrival Time Calculation
- Track timestamps for each packet in a flow
- Calculate time differences between consecutive packets
- Handle both flow-level and frame-type-level statistics
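
The steps above amount to grouping timestamps by flow key and differencing consecutive values. The sketch below is illustrative: the function name `inter_arrival_by_flow` and the `(timestamp, src_ip, dst_ip)` tuple input are assumptions, and packets are assumed to arrive in capture order. The same function would cover frame-type-level statistics if the key also included the frame type.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

FlowKey = Tuple[str, str]  # (src_ip, dst_ip)


def inter_arrival_by_flow(
    packets: Iterable[Tuple[float, str, str]],
) -> Dict[FlowKey, List[float]]:
    """Group timestamps per flow, then difference consecutive timestamps."""
    timestamps: Dict[FlowKey, List[float]] = defaultdict(list)
    for ts, src_ip, dst_ip in packets:
        timestamps[(src_ip, dst_ip)].append(ts)

    # A flow with N packets yields N - 1 inter-arrival times
    return {
        key: [later - earlier for earlier, later in zip(times, times[1:])]
        for key, times in timestamps.items()
    }
```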

### 2. Advanced Outlier Detection and Flow Prioritization
- Use configurable sigma rule: outliers are packets with inter-arrival times > (mean + N×std_dev)
- Default threshold: 3σ (configurable via `--outlier-threshold` parameter)
- Calculate maximum sigma deviation for each flow across all outliers
- Store both frame numbers and actual time deltas for detailed analysis
- Apply at both flow and frame-type granularity
- **Key Innovation**: Sort flows by largest sigma deviation to prioritize most problematic flows

```python
def get_max_sigma_deviation(self, flow: FlowStats) -> float:
    """Get the maximum sigma deviation for any outlier in this flow"""
    max_sigma = 0.0

    # Check flow-level outliers
    if flow.outlier_details and flow.std_inter_arrival > 0:
        for frame_num, inter_arrival_time in flow.outlier_details:
            sigma_deviation = (inter_arrival_time - flow.avg_inter_arrival) / flow.std_inter_arrival
            max_sigma = max(max_sigma, sigma_deviation)

    # Check frame-type-level outliers
    for frame_type, ft_stats in flow.frame_types.items():
        if ft_stats.outlier_details and ft_stats.std_inter_arrival > 0:
            for frame_num, inter_arrival_time in ft_stats.outlier_details:
                sigma_deviation = (inter_arrival_time - ft_stats.avg_inter_arrival) / ft_stats.std_inter_arrival
                max_sigma = max(max_sigma, sigma_deviation)

    return max_sigma
```
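
With a per-flow maximum available, prioritization reduces to a descending sort. The sketch below uses a hypothetical stand-in class `MiniFlow` holding only the fields the flow-level check reads, and a simplified `max_sigma` that skips the frame-type pass; both are illustrations rather than the project's code.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class MiniFlow:
    """Hypothetical stand-in for FlowStats with only the fields used here."""
    name: str
    avg_inter_arrival: float
    std_inter_arrival: float
    outlier_details: List[Tuple[int, float]] = field(default_factory=list)


def max_sigma(flow: MiniFlow) -> float:
    """Largest flow-level sigma deviation, or 0.0 when there are no outliers."""
    if flow.std_inter_arrival <= 0:
        return 0.0
    deviations = [
        (delta - flow.avg_inter_arrival) / flow.std_inter_arrival
        for _frame, delta in flow.outlier_details
    ]
    return max([0.0] + deviations)


def prioritize(flows: List[MiniFlow]) -> List[MiniFlow]:
    """Order flows so the largest sigma deviation comes first."""
    return sorted(flows, key=max_sigma, reverse=True)
```

Flows without outliers score 0.0 and therefore sort to the end of the list.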

### 3. Protocol Dissection Pipeline
- Layer-by-layer parsing starting with Ethernet/IP/UDP
- Specialized protocol detection based on ports, patterns, payload analysis
- Error handling and graceful degradation for malformed packets

### 4. TMATS Parameter Parsing
- Line-by-line parsing of TMATS configuration data
- Key-value pair extraction with hierarchical channel mapping
- Parameter validation and default value assignment
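
A minimal sketch of this kind of parsing is shown below. It assumes TMATS text made of `KEY:value;` statements and treats a trailing `-n` on the key as the channel index. The function names, the sample attribute keys in the usage note, and the `defaults` argument are illustrative assumptions, not the project's actual parser.

```python
from typing import Dict, Optional


def parse_tmats(text: str) -> Dict[str, str]:
    """Extract KEY:value pairs from TMATS text, one line at a time."""
    attributes: Dict[str, str] = {}
    for line in text.splitlines():
        for statement in line.split(";"):
            statement = statement.strip()
            if ":" not in statement:
                continue  # skip blanks and malformed statements
            key, value = statement.split(":", 1)
            attributes[key.strip()] = value.strip()
    return attributes


def group_by_channel(
    attributes: Dict[str, str],
    defaults: Optional[Dict[str, str]] = None,
) -> Dict[int, Dict[str, str]]:
    """Map attributes whose key ends in '-<n>' onto channel index n."""
    channels: Dict[int, Dict[str, str]] = {}
    for key, value in attributes.items():
        name, sep, index = key.rpartition("-")
        if not sep or not index.isdigit():
            continue  # not a per-channel attribute
        # Each channel starts from a copy of the default values
        channel = channels.setdefault(int(index), dict(defaults or {}))
        channel[name] = value
    return channels
```

For example, parsing a line such as `R-1\DSI-2:PCM1;` stores the value under channel 2 with the key `R-1\DSI`, while an attribute like `G\PN:DEMO;` stays in the flat dictionary only.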

### 5. Live Capture Architecture with Real-time Statistics
- Threaded packet capture using `scapy.sniff()`
- Real-time running statistical updates during capture
- Thread-safe data structures and stop conditions
- Running averages calculated incrementally for performance
- Live outlier detection with immediate flagging
- TUI updates every 0.5 seconds during live capture
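
Incremental running averages of this kind are commonly implemented with Welford's online algorithm, which updates the mean and variance in constant time per packet. The guide does not say which method StreamLens uses, so the sketch below is one possible approach; the class name `RunningStats` and its methods are hypothetical, and locking for thread safety is left out.

```python
import math


class RunningStats:
    """Running mean and sample standard deviation (Welford's algorithm)."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared differences from the current mean

    def update(self, value: float) -> None:
        """Fold one new inter-arrival time into the running statistics."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def stdev(self) -> float:
        """Sample standard deviation; 0.0 until two samples are seen."""
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))

    def is_outlier(self, value: float, threshold_sigma: float = 3.0) -> bool:
        """Flag a value above mean + threshold_sigma * stdev."""
        if self.count < 2 or self.stdev == 0:
            return False
        return value > self.mean + threshold_sigma * self.stdev
```

In a live-capture loop, each new inter-arrival time would typically be checked with `is_outlier()` before being passed to `update()`, so that an outlier does not inflate the statistics it is judged against.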

### 6. Chapter 10 Signal Visualization System
- **TMATS Parser**: Extracts channel metadata from Telemetry Attributes Transfer Standard frames
- **Signal Decoders**: Support for analog and PCM format data with proper scaling
- **Matplotlib Integration**: External plotting windows with interactive capabilities
- **Real-time Visualization**: Works for both PCAP analysis and live capture modes
- **Multi-channel Display**: Simultaneous plotting of multiple signal channels with engineering units

```python
class SignalVisualizer:
    def visualize_flow_signals(self, flow: FlowStats, packets: List[Packet]) -> None:
        # Key identifying this flow (assumed here to be "src -> dst")
        flow_key = f"{flow.src_ip} -> {flow.dst_ip}"

        # Extract TMATS metadata for channel configurations
        tmats_metadata = self._extract_tmats_from_flow(packets)

        # Decode signal data using Chapter 10 decoders
        signal_data = self._extract_signals_from_flow(packets, tmats_metadata)

        # Launch matplotlib window in background thread
        self._create_signal_window(flow_key, signal_data, flow)
```

### 7. PySide6 GUI Architecture with Threading Safety
- **Professional Qt Interface**: Cross-platform GUI built with PySide6 for native look and feel
- **Embedded Matplotlib Integration**: Interactive plots with zoom, pan, and navigation toolbar
- **Background Processing**: Threading for PCAP loading with progress bar and non-blocking UI
- **Flow List Widget**: Sortable table with sigma deviations, protocols, and frame types
- **Signal Visualization**: Click-to-visualize Chapter 10 flows with embedded matplotlib widgets
- **Threading Safety**: Proper Qt integration prevents matplotlib segmentation faults

```python
class StreamLensMainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        # Create main interface with flow list and plot area
        self.flows_table = QTableWidget()  # Sortable flow list
        self.plot_widget = PlotWidget()    # Embedded matplotlib

    def load_pcap_file(self, file_path: str):
        # Background loading with progress bar
        self.loading_thread = PCAPLoadThread(file_path)
        self.loading_thread.progress_updated.connect(self.progress_bar.setValue)
        self.loading_thread.loading_finished.connect(self.on_pcap_loaded)
        # Start the worker (assumes PCAPLoadThread is a QThread subclass)
        self.loading_thread.start()

    def visualize_selected_flow(self):
        # Interactive signal visualization; flow, flow_key, packets, and tmats
        # come from the current table selection (lookup omitted here)
        signal_data = signal_visualizer._extract_signals_from_flow(packets, tmats)
        self.plot_widget.plot_flow_signals(flow, signal_data, flow_key)
```

### 8. Modular Architecture Design
- **Separation of Concerns**: Clean boundaries between analysis, UI, protocols, and utilities
- **Package Structure**: Logical grouping of related functionality
- **Dependency Injection**: Components receive dependencies through constructors
- **Interface-based Design**: Protocol dissectors implement common interfaces
- **Error Handling**: Graceful degradation and comprehensive error reporting

## Extensibility Points

### Adding New Protocol Dissectors
1. Implement dissector class with `dissect(packet)` method
2. Return `DissectionResult` with protocol type and parsed fields
3. Register in `EnhancedFrameDissector.dissectors` dictionary
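
The three steps above might look like the following. Everything here is a stand-in for illustration: the `DissectionResult` fields, the made-up "heartbeat" protocol with its `HB` magic bytes, and the bare `EnhancedFrameDissector` shell are assumptions, and the sketch takes raw `bytes` where the real dissectors would receive a scapy packet.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DissectionResult:
    """Stand-in result type; the real field names may differ."""
    protocol: str
    fields: Dict[str, Any] = field(default_factory=dict)


class HeartbeatDissector:
    """Hypothetical dissector for a made-up protocol starting with b'HB'."""

    MAGIC = b"HB"

    def dissect(self, packet: bytes) -> Optional[DissectionResult]:
        # Step 1: recognize the protocol, or decline by returning None
        if len(packet) < 3 or not packet.startswith(self.MAGIC):
            return None
        # Step 2: return the protocol type and the parsed fields
        return DissectionResult("heartbeat", {"counter": packet[2]})


class EnhancedFrameDissector:
    """Minimal shell exposing only the registry described above."""

    def __init__(self) -> None:
        self.dissectors: Dict[str, Any] = {}


# Step 3: register the new dissector under a protocol key
frame_dissector = EnhancedFrameDissector()
frame_dissector.dissectors["heartbeat"] = HeartbeatDissector()
```

Returning `None` for unrecognized payloads lets the caller fall through to the next registered dissector.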

### Custom Frame Type Classification
- Override `_classify_frame_type()` method
- Add custom logic based on dissection results
- Return descriptive frame type strings

### Statistical Analysis Extensions
- Extend `FrameTypeStats` and `FlowStats` dataclasses
- Add custom statistical calculations in `calculate_statistics()`
- Implement additional outlier detection algorithms

## Performance Considerations

- **Memory Management**: Use generators for large PCAP files
- **Threading**: Separate capture and analysis threads for live mode
- **Statistical Efficiency**: Incremental statistics calculation
- **TUI Optimization**: Efficient screen drawing with curses error handling

## Testing and Validation

The project includes comprehensive test suites:
- **Protocol Dissection Tests**: Validate parsing accuracy
- **Statistical Analysis Tests**: Verify timing calculations
- **TUI Layout Tests**: Interface rendering validation
- **Integration Tests**: End-to-end workflow verification

This comprehensive description captures the full scope and technical depth of the StreamLens traffic analyzer, enabling recreation of this telemetry analysis tool.