2025-07-25 19:42:33 -04:00
# StreamLens - Comprehensive Development Guide
## Project Overview
2025-07-25 21:45:07 -04:00
Build a Python-based network traffic analysis tool called "StreamLens" that analyzes both PCAP files and live network streams. The tool specializes in telemetry and avionics protocols and provides statistical timing analysis, outlier detection, and sigma-based flow prioritization. It has a modular architecture with two front ends: a text-based user interface (TUI) and a PySide6 GUI with interactive matplotlib signal visualization.
## Core Requirements
### Primary Functionality
- **PCAP Analysis**: Load and analyze packet capture files with file validation and info extraction
- **Live Capture**: Real-time network traffic monitoring with threading support
- **Flow Analysis**: Group packets by source-destination IP pairs with comprehensive statistics
- **Advanced Statistical Analysis**: Configurable sigma thresholds, running averages, coefficient of variation
- **Sigma-Based Outlier Detection**: Identify packets with timing deviations using configurable thresholds (default 3σ)
- **Flow Prioritization**: Automatically sort flows by largest sigma deviation for efficient analysis
- **Interactive TUI**: Three-panel interface with real-time updates and navigation
- **Modern GUI Interface**: Professional PySide6-based GUI with embedded matplotlib plots
- **Modular Architecture**: Clean separation of concerns with analyzers, models, protocols, TUI, GUI, and utilities
### Advanced Features
- **Specialized Protocol Support**: Chapter 10 (IRIG106), PTP (IEEE 1588), IENA (Airbus)
- **Frame Type Classification**: Hierarchical breakdown by protocol and frame types with per-type statistics
- **Timeline Visualization**: ASCII-based timing deviation charts with dynamic scaling
- **Real-time Statistics**: Running averages and outlier detection during live capture
- **Comprehensive Reporting**: Detailed outlier reports with sigma deviation calculations
- **High Jitter Detection**: Coefficient of variation analysis for identifying problematic flows
- **Configurable Analysis**: Adjustable outlier thresholds and analysis parameters
- **Chapter 10 Signal Visualization**: Real-time matplotlib-based signal plotting with TMATS integration
- **Interactive Signal Analysis**: Press 'v' in TUI to generate signal files, or use GUI for embedded interactive plots
- **Threading-Safe Visualization**: Proper Qt integration for GUI, file output for TUI to prevent segmentation faults
- **Cross-Platform GUI**: PySide6-based interface with file dialogs, progress bars, and embedded matplotlib widgets
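
The high jitter detection listed above relies on the coefficient of variation (standard deviation divided by mean) of a flow's inter-arrival times. The following is a minimal sketch of that calculation; the function names and the `cv_threshold` default of 0.5 are assumptions for illustration, not values taken from StreamLens.

```python
import statistics
from typing import Sequence


def coefficient_of_variation(inter_arrival_times: Sequence[float]) -> float:
    """Return stdev / mean, or 0.0 when it cannot be computed."""
    if len(inter_arrival_times) < 2:
        return 0.0  # statistics.stdev() needs at least two samples
    mean = statistics.mean(inter_arrival_times)
    if mean == 0:
        return 0.0
    return statistics.stdev(inter_arrival_times) / mean


def is_high_jitter(inter_arrival_times: Sequence[float], cv_threshold: float = 0.5) -> bool:
    """Flag a flow whose timing spread is large relative to its mean (threshold is assumed)."""
    return coefficient_of_variation(inter_arrival_times) > cv_threshold
```

Because the coefficient of variation is dimensionless, the same threshold can be applied to flows with very different packet rates.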
## Architecture Overview
### Core Components
#### 1. Data Structures (`dataclasses`)
```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class FrameTypeStats:
    frame_type: str
    count: int
    total_bytes: int
    timestamps: List[float]
    frame_numbers: List[int]
    inter_arrival_times: List[float]
    avg_inter_arrival: float
    std_inter_arrival: float
    outlier_frames: List[int]
    outlier_details: List[Tuple[int, float]]  # (frame number, inter-arrival time)


@dataclass
class FlowStats:
    src_ip: str
    dst_ip: str
    frame_count: int
    timestamps: List[float]
    frame_numbers: List[int]
    inter_arrival_times: List[float]
    avg_inter_arrival: float
    std_inter_arrival: float
    outlier_frames: List[int]
    outlier_details: List[Tuple[int, float]]  # (frame number, inter-arrival time)
    total_bytes: int
    protocols: Set[str]
    detected_protocol_types: Set[str]
    frame_types: Dict[str, FrameTypeStats]  # keyed by frame type name
```
#### 2. Main Analysis Engine (`EthernetAnalyzer`)
- **Packet Processing**: Single-packet and batch processing capabilities
- **Flow Management**: Track unique source-destination pairs
- **Protocol Detection**: Multi-layer protocol identification
- **Statistical Calculation**: Mean, standard deviation, and outlier detection using a configurable sigma threshold (default 3σ)
- **Live Capture**: Threading support for real-time analysis
#### 3. Enhanced Frame Dissection System
##### Base Dissector Interface
```python
from typing import Any, Dict
from scapy.packet import Packet
class EnhancedFrameDissector:
    def dissect_frame(self, packet: Packet, frame_num: int) -> Dict[str, Any]: ...  # interface only
```
##### Specialized Protocol Dissectors
**Chapter 10 Dissector** (`Chapter10Dissector`):
- IRIG106-17 telemetry standard support
- Sync pattern detection (0xEB25)
- Header parsing: channel ID, data types, timing counters
- Data format support: Analog, PCM, Ethernet Format 0
- Container format handling (embedded Ch10 in other protocols)
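
As a rough illustration of the sync pattern detection and header parsing listed above, the sketch below scans a payload for `0xEB25` and unpacks the 24-byte primary header that follows. The field layout reflects my reading of the IRIG 106 Chapter 10 packet header and should be checked against the standard revision in use; the function name, the returned keys, and the sanity check are assumptions, not the `Chapter10Dissector` implementation.

```python
import struct
from typing import Dict, Optional

CH10_SYNC = 0xEB25
# Assumed 24-byte primary header, little-endian: sync, channel ID, packet length,
# data length, data type version, sequence number, packet flags, data type,
# 48-bit relative time counter, header checksum.
CH10_HEADER = struct.Struct("<HHIIBBBB6sH")


def find_ch10_header(payload: bytes) -> Optional[Dict[str, int]]:
    """Scan a payload for the sync pattern and parse the header that follows it."""
    sync_bytes = struct.pack("<H", CH10_SYNC)  # b"\x25\xeb" in byte order
    offset = payload.find(sync_bytes)
    while offset != -1:
        if offset + CH10_HEADER.size <= len(payload):
            (_sync, channel_id, packet_len, data_len, _version, sequence,
             _flags, data_type, rtc, _checksum) = CH10_HEADER.unpack_from(payload, offset)
            # Hypothetical sanity check to reject chance matches of the two sync bytes.
            if data_len <= packet_len:
                return {
                    "offset": offset,
                    "channel_id": channel_id,
                    "packet_length": packet_len,
                    "data_length": data_len,
                    "sequence_number": sequence,
                    "data_type": data_type,
                    "relative_time_counter": int.from_bytes(rtc, "little"),
                }
        offset = payload.find(sync_bytes, offset + 1)
    return None
```

Searching from an arbitrary offset, rather than only at byte 0, is one way to cover the container case where a Chapter 10 packet is embedded inside another protocol's payload.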
**PTP Dissector** (`PTPDissector`):
- IEEE 1588-2019 Precision Time Protocol
- Message types: Sync, Delay_Req, Announce, etc.
- Timestamp extraction and correction field parsing
- Grandmaster clock quality analysis
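
The message type and correction field extraction above can be sketched against the 34-byte PTP common header. The byte offsets follow the IEEE 1588 common header as I understand it (message type in the low nibble of byte 0, correction field in units of 2^-16 ns); `parse_ptp_header` and its returned keys are illustrative names, with `message_type_name` chosen to match the key used by the frame type classifier later in this guide.

```python
import struct
from typing import Any, Dict

PTP_MESSAGE_TYPES = {
    0x0: "Sync", 0x1: "Delay_Req", 0x2: "Pdelay_Req", 0x3: "Pdelay_Resp",
    0x8: "Follow_Up", 0x9: "Delay_Resp", 0xA: "Pdelay_Resp_Follow_Up",
    0xB: "Announce", 0xC: "Signaling", 0xD: "Management",
}


def parse_ptp_header(data: bytes) -> Dict[str, Any]:
    """Parse selected fields of the PTP common header (network byte order)."""
    if len(data) < 34:
        raise ValueError("PTP common header needs 34 bytes")
    message_type = data[0] & 0x0F          # low nibble of byte 0
    version = data[1] & 0x0F               # low nibble of byte 1
    (message_length,) = struct.unpack_from("!H", data, 2)
    domain_number = data[4]
    (correction_raw,) = struct.unpack_from("!q", data, 8)   # signed 64-bit
    (sequence_id,) = struct.unpack_from("!H", data, 30)
    return {
        "message_type": message_type,
        "message_type_name": PTP_MESSAGE_TYPES.get(message_type, "Unknown"),
        "version": version,
        "message_length": message_length,
        "domain_number": domain_number,
        "correction_ns": correction_raw / 65536.0,  # scaled nanoseconds -> ns
        "sequence_id": sequence_id,
    }
```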
**IENA Dissector** (`IENADissector`):
- Airbus Improved Ethernet Network Architecture
- Packet types: P-type, D-type, N-type, M-type, Q-type
- Parameter ID extraction and delay analysis
- Variable-length message parsing
#### 4. Chapter 10 Packet System (`chapter10_packet.py`)
##### Core Packet Class
```python
from typing import Dict, Optional


class Chapter10Packet:
    # Interface outline only; method bodies are omitted.
    def __init__(self, packet, original_frame_num: Optional[int] = None): ...
    def _parse_ch10_header(self) -> Optional[Dict]: ...
    def get_data_payload(self) -> Optional[bytes]: ...
    def decode_data(self, tmats_scaling_dict, tmats_content) -> Optional["DecodedData"]: ...
```
##### Data Decoders
- **AnalogDecoder**: Multi-channel analog data with scaling
- **PCMDecoder**: Pulse Code Modulation format
- **AnalogTimeFormatDecoder**: Time-synchronized analog data with TMATS integration
##### TMATS Integration
- Scaling parameter extraction from TMATS content
- Channel configuration parsing
- Engineering unit conversion
- Gain/offset/full-scale calculations
#### 5. Text User Interface (`TUIInterface`)
##### Three-Panel Layout
```
┌─ FLOWS (Left 60%) ─────────┬─ DETAILS (Right 40%) ──────┐
│ IP flows + frame           │ Flow info + table          │
│ type breakdowns            │ Frame type details         │
├────────────────────────────┴────────────────────────────┤
│ TIMING VISUALIZATION (Bottom 30%)                       │
│ ASCII timeline with deviation plotting                  │
└─────────────────────────────────────────────────────────┘
```
##### Key Interface Features
- **Flow List**: Hierarchical display of flows and frame types
- **Detail Panel**:
  - Flow summary (packets, bytes, protocols)
  - Frame type table: `Type | #Pkts | Bytes | Avg ΔT | 2σ Out`
  - Timing statistics and threshold calculations
- **Timeline Panel**: ASCII deviation chart with scaling
- **Navigation**: Arrow keys, view switching (main/dissection)
## Technical Implementation Details
### Statistical Analysis Algorithm
```python
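import statistics


def calculate_statistics(self):
    for flow in self.flows.values():
        # inter_arrival_times is assumed to be populated already;
        # statistics.stdev() needs at least two samples.
        if len(flow.inter_arrival_times) < 2:
            continue
        flow.avg_inter_arrival = statistics.mean(flow.inter_arrival_times)
        flow.std_inter_arrival = statistics.stdev(flow.inter_arrival_times)
        # Detect outliers above mean + N·σ. `self.outlier_threshold_sigma` is an
        # assumed attribute name for the configured N (default 3.0).
        threshold = flow.avg_inter_arrival + (self.outlier_threshold_sigma * flow.std_inter_arrival)
        for i, inter_time in enumerate(flow.inter_arrival_times):
            if inter_time > threshold:
                # inter_arrival_times[i] is the gap that ends at frame i + 1
                frame_number = flow.frame_numbers[i + 1]
                flow.outlier_frames.append(frame_number)
                flow.outlier_details.append((frame_number, inter_time))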
```
### Protocol Classification Hierarchy
1. **Specialized Protocols**: Chapter 10, PTP, IENA (highest priority)
2. **Port-based Detection**: DNS (53), HTTP (80), NTP (123), etc.
3. **Transport Layer**: TCP, UDP, ICMP, IGMP
4. **Fallback**: Generic classification
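
The priority order above can be sketched as a single function that tries each tier in turn. This is an illustration under stated assumptions: the function name, its parameters, the tiny port table, and the `"Other"` fallback label are hypothetical and do not come from the StreamLens source.

```python
from typing import Optional

# Small illustrative subset of well-known ports.
WELL_KNOWN_PORTS = {53: "DNS", 80: "HTTP", 123: "NTP"}
TRANSPORT_PROTOCOLS = {"TCP", "UDP", "ICMP", "IGMP"}


def classify_protocol(
    specialized: Optional[str],
    transport: Optional[str],
    src_port: Optional[int],
    dst_port: Optional[int],
) -> str:
    """Return the highest-priority label that applies to a packet."""
    # 1. Specialized protocols (Chapter 10, PTP, IENA) win outright.
    if specialized:
        return specialized
    # 2. Port-based detection; the destination port is checked first.
    for port in (dst_port, src_port):
        if port in WELL_KNOWN_PORTS:
            return WELL_KNOWN_PORTS[port]
    # 3. Transport layer name.
    if transport in TRANSPORT_PROTOCOLS:
        return transport
    # 4. Generic fallback (label is an assumption).
    return "Other"
```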
### Frame Type Classification Logic
```python
def _classify_frame_type(self, packet, dissection):
    layers = dissection.get('layers', {})

    # Chapter 10 takes precedence
    if 'chapter10' in layers:
        if self._is_tmats_frame(packet, layers['chapter10']):
            return 'TMATS'
        else:
            return 'CH10-Data'

    # PTP message types
    if 'ptp' in layers:
        msg_type = layers['ptp'].get('message_type_name', 'Unknown')
        return f'PTP-{msg_type}'

    # IENA packet types
    if 'iena' in layers:
        packet_type = layers['iena'].get('packet_type_name', 'Unknown')
        return f'IENA-{packet_type}'

    # Fallback to transport/application protocols
    return self._classify_standard_protocols(packet)
```
### Timeline Visualization Algorithm
- **Deviation Calculation**: `deviation = inter_arrival_time - average_inter_arrival`
- **Scaling**: Vertical position based on deviation magnitude
- **Character Selection**:
  - `·` for normal timing (within 10% of average)
  - `•`/`○` for moderate deviations (within 50%)
  - `█`/`▄` for significant outliers
- **Time Mapping**: Horizontal position maps to actual timestamps
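
A minimal sketch of the character selection rule follows. The 10% and 50% bands come from the list above; which glyph of each pair marks a late packet versus an early one is an assumption made here (filled glyph for late, the other for early), as is the function name.

```python
def timeline_char(inter_arrival_time: float, average_inter_arrival: float) -> str:
    """Pick a timeline glyph from the deviation relative to the average."""
    if average_inter_arrival <= 0:
        return "·"  # no meaningful average to compare against
    deviation = inter_arrival_time - average_inter_arrival
    ratio = abs(deviation) / average_inter_arrival
    if ratio <= 0.10:
        return "·"                               # normal timing
    if ratio <= 0.50:
        return "•" if deviation > 0 else "○"     # moderate: late vs early (assumed)
    return "█" if deviation > 0 else "▄"         # significant: late vs early (assumed)
```

For example, with an average of 1.0 s, a 1.3 s gap maps to `•` and a 2.0 s gap maps to `█`.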
## Dependencies and Requirements
### Core Libraries
- **scapy**: Packet capture and parsing (`pip install scapy`)
- **numpy**: Numerical computations (`pip install numpy`)
- **matplotlib**: Signal visualization and plotting (`pip install matplotlib`)
- **PySide6**: Modern Qt-based GUI framework (`pip install PySide6`)
- **tkinter**: GUI backend for matplotlib (usually included with Python)
- **curses**: Terminal UI framework (built-in on Unix systems)
- **statistics**: Statistical calculations (built-in)
- **struct**: Binary data parsing (built-in)
- **threading**: Live capture support (built-in)
### Modern Modular File Structure
```
streamlens/
├── streamlens.py  # Main entry point
├── analyzer/  # Core analysis package
│   ├── __init__.py  # Package initialization
│   ├── main.py  # CLI handling and main application logic
│   ├── analysis/  # Analysis engine modules
│   │   ├── __init__.py  # Analysis package init
│   │   ├── core.py  # EthernetAnalyzer main class
│   │   ├── flow_manager.py  # FlowManager for packet processing
│   │   └── statistics.py  # StatisticsEngine with sigma calculations
│   ├── models/  # Data structures and models
│   │   ├── __init__.py  # Models package init
│   │   ├── flow_stats.py  # FlowStats and FrameTypeStats dataclasses
│   │   └── analysis_results.py  # AnalysisResult containers
│   ├── protocols/  # Protocol dissector system
│   │   ├── __init__.py  # Protocols package init
│   │   ├── base.py  # ProtocolDissector base classes
│   │   ├── chapter10.py  # Chapter10Dissector (IRIG106)
│   │   ├── ptp.py  # PTPDissector (IEEE 1588)
│   │   ├── iena.py  # IENADissector (Airbus)
│   │   └── standard.py  # StandardProtocolDissector
│   ├── gui/  # Modern GUI Interface system (NEW!)
│   │   ├── __init__.py  # GUI package init
│   │   └── main_window.py  # StreamLensMainWindow with PySide6 and matplotlib
│   ├── tui/  # Text User Interface system
│   │   ├── __init__.py  # TUI package init
│   │   ├── interface.py  # TUIInterface main controller
│   │   ├── navigation.py  # NavigationHandler for input handling
│   │   └── panels/  # UI panel components
│   │       ├── __init__.py  # Panels package init
│   │       ├── flow_list.py  # FlowListPanel for flow display
│   │       ├── detail_panel.py  # DetailPanel for flow details
│   │       └── timeline.py  # TimelinePanel for visualization
│   └── utils/  # Utility modules
│       ├── __init__.py  # Utils package init
│       ├── pcap_loader.py  # PCAPLoader for file handling
│       ├── live_capture.py  # LiveCapture for network monitoring
│       └── signal_visualizer.py  # Chapter 10 signal visualization (thread-safe)
├── *.pcapng  # Sample capture files for testing
├── README.md  # User guide and quick start
└── ai.comprehensive_replay.md  # This comprehensive development guide
```
## Command Line Interface
```bash
# Launch modern GUI with interactive plots (RECOMMENDED)
python streamlens.py --gui --pcap file.pcap
# GUI mode only (then open file via File menu)
python streamlens.py --gui
# Analyze PCAP file with TUI (flows sorted by largest sigma outliers)
python streamlens.py --pcap file.pcap
# Console output mode with sigma deviation display
python streamlens.py --pcap file.pcap --no-tui
# Generate comprehensive outlier report
python streamlens.py --pcap file.pcap --report
# Get PCAP file information only
python streamlens.py --pcap file.pcap --info
# Live capture with real-time statistics
python streamlens.py --live --interface eth0
# Configure outlier threshold (default: 3.0 sigma)
python streamlens.py --pcap file.pcap --outlier-threshold 2.0
# With BPF filtering for targeted capture
python streamlens.py --live --filter "port 319 or port 320"
```
## Key Algorithms and Techniques
### 1. Inter-arrival Time Calculation
- Track timestamps for each packet in a flow
- Calculate time differences between consecutive packets
- Handle both flow-level and frame-type-level statistics
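
The steps above amount to differencing consecutive timestamps, once over the whole flow and once per frame type. A minimal sketch, with hypothetical function names and a plain `(timestamp, frame_type)` tuple standing in for parsed packets:

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def inter_arrival_times(timestamps: List[float]) -> List[float]:
    """Differences between consecutive timestamps (one fewer than the input)."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def inter_arrival_by_frame_type(frames: List[Tuple[float, str]]) -> Dict[str, List[float]]:
    """Group (timestamp, frame_type) pairs by type, then difference within each group."""
    by_type: Dict[str, List[float]] = defaultdict(list)
    for timestamp, frame_type in frames:
        by_type[frame_type].append(timestamp)
    return {frame_type: inter_arrival_times(ts) for frame_type, ts in by_type.items()}
```

Note that the per-type gaps are measured between frames of the same type, so they are generally longer than the flow-level gaps and have their own mean and standard deviation.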
### 2. Advanced Outlier Detection and Flow Prioritization
- Use configurable sigma rule: outliers are packets with inter-arrival times > (mean + N × std_dev)
- Default threshold: 3σ (configurable via the `--outlier-threshold` parameter)
- Calculate maximum sigma deviation for each flow across all outliers
- Store both frame numbers and actual time deltas for detailed analysis
- Apply at both flow and frame-type granularity
- **Key Innovation**: Sort flows by largest sigma deviation to prioritize most problematic flows
```python
def get_max_sigma_deviation(self, flow: FlowStats) -> float:
    """Get the maximum sigma deviation for any outlier in this flow"""
    max_sigma = 0.0

    # Check flow-level outliers
    if flow.outlier_details and flow.std_inter_arrival > 0:
        for frame_num, inter_arrival_time in flow.outlier_details:
            sigma_deviation = (inter_arrival_time - flow.avg_inter_arrival) / flow.std_inter_arrival
            max_sigma = max(max_sigma, sigma_deviation)

    # Check frame-type-level outliers
    for frame_type, ft_stats in flow.frame_types.items():
        if ft_stats.outlier_details and ft_stats.std_inter_arrival > 0:
            for frame_num, inter_arrival_time in ft_stats.outlier_details:
                sigma_deviation = (inter_arrival_time - ft_stats.avg_inter_arrival) / ft_stats.std_inter_arrival
                max_sigma = max(max_sigma, sigma_deviation)

    return max_sigma
```
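
To show how that maximum deviation might drive the flow ordering, here is a self-contained sketch that ranks flows from most to least problematic. `MiniFlow` is a hypothetical stand-in carrying only the flow-level fields needed for the ranking, and `prioritize_flows` is an illustrative name rather than a StreamLens function.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MiniFlow:
    """Stand-in for FlowStats with only the fields the ranking needs."""
    avg_inter_arrival: float
    std_inter_arrival: float
    outlier_details: List[Tuple[int, float]] = field(default_factory=list)


def max_sigma(flow: MiniFlow) -> float:
    """Largest (value - mean) / stdev over the flow's outliers, floored at 0.0."""
    if flow.std_inter_arrival <= 0:
        return 0.0
    deviations = (
        (inter_arrival - flow.avg_inter_arrival) / flow.std_inter_arrival
        for _frame_num, inter_arrival in flow.outlier_details
    )
    return max(deviations, default=0.0) if flow.outlier_details else 0.0


def prioritize_flows(flows: Dict[str, MiniFlow]) -> List[str]:
    """Return flow keys sorted by descending maximum sigma deviation."""
    return sorted(flows, key=lambda key: max_sigma(flows[key]), reverse=True)
```

Flows with no outliers, or with zero standard deviation, score 0.0 and therefore sink to the bottom of the list.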
### 3. Protocol Dissection Pipeline
- Layer-by-layer parsing starting with Ethernet/IP/UDP
- Specialized protocol detection based on ports, patterns, payload analysis
- Error handling and graceful degradation for malformed packets
### 4. TMATS Parameter Parsing
- Line-by-line parsing of TMATS configuration data
- Key-value pair extraction with hierarchical channel mapping
- Parameter validation and default value assignment
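
A rough sketch of the key-value extraction described above, assuming the common TMATS text form of `GROUP\ATTRIBUTE:value;` statements. The function names, the grouping by the part before the first backslash, and the sample attribute codes are illustrative; the real parser may organize channels differently.

```python
from typing import Dict


def parse_tmats(text: str) -> Dict[str, Dict[str, str]]:
    """Split TMATS text into {group: {attribute: value}}."""
    groups: Dict[str, Dict[str, str]] = {}
    for statement in text.split(";"):
        statement = statement.strip()
        if not statement or ":" not in statement:
            continue  # skip blank or malformed statements
        code, _, value = statement.partition(":")
        group, _, attribute = code.strip().partition("\\")
        groups.setdefault(group, {})[attribute] = value.strip()
    return groups


def get_float(groups: Dict[str, Dict[str, str]], group: str, attribute: str, default: float) -> float:
    """Validated lookup: fall back to a default when missing or not numeric."""
    try:
        return float(groups[group][attribute])
    except (KeyError, ValueError):
        return default
```

Usage might look like `get_float(parse_tmats(tmats_text), "R-1", "TK1-1", 0.0)`, where the group and attribute names are placeholders for whatever the capture's TMATS record actually contains.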
### 5. Live Capture Architecture with Real-time Statistics
- Threaded packet capture using scapy.sniff()
- Real-time running statistical updates during capture
- Thread-safe data structures and stop conditions
- Running averages calculated incrementally for performance
- Live outlier detection with immediate flagging
- TUI updates every 0.5 seconds during live capture
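
One common way to keep running averages incrementally is Welford's online algorithm, sketched below. Whether StreamLens uses this exact method is not stated in this guide; the class and method names are hypothetical, and the default of 3.0 mirrors the documented outlier threshold.

```python
import math


class RunningStats:
    """Incremental mean and sample standard deviation (Welford's algorithm)."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def stdev(self) -> float:
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))

    def is_outlier(self, value: float, threshold_sigma: float = 3.0) -> bool:
        """Flag a new inter-arrival time against the statistics seen so far."""
        stdev = self.stdev
        return stdev > 0 and (value - self.mean) > threshold_sigma * stdev
```

Each update is constant time and constant memory, so the capture thread never has to re-scan the full list of inter-arrival times. In a threaded design, calls to `update()` would still need a lock or a single owning thread.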
### 6. Chapter 10 Signal Visualization System
- **TMATS Parser**: Extracts channel metadata from Telemetry Attributes Transfer Standard frames
- **Signal Decoders**: Support for analog and PCM format data with proper scaling
- **Matplotlib Integration**: External plotting windows with interactive capabilities
- **Real-time Visualization**: Works for both PCAP analysis and live capture modes
- **Multi-channel Display**: Simultaneous plotting of multiple signal channels with engineering units
```python
class SignalVisualizer:
    def visualize_flow_signals(self, flow: FlowStats, packets: List[Packet]) -> None:
        # Extract TMATS metadata for channel configurations
        tmats_metadata = self._extract_tmats_from_flow(packets)
        # Decode signal data using Chapter 10 decoders
        signal_data = self._extract_signals_from_flow(packets, tmats_metadata)
        # Key identifying the flow; this format is an assumption
        flow_key = f"{flow.src_ip}->{flow.dst_ip}"
        # Launch matplotlib window in background thread
        self._create_signal_window(flow_key, signal_data, flow)
```
### 7. PySide6 GUI Architecture with Threading Safety
- **Professional Qt Interface**: Cross-platform GUI built with PySide6 for native look and feel
- **Embedded Matplotlib Integration**: Interactive plots with zoom, pan, and navigation toolbar
- **Background Processing**: Threading for PCAP loading with progress bar and non-blocking UI
- **Flow List Widget**: Sortable table with sigma deviations, protocols, and frame types
- **Signal Visualization**: Click-to-visualize Chapter 10 flows with embedded matplotlib widgets
- **Threading Safety**: Proper Qt integration prevents matplotlib segmentation faults
```python
class StreamLensMainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        # Create main interface with flow list and plot area
        self.flows_table = QTableWidget()  # Sortable flow list
        self.plot_widget = PlotWidget()    # Embedded matplotlib

    def load_pcap_file(self, file_path: str):
        # Background loading with progress bar
        self.loading_thread = PCAPLoadThread(file_path)
        self.loading_thread.progress_updated.connect(self.progress_bar.setValue)
        self.loading_thread.loading_finished.connect(self.on_pcap_loaded)
        self.loading_thread.start()  # assumes PCAPLoadThread subclasses QThread

    def visualize_selected_flow(self):
        # Interactive signal visualization; flow, flow_key, packets, and tmats
        # come from the selected table row (lookup elided here)
        signal_data = signal_visualizer._extract_signals_from_flow(packets, tmats)
        self.plot_widget.plot_flow_signals(flow, signal_data, flow_key)
```
### 8. Modular Architecture Design
- **Separation of Concerns**: Clean boundaries between analysis, UI, protocols, and utilities
- **Package Structure**: Logical grouping of related functionality
- **Dependency Injection**: Components receive dependencies through constructors
- **Interface-based Design**: Protocol dissectors implement common interfaces
- **Error Handling**: Graceful degradation and comprehensive error reporting
## Extensibility Points
### Adding New Protocol Dissectors
1. Implement dissector class with `dissect(packet)` method
2. Return `DissectionResult` with protocol type and parsed fields
3. Register in `EnhancedFrameDissector.dissectors` dictionary
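
The three steps above might look like the following sketch. Everything here is illustrative: `HeartbeatDissector` is an invented protocol, `DissectionResult` is a minimal guess at the result container, `DissectorRegistry` stands in for the `EnhancedFrameDissector.dissectors` dictionary, and the dissector takes raw payload bytes rather than a scapy packet to keep the example self-contained.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DissectionResult:
    """Minimal stand-in: protocol type plus parsed fields."""
    protocol: str
    fields: Dict[str, Any] = field(default_factory=dict)


class HeartbeatDissector:
    """Step 1: a dissector for a made-up protocol with a 2-byte magic prefix."""
    MAGIC = b"HB"

    def dissect(self, payload: bytes) -> Optional[DissectionResult]:
        if len(payload) < 4 or not payload.startswith(self.MAGIC):
            return None  # not this protocol
        # Step 2: return the protocol type and parsed fields.
        counter = int.from_bytes(payload[2:4], "big")
        return DissectionResult("heartbeat", {"counter": counter})


class DissectorRegistry:
    """Stand-in for the dictionary of registered dissectors."""

    def __init__(self) -> None:
        self.dissectors: Dict[str, Any] = {}

    def dissect(self, payload: bytes) -> Dict[str, Dict[str, Any]]:
        layers: Dict[str, Dict[str, Any]] = {}
        for name, dissector in self.dissectors.items():
            result = dissector.dissect(payload)
            if result is not None:
                layers[name] = result.fields
        return layers


# Step 3: register the new dissector under a protocol key.
registry = DissectorRegistry()
registry.dissectors["heartbeat"] = HeartbeatDissector()
```

Returning `None` for non-matching payloads lets the registry try every dissector without raising on unrelated traffic.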
### Custom Frame Type Classification
- Override `_classify_frame_type()` method
- Add custom logic based on dissection results
- Return descriptive frame type strings
### Statistical Analysis Extensions
- Extend `FrameTypeStats` and `FlowStats` dataclasses
- Add custom statistical calculations in `calculate_statistics()`
- Implement additional outlier detection algorithms
## Performance Considerations
- **Memory Management**: Use generators for large PCAP files
- **Threading**: Separate capture and analysis threads for live mode
- **Statistical Efficiency**: Incremental statistics calculation
- **TUI Optimization**: Efficient screen drawing with curses error handling
## Testing and Validation
The project includes comprehensive test suites:
- **Protocol Dissection Tests**: Validate parsing accuracy
- **Statistical Analysis Tests**: Verify timing calculations
- **TUI Layout Tests**: Interface rendering validation
- **Integration Tests**: End-to-end workflow verification
This guide describes the scope and technical design of StreamLens (the Ethernet traffic analyzer) in enough detail to recreate the tool.