working well, good docs. TUI.

2025-07-25 19:42:33 -04:00
parent 4c6e23bff8
commit f75c757b12
5 changed files with 549 additions and 5 deletions

ai.comprehensive_replay.md Normal file

@@ -0,0 +1,373 @@
# StreamLens - Comprehensive Development Guide
## Project Overview
Build a Python-based network traffic analysis tool called "StreamLens" that analyzes both PCAP files and live network streams. The tool specializes in telemetry and avionics protocols, with statistical timing analysis, outlier detection, and sigma-based flow prioritization. It has a modular architecture and a text-based user interface (TUI) for interactive analysis.
## Core Requirements
### Primary Functionality
- **PCAP Analysis**: Load and analyze packet capture files with file validation and info extraction
- **Live Capture**: Real-time network traffic monitoring with threading support
- **Flow Analysis**: Group packets by source-destination IP pairs with comprehensive statistics
- **Advanced Statistical Analysis**: Configurable sigma thresholds, running averages, coefficient of variation
- **Sigma-Based Outlier Detection**: Identify packets with timing deviations using configurable thresholds (default 3σ)
- **Flow Prioritization**: Automatically sort flows by largest sigma deviation for efficient analysis
- **Interactive TUI**: Three-panel interface with real-time updates and navigation
- **Modular Architecture**: Clean separation of concerns with analyzers, models, protocols, TUI, and utilities
### Advanced Features
- **Specialized Protocol Support**: Chapter 10 (IRIG106), PTP (IEEE 1588), IENA (Airbus)
- **Frame Type Classification**: Hierarchical breakdown by protocol and frame types with per-type statistics
- **Timeline Visualization**: ASCII-based timing deviation charts with dynamic scaling
- **Real-time Statistics**: Running averages and outlier detection during live capture
- **Comprehensive Reporting**: Detailed outlier reports with sigma deviation calculations
- **High Jitter Detection**: Coefficient of variation analysis for identifying problematic flows
- **Configurable Analysis**: Adjustable outlier thresholds and analysis parameters
## Architecture Overview
### Core Components
#### 1. Data Structures (`dataclasses`)
```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class FrameTypeStats:
    frame_type: str
    count: int
    total_bytes: int
    timestamps: List[float]
    frame_numbers: List[int]
    inter_arrival_times: List[float]
    avg_inter_arrival: float
    std_inter_arrival: float
    outlier_frames: List[int]
    outlier_details: List[Tuple[int, float]]  # (frame number, inter-arrival time)


@dataclass
class FlowStats:
    src_ip: str
    dst_ip: str
    frame_count: int
    timestamps: List[float]
    frame_numbers: List[int]
    inter_arrival_times: List[float]
    avg_inter_arrival: float
    std_inter_arrival: float
    outlier_frames: List[int]
    outlier_details: List[Tuple[int, float]]  # (frame number, inter-arrival time)
    total_bytes: int
    protocols: Set[str]
    detected_protocol_types: Set[str]
    frame_types: Dict[str, FrameTypeStats]
```
#### 2. Main Analysis Engine (`EthernetAnalyzer`)
- **Packet Processing**: Single-packet and batch processing capabilities
- **Flow Management**: Track unique source-destination pairs
- **Protocol Detection**: Multi-layer protocol identification
- **Statistical Calculation**: Mean, standard deviation, and outlier detection using a configurable sigma threshold (default 3σ)
- **Live Capture**: Threading support for real-time analysis
#### 3. Enhanced Frame Dissection System
##### Base Dissector Interface
```python
from typing import Any, Dict

class EnhancedFrameDissector:
    def dissect_frame(self, packet: "Packet", frame_num: int) -> Dict[str, Any]: ...
```
##### Specialized Protocol Dissectors
**Chapter 10 Dissector** (`Chapter10Dissector`):
- IRIG106-17 telemetry standard support
- Sync pattern detection (0xEB25)
- Header parsing: channel ID, data types, timing counters
- Data format support: Analog, PCM, Ethernet Format 0
- Container format handling (embedded Ch10 in other protocols)
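The header parsing step can be sketched as follows. This is a minimal illustration, not the project's `Chapter10Dissector`: the function names and the returned dictionary keys are assumptions, and the 24-byte little-endian header layout follows the IRIG 106 Chapter 10 packet header. The byte scan for embedded packets is a simple heuristic that can produce false positives on arbitrary payloads.

```python
import struct
from typing import Dict, Optional

CH10_SYNC = 0xEB25
CH10_HEADER_LEN = 24
# Little-endian: sync, channel ID, packet length, data length,
# data type version, sequence number, packet flags, data type
_CH10_FIXED = struct.Struct("<HHIIBBBB")


def parse_ch10_header(payload: bytes, offset: int = 0) -> Optional[Dict[str, int]]:
    """Parse a Chapter 10 primary header at `offset`, or return None."""
    if len(payload) - offset < CH10_HEADER_LEN:
        return None
    (sync, channel_id, packet_len, data_len,
     version, sequence, flags, data_type) = _CH10_FIXED.unpack_from(payload, offset)
    if sync != CH10_SYNC:
        return None
    # 48-bit relative time counter occupies header bytes 16..21
    rtc = int.from_bytes(payload[offset + 16:offset + 22], "little")
    checksum = struct.unpack_from("<H", payload, offset + 22)[0]
    return {
        "channel_id": channel_id,
        "packet_length": packet_len,
        "data_length": data_len,
        "data_type_version": version,
        "sequence_number": sequence,
        "packet_flags": flags,
        "data_type": data_type,
        "relative_time_counter": rtc,
        "header_checksum": checksum,
    }


def find_ch10_offset(payload: bytes) -> int:
    """Heuristic scan for an embedded sync pattern; returns -1 if absent."""
    return payload.find(b"\x25\xEB")  # 0xEB25 as stored little-endian
```

For container formats, a caller could pass the result of `find_ch10_offset()` as the `offset` argument and treat a `None` result as "not Chapter 10".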
**PTP Dissector** (`PTPDissector`):
- IEEE 1588-2019 Precision Time Protocol
- Message types: Sync, Delay_Req, Announce, etc.
- Timestamp extraction and correction field parsing
- Grandmaster clock quality analysis
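As a rough sketch of the message type and correction field extraction, the common 34-byte PTP header can be unpacked like this. The function name and returned keys are assumptions for illustration (only `message_type_name` mirrors a key used elsewhere in this guide); the field offsets and message type codes follow the IEEE 1588 common header.

```python
import struct
from typing import Any, Dict, Optional

PTP_HEADER_LEN = 34
PTP_MESSAGE_TYPES = {
    0x0: "Sync", 0x1: "Delay_Req", 0x2: "Pdelay_Req", 0x3: "Pdelay_Resp",
    0x8: "Follow_Up", 0x9: "Delay_Resp", 0xA: "Pdelay_Resp_Follow_Up",
    0xB: "Announce", 0xC: "Signaling", 0xD: "Management",
}


def parse_ptp_header(payload: bytes) -> Optional[Dict[str, Any]]:
    """Parse the PTP common header (network byte order), or return None."""
    if len(payload) < PTP_HEADER_LEN:
        return None
    message_type = payload[0] & 0x0F        # low nibble of byte 0
    version = payload[1] & 0x0F             # low nibble of byte 1
    message_length, domain = struct.unpack_from("!HB", payload, 2)
    # correctionField: signed 64-bit, nanoseconds scaled by 2**16
    correction_raw = struct.unpack_from("!q", payload, 8)[0]
    sequence_id = struct.unpack_from("!H", payload, 30)[0]
    return {
        "message_type": message_type,
        "message_type_name": PTP_MESSAGE_TYPES.get(message_type, "Unknown"),
        "version": version,
        "message_length": message_length,
        "domain": domain,
        "correction_ns": correction_raw / 65536.0,
        "sequence_id": sequence_id,
    }
```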
**IENA Dissector** (`IENADissector`):
- Airbus Improved Ethernet Network Architecture
- Packet types: P-type, D-type, N-type, M-type, Q-type
- Parameter ID extraction and delay analysis
- Variable-length message parsing
#### 4. Chapter 10 Packet System (`chapter10_packet.py`)
##### Core Packet Class
```python
from typing import Dict, Optional

class Chapter10Packet:
    def __init__(self, packet, original_frame_num: Optional[int] = None): ...
    def _parse_ch10_header(self) -> Optional[Dict]: ...
    def get_data_payload(self) -> Optional[bytes]: ...
    def decode_data(self, tmats_scaling_dict, tmats_content) -> Optional["DecodedData"]: ...
```
##### Data Decoders
- **AnalogDecoder**: Multi-channel analog data with scaling
- **PCMDecoder**: Pulse Code Modulation format
- **AnalogTimeFormatDecoder**: Time-synchronized analog data with TMATS integration
##### TMATS Integration
- Scaling parameter extraction from TMATS content
- Channel configuration parsing
- Engineering unit conversion
- Gain/offset/full-scale calculations
#### 5. Text User Interface (`TUIInterface`)
##### Three-Panel Layout
```
┌─ FLOWS (Left 60%) ───────────┬─ DETAILS (Right 40%) ──┐
│ IP flows + frame             │ Flow info + table      │
│ type breakdowns              │ Frame type details     │
├──────────────────────────────┴────────────────────────┤
│ TIMING VISUALIZATION (Bottom 30%)                     │
│ ASCII timeline with deviation plotting                │
└───────────────────────────────────────────────────────┘
```
##### Key Interface Features
- **Flow List**: Hierarchical display of flows and frame types
- **Detail Panel**:
- Flow summary (packets, bytes, protocols)
- Frame type table: `Type | #Pkts | Bytes | Avg ΔT | 2σ Out`
- Timing statistics and threshold calculations
- **Timeline Panel**: ASCII deviation chart with scaling
- **Navigation**: Arrow keys, view switching (main/dissection)
## Technical Implementation Details
### Statistical Analysis Algorithm
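```python
import statistics

def calculate_statistics(self, threshold_sigma: float = 3.0):
    # threshold_sigma is an illustrative parameter name for the configurable N
    for flow in self.flows.values():
        # statistics.stdev() needs at least two inter-arrival samples
        if len(flow.inter_arrival_times) < 2:
            continue
        flow.avg_inter_arrival = statistics.mean(flow.inter_arrival_times)
        flow.std_inter_arrival = statistics.stdev(flow.inter_arrival_times)
        # Detect outliers: gaps longer than mean + N sigma
        threshold = flow.avg_inter_arrival + threshold_sigma * flow.std_inter_arrival
        for i, inter_time in enumerate(flow.inter_arrival_times):
            if inter_time > threshold:
                # inter_arrival_times[i] is the gap ending at frame_numbers[i + 1]
                frame_number = flow.frame_numbers[i + 1]
                flow.outlier_frames.append(frame_number)
                flow.outlier_details.append((frame_number, inter_time))
```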
### Protocol Classification Hierarchy
1. **Specialized Protocols**: Chapter 10, PTP, IENA (highest priority)
2. **Port-based Detection**: DNS (53), HTTP (80), NTP (123), etc.
3. **Transport Layer**: TCP, UDP, ICMP, IGMP
4. **Fallback**: Generic classification
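The priority order above can be sketched as a single function. The function name, its parameters, and the `"Other"` fallback label are assumptions for illustration; only the three port numbers come from the list above.

```python
from typing import Optional

# Port numbers taken from the hierarchy above; extend as needed
WELL_KNOWN_PORTS = {53: "DNS", 80: "HTTP", 123: "NTP"}


def classify_protocol(
    specialized: Optional[str],
    transport: Optional[str],
    src_port: Optional[int] = None,
    dst_port: Optional[int] = None,
) -> str:
    """Return the highest-priority label that applies to a packet."""
    # 1. A specialized dissector match (Chapter 10, PTP, IENA) wins outright
    if specialized:
        return specialized
    # 2. Port-based detection, checking the destination port first
    for port in (dst_port, src_port):
        if port in WELL_KNOWN_PORTS:
            return WELL_KNOWN_PORTS[port]
    # 3. Transport layer name (TCP, UDP, ICMP, IGMP)
    if transport:
        return transport
    # 4. Generic fallback
    return "Other"
```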
### Frame Type Classification Logic
```python
def _classify_frame_type(self, packet, dissection):
    layers = dissection.get('layers', {})
    # Chapter 10 takes precedence
    if 'chapter10' in layers:
        if self._is_tmats_frame(packet, layers['chapter10']):
            return 'TMATS'
        else:
            return 'CH10-Data'
    # PTP message types
    if 'ptp' in layers:
        msg_type = layers['ptp'].get('message_type_name', 'Unknown')
        return f'PTP-{msg_type}'
    # IENA packet types
    if 'iena' in layers:
        packet_type = layers['iena'].get('packet_type_name', 'Unknown')
        return f'IENA-{packet_type}'
    # Fallback to transport/application protocols
    return self._classify_standard_protocols(packet)
```
### Timeline Visualization Algorithm
- **Deviation Calculation**: `deviation = inter_arrival_time - average_inter_arrival`
- **Scaling**: Vertical position based on deviation magnitude
- **Character Selection**:
- `·` for normal timing (within 10% of average)
- `•`/`○` for moderate deviations (within 50%)
- `█`/`▄` for significant outliers
- **Time Mapping**: Horizontal position maps to actual timestamps
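The character selection rule can be sketched as below. The function names are hypothetical, and two details are assumptions not stated above: the first character of each pair is used for later-than-average packets and the second for earlier-than-average ones, and the 10% and 50% bands are treated as inclusive.

```python
from typing import Iterable


def deviation_char(inter_arrival: float, average: float) -> str:
    """Pick a timeline character from the deviation relative to the average."""
    deviation = inter_arrival - average
    if average <= 0:
        return "·"  # no meaningful average yet; draw as normal
    ratio = abs(deviation) / average
    if ratio <= 0.10:
        return "·"                                # normal timing
    if ratio <= 0.50:
        return "•" if deviation > 0 else "○"      # moderate deviation
    return "█" if deviation > 0 else "▄"          # significant outlier


def render_timeline(inter_arrivals: Iterable[float], average: float) -> str:
    """Render one character per inter-arrival sample, in arrival order."""
    return "".join(deviation_char(value, average) for value in inter_arrivals)
```

This sketch places one character per sample; mapping horizontal position to actual timestamps and scaling the vertical position would be layered on top.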
## Dependencies and Requirements
### Core Libraries
- **scapy**: Packet capture and parsing (`pip install scapy`)
- **numpy**: Numerical computations (`pip install numpy`)
- **curses**: Terminal UI framework (built-in on Unix systems)
- **statistics**: Statistical calculations (built-in)
- **struct**: Binary data parsing (built-in)
- **threading**: Live capture support (built-in)
### Modern Modular File Structure
```
streamlens/
├── ethernet_analyzer_modular.py # Main entry point
├── analyzer/ # Core analysis package
│ ├── __init__.py # Package initialization
│ ├── main.py # CLI handling and main application logic
│ ├── analysis/ # Analysis engine modules
│ │ ├── __init__.py # Analysis package init
│ │ ├── core.py # EthernetAnalyzer main class
│ │ ├── flow_manager.py # FlowManager for packet processing
│ │ └── statistics.py # StatisticsEngine with sigma calculations
│ ├── models/ # Data structures and models
│ │ ├── __init__.py # Models package init
│ │ ├── flow_stats.py # FlowStats and FrameTypeStats dataclasses
│ │ └── analysis_results.py # AnalysisResult containers
│ ├── protocols/ # Protocol dissector system
│ │ ├── __init__.py # Protocols package init
│ │ ├── base.py # ProtocolDissector base classes
│ │ ├── chapter10.py # Chapter10Dissector (IRIG106)
│ │ ├── ptp.py # PTPDissector (IEEE 1588)
│ │ ├── iena.py # IENADissector (Airbus)
│ │ └── standard.py # StandardProtocolDissector
│ ├── tui/ # Text User Interface system
│ │ ├── __init__.py # TUI package init
│ │ ├── interface.py # TUIInterface main controller
│ │ ├── navigation.py # NavigationHandler for input handling
│ │ └── panels/ # UI panel components
│ │ ├── __init__.py # Panels package init
│ │ ├── flow_list.py # FlowListPanel for flow display
│ │ ├── detail_panel.py # DetailPanel for flow details
│ │ └── timeline.py # TimelinePanel for visualization
│ └── utils/ # Utility modules
│ ├── __init__.py # Utils package init
│ ├── pcap_loader.py # PCAPLoader for file handling
│ └── live_capture.py # LiveCapture for network monitoring
├── *.pcapng # Sample capture files for testing
├── README.md # User guide and quick start
└── ai.comprehensive_replay.md # This comprehensive development guide
```
## Command Line Interface
```bash
# Analyze PCAP file with TUI (flows sorted by largest sigma outliers)
python ethernet_analyzer_modular.py --pcap file.pcap
# Console output mode with sigma deviation display
python ethernet_analyzer_modular.py --pcap file.pcap --no-tui
# Generate comprehensive outlier report
python ethernet_analyzer_modular.py --pcap file.pcap --report
# Get PCAP file information only
python ethernet_analyzer_modular.py --pcap file.pcap --info
# Live capture with real-time statistics
python ethernet_analyzer_modular.py --live --interface eth0
# Configure outlier threshold (default: 3.0 sigma)
python ethernet_analyzer_modular.py --pcap file.pcap --outlier-threshold 2.0
# With BPF filtering for targeted capture
python ethernet_analyzer_modular.py --live --filter "port 319 or port 320"
```
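The flags shown above could be declared with `argparse` roughly as follows. This is a sketch under assumptions: the help strings are invented, and treating `--pcap` and `--live` as mutually exclusive is a design guess rather than documented behavior.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="ethernet_analyzer_modular.py",
        description="Analyze PCAP files or live traffic",
    )
    # Assumption: exactly one input source must be chosen
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--pcap", metavar="FILE", help="PCAP file to analyze")
    source.add_argument("--live", action="store_true", help="capture live traffic")
    parser.add_argument("--interface", help="interface for live capture, e.g. eth0")
    parser.add_argument("--filter", help="BPF filter expression for live capture")
    parser.add_argument("--no-tui", action="store_true", help="console output mode")
    parser.add_argument("--report", action="store_true", help="generate outlier report")
    parser.add_argument("--info", action="store_true", help="show PCAP file info only")
    parser.add_argument(
        "--outlier-threshold", type=float, default=3.0,
        help="outlier threshold in sigma (default: 3.0)",
    )
    return parser
```

Calling `build_parser().parse_args()` yields a namespace with attributes such as `args.pcap`, `args.no_tui`, and `args.outlier_threshold`.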
## Key Algorithms and Techniques
### 1. Inter-arrival Time Calculation
- Track timestamps for each packet in a flow
- Calculate time differences between consecutive packets
- Handle both flow-level and frame-type-level statistics
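A minimal sketch of these steps, assuming packets have already been reduced to `(timestamp, src_ip, dst_ip, frame_type)` tuples in capture order (that tuple shape and the function names are illustrative, not the project's API):

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

PacketInfo = Tuple[float, str, str, str]  # (timestamp, src_ip, dst_ip, frame_type)


def inter_arrival_times(timestamps: List[float]) -> List[float]:
    """Time differences between consecutive packets."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def group_inter_arrivals(packets: Iterable[PacketInfo]):
    """Compute inter-arrival times per flow and per (flow, frame type)."""
    flow_ts: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    type_ts: Dict[Tuple[str, str, str], List[float]] = defaultdict(list)
    for timestamp, src, dst, frame_type in packets:
        flow_ts[(src, dst)].append(timestamp)
        type_ts[(src, dst, frame_type)].append(timestamp)
    per_flow = {key: inter_arrival_times(ts) for key, ts in flow_ts.items()}
    per_type = {key: inter_arrival_times(ts) for key, ts in type_ts.items()}
    return per_flow, per_type
```

A flow with N packets yields N - 1 inter-arrival samples, which is why an outlier at index `i` corresponds to the packet at index `i + 1`.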
### 2. Advanced Outlier Detection and Flow Prioritization
- Use configurable sigma rule: outliers are packets with inter-arrival times > (mean + N×std_dev)
- Default threshold: 3σ (configurable via --outlier-threshold parameter)
- Calculate maximum sigma deviation for each flow across all outliers
- Store both frame numbers and actual time deltas for detailed analysis
- Apply at both flow and frame-type granularity
- **Key Innovation**: Sort flows by largest sigma deviation to prioritize most problematic flows
```python
def get_max_sigma_deviation(self, flow: FlowStats) -> float:
    """Get the maximum sigma deviation for any outlier in this flow"""
    max_sigma = 0.0
    # Check flow-level outliers
    if flow.outlier_details and flow.std_inter_arrival > 0:
        for frame_num, inter_arrival_time in flow.outlier_details:
            sigma_deviation = (inter_arrival_time - flow.avg_inter_arrival) / flow.std_inter_arrival
            max_sigma = max(max_sigma, sigma_deviation)
    # Check frame-type-level outliers
    for frame_type, ft_stats in flow.frame_types.items():
        if ft_stats.outlier_details and ft_stats.std_inter_arrival > 0:
            for frame_num, inter_arrival_time in ft_stats.outlier_details:
                sigma_deviation = (inter_arrival_time - ft_stats.avg_inter_arrival) / ft_stats.std_inter_arrival
                max_sigma = max(max_sigma, sigma_deviation)
    return max_sigma
```
### 3. Protocol Dissection Pipeline
- Layer-by-layer parsing starting with Ethernet/IP/UDP
- Specialized protocol detection based on ports, patterns, payload analysis
- Error handling and graceful degradation for malformed packets
### 4. TMATS Parameter Parsing
- Line-by-line parsing of TMATS configuration data
- Key-value pair extraction with hierarchical channel mapping
- Parameter validation and default value assignment
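The parsing steps above might look like the following sketch. It assumes one `CODE:value;` attribute per line and groups attributes by a trailing `-N` index as a stand-in for the channel mapping; the function names are hypothetical, and real TMATS files can span attributes across lines, which this does not handle.

```python
import re
from typing import Dict

_ATTRIBUTE = re.compile(r"^\s*([^:;]+):(.*?);\s*$")
_INDEX_SUFFIX = re.compile(r"^(.*)-(\d+)$")


def parse_tmats(text: str) -> Dict[str, str]:
    """Extract CODE:value; pairs, one per line; other lines are skipped."""
    attributes: Dict[str, str] = {}
    for line in text.splitlines():
        match = _ATTRIBUTE.match(line)
        if match:
            attributes[match.group(1).strip()] = match.group(2).strip()
    return attributes


def group_by_index(attributes: Dict[str, str]) -> Dict[int, Dict[str, str]]:
    """Group attributes whose code ends in -N under the integer N."""
    channels: Dict[int, Dict[str, str]] = {}
    for code, value in attributes.items():
        match = _INDEX_SUFFIX.match(code)
        if match:
            channels.setdefault(int(match.group(2)), {})[match.group(1)] = value
    return channels


def get_with_default(attributes: Dict[str, str], code: str, default: str) -> str:
    """Default value assignment for attributes missing from the file."""
    return attributes.get(code, default)
```

For example, a line such as `R-1\DSI-2:PCMIn;` would end up under index `2` with the key `R-1\DSI`.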
### 5. Live Capture Architecture with Real-time Statistics
- Threaded packet capture using scapy.sniff()
- Real-time running statistical updates during capture
- Thread-safe data structures and stop conditions
- Running averages calculated incrementally for performance
- Live outlier detection with immediate flagging
- TUI updates every 0.5 seconds during live capture
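One common way to compute running averages incrementally is Welford's online algorithm, sketched below. This is a swapped-in illustration of the technique rather than the project's `StatisticsEngine`; the class name, the lock, and the `is_outlier()` helper are assumptions.

```python
import math
import threading


class RunningStats:
    """Incremental mean and sample standard deviation (Welford's algorithm)."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean
        self._lock = threading.Lock()  # capture thread writes, UI thread reads

    def update(self, value: float) -> None:
        with self._lock:
            self.count += 1
            delta = value - self.mean
            self.mean += delta / self.count
            self._m2 += delta * (value - self.mean)

    @property
    def stdev(self) -> float:
        with self._lock:
            if self.count < 2:
                return 0.0
            return math.sqrt(self._m2 / (self.count - 1))

    def is_outlier(self, value: float, threshold_sigma: float = 3.0) -> bool:
        """Flag a new inter-arrival time against the statistics seen so far."""
        std = self.stdev
        return std > 0 and value > self.mean + threshold_sigma * std
```

Each update is O(1) and needs no stored sample list, so a capture callback can check `is_outlier()` before calling `update()` for every packet.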
### 6. Modular Architecture Design
- **Separation of Concerns**: Clean boundaries between analysis, UI, protocols, and utilities
- **Package Structure**: Logical grouping of related functionality
- **Dependency Injection**: Components receive dependencies through constructors
- **Interface-based Design**: Protocol dissectors implement common interfaces
- **Error Handling**: Graceful degradation and comprehensive error reporting
## Extensibility Points
### Adding New Protocol Dissectors
1. Implement dissector class with `dissect(packet)` method
2. Return `DissectionResult` with protocol type and parsed fields
3. Register in `EnhancedFrameDissector.dissectors` dictionary
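The three steps can be sketched as follows. Everything here is illustrative: the `DissectionResult` fields, the `HeartbeatDissector` and its made-up wire format, and the simplified `EnhancedFrameDissector` that takes raw bytes instead of a scapy packet are all assumptions, not the project's actual classes.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DissectionResult:
    protocol: str                                  # assumed field name
    fields: Dict[str, Any] = field(default_factory=dict)


class HeartbeatDissector:
    """Hypothetical protocol: 2-byte magic 0xBEEF, then a 2-byte counter."""

    MAGIC = b"\xBE\xEF"

    def dissect(self, packet: bytes) -> Optional[DissectionResult]:
        if len(packet) < 4 or packet[:2] != self.MAGIC:
            return None
        counter = int.from_bytes(packet[2:4], "big")
        return DissectionResult("heartbeat", {"counter": counter})


class EnhancedFrameDissector:
    """Simplified stand-in that runs every registered dissector."""

    def __init__(self) -> None:
        self.dissectors: Dict[str, Any] = {}

    def dissect_frame(self, packet: bytes, frame_num: int) -> Dict[str, Any]:
        layers: Dict[str, Any] = {}
        for name, dissector in self.dissectors.items():
            try:
                result = dissector.dissect(packet)
            except Exception:
                continue  # degrade gracefully on malformed packets
            if result is not None:
                layers[name] = result.fields
        return {"frame_number": frame_num, "layers": layers}


frame_dissector = EnhancedFrameDissector()
frame_dissector.dissectors["heartbeat"] = HeartbeatDissector()  # step 3: register
```

Keying the registry by protocol name lets frame type classification look up `layers['heartbeat']` the same way it checks `'ptp'` or `'iena'`.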
### Custom Frame Type Classification
- Override `_classify_frame_type()` method
- Add custom logic based on dissection results
- Return descriptive frame type strings
### Statistical Analysis Extensions
- Extend `FrameTypeStats` and `FlowStats` dataclasses
- Add custom statistical calculations in `calculate_statistics()`
- Implement additional outlier detection algorithms
## Performance Considerations
- **Memory Management**: Use generators for large PCAP files
- **Threading**: Separate capture and analysis threads for live mode
- **Statistical Efficiency**: Incremental statistics calculation
- **TUI Optimization**: Efficient screen drawing with curses error handling
## Testing and Validation
The project includes comprehensive test suites:
- **Protocol Dissection Tests**: Validate parsing accuracy
- **Statistical Analysis Tests**: Verify timing calculations
- **TUI Layout Tests**: Interface rendering validation
- **Integration Tests**: End-to-end workflow verification
This guide covers the scope and technical design of StreamLens, the Ethernet traffic analyzer described above, in enough detail to recreate the tool.