StreamLens/ai.comprehensive_replay.md
2025-07-25 21:45:07 -04:00

StreamLens - Comprehensive Development Guide

Project Overview

Build a sophisticated Python-based network traffic analysis tool called "StreamLens" that analyzes both PCAP files and live network streams. The tool specializes in telemetry and avionics protocols with advanced statistical timing analysis, outlier detection, and sigma-based flow prioritization. Features a modern modular architecture with both a text-based user interface (TUI) and a professional PySide6 GUI with interactive matplotlib signal visualization.

Core Requirements

Primary Functionality

  • PCAP Analysis: Load and analyze packet capture files with file validation and info extraction
  • Live Capture: Real-time network traffic monitoring with threading support
  • Flow Analysis: Group packets by source-destination IP pairs with comprehensive statistics
  • Advanced Statistical Analysis: Configurable sigma thresholds, running averages, coefficient of variation
  • Sigma-Based Outlier Detection: Identify packets with timing deviations using configurable thresholds (default 3σ)
  • Flow Prioritization: Automatically sort flows by largest sigma deviation for efficient analysis
  • Interactive TUI: Three-panel interface with real-time updates and navigation
  • Modern GUI Interface: Professional PySide6-based GUI with embedded matplotlib plots
  • Modular Architecture: Clean separation of concerns with analyzers, models, protocols, TUI, GUI, and utilities

Advanced Features

  • Specialized Protocol Support: Chapter 10 (IRIG106), PTP (IEEE 1588), IENA (Airbus)
  • Frame Type Classification: Hierarchical breakdown by protocol and frame types with per-type statistics
  • Timeline Visualization: ASCII-based timing deviation charts with dynamic scaling
  • Real-time Statistics: Running averages and outlier detection during live capture
  • Comprehensive Reporting: Detailed outlier reports with sigma deviation calculations
  • High Jitter Detection: Coefficient of variation analysis for identifying problematic flows
  • Configurable Analysis: Adjustable outlier thresholds and analysis parameters
  • Chapter 10 Signal Visualization: Real-time matplotlib-based signal plotting with TMATS integration
  • Interactive Signal Analysis: Press 'v' in TUI to generate signal files, or use GUI for embedded interactive plots
  • Threading-Safe Visualization: Proper Qt integration for GUI, file output for TUI to prevent segmentation faults
  • Cross-Platform GUI: PySide6-based interface with file dialogs, progress bars, and embedded matplotlib widgets
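
The high-jitter check based on the coefficient of variation (standard deviation divided by mean) could look roughly like the sketch below. The function names and the 0.5 cutoff are illustrative assumptions, not values taken from StreamLens.

```python
import statistics
from typing import List


def coefficient_of_variation(inter_arrival_times: List[float]) -> float:
    """Return stdev / mean of the inter-arrival times, or 0.0 when undefined."""
    if len(inter_arrival_times) < 2:
        return 0.0  # stdev needs at least two samples
    mean = statistics.mean(inter_arrival_times)
    if mean == 0:
        return 0.0
    return statistics.stdev(inter_arrival_times) / mean


def is_high_jitter(inter_arrival_times: List[float], cv_threshold: float = 0.5) -> bool:
    # cv_threshold is a hypothetical cutoff chosen for this example
    return coefficient_of_variation(inter_arrival_times) > cv_threshold


steady = [0.010, 0.010, 0.011, 0.009, 0.010]
bursty = [0.001, 0.050, 0.002, 0.040, 0.001]
print(is_high_jitter(steady), is_high_jitter(bursty))
```

Because the coefficient of variation is unitless, one cutoff can be applied to flows with very different packet rates.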

Architecture Overview

Core Components

1. Data Structures (dataclasses)

@dataclass
class FrameTypeStats:
    frame_type: str
    count: int
    total_bytes: int
    timestamps: List[float]
    frame_numbers: List[int]
    inter_arrival_times: List[float]
    avg_inter_arrival: float
    std_inter_arrival: float
    outlier_frames: List[int]
    outlier_details: List[Tuple[int, float]]

@dataclass
class FlowStats:
    src_ip: str
    dst_ip: str
    frame_count: int
    timestamps: List[float]
    frame_numbers: List[int]
    inter_arrival_times: List[float]
    avg_inter_arrival: float
    std_inter_arrival: float
    outlier_frames: List[int]
    outlier_details: List[Tuple[int, float]]
    total_bytes: int
    protocols: Set[str]
    detected_protocol_types: Set[str]
    frame_types: Dict[str, FrameTypeStats]

2. Main Analysis Engine (EthernetAnalyzer)

  • Packet Processing: Single-packet and batch processing capabilities
  • Flow Management: Track unique source-destination pairs
  • Protocol Detection: Multi-layer protocol identification
  • Statistical Calculation: Mean, standard deviation, outlier detection using a configurable sigma threshold (default 3σ)
  • Live Capture: Threading support for real-time analysis

3. Enhanced Frame Dissection System

Base Dissector Interface

class EnhancedFrameDissector:
    def dissect_frame(self, packet: Packet, frame_num: int) -> Dict[str, Any]: ...

Specialized Protocol Dissectors

Chapter 10 Dissector (Chapter10Dissector):

  • IRIG106-17 telemetry standard support
  • Sync pattern detection (0xEB25)
  • Header parsing: channel ID, data types, timing counters
  • Data format support: Analog, PCM, Ethernet Format 0
  • Container format handling (embedded Ch10 in other protocols)
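
A minimal sketch of sync pattern detection and header parsing is shown below. It assumes the 24-byte little-endian primary header layout described in IRIG 106 Chapter 10 and simply takes the first occurrence of the sync pattern in the payload. The function name and returned keys are hypothetical, and the header checksum is not validated here.

```python
import struct
from typing import Any, Dict, Optional

CH10_SYNC = 0xEB25
# sync, channel ID, packet length, data length, data type version,
# sequence number, packet flags, data type, 48-bit RTC, header checksum
CH10_HEADER = struct.Struct("<HHIIBBBB6sH")  # 24 bytes, little-endian


def find_ch10_header(payload: bytes) -> Optional[Dict[str, Any]]:
    """Locate the first sync pattern in payload and parse the header after it."""
    sync_bytes = struct.pack("<H", CH10_SYNC)  # b"\x25\xeb" on the wire
    offset = payload.find(sync_bytes)
    if offset == -1 or len(payload) - offset < CH10_HEADER.size:
        return None
    (_sync, channel_id, packet_length, data_length, version,
     sequence, flags, data_type, rtc, checksum) = CH10_HEADER.unpack_from(payload, offset)
    return {
        "offset": offset,  # useful when Ch10 is embedded in another protocol
        "channel_id": channel_id,
        "packet_length": packet_length,
        "data_length": data_length,
        "data_type_version": version,
        "sequence_number": sequence,
        "packet_flags": flags,
        "data_type": data_type,
        "relative_time_counter": int.from_bytes(rtc, "little"),
        "header_checksum": checksum,
    }
```

Searching for the sync pattern instead of assuming offset 0 is what allows the container case, where the Chapter 10 packet follows some other header inside the payload.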

PTP Dissector (PTPDissector):

  • IEEE 1588-2019 Precision Time Protocol
  • Message types: Sync, Delay_Req, Announce, etc.
  • Timestamp extraction and correction field parsing
  • Grandmaster clock quality analysis
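
The common PTP header fields can be read as in the sketch below, which assumes the 34-byte IEEE 1588 v2 common header in network byte order. The function name is hypothetical; the message_type_name key matches the one used by the frame type classifier later in this guide.

```python
import struct
from typing import Any, Dict, Optional

PTP_MESSAGE_TYPES = {
    0x0: "Sync", 0x1: "Delay_Req", 0x2: "Pdelay_Req", 0x3: "Pdelay_Resp",
    0x8: "Follow_Up", 0x9: "Delay_Resp", 0xA: "Pdelay_Resp_Follow_Up",
    0xB: "Announce", 0xC: "Signaling", 0xD: "Management",
}
PTP_HEADER_LEN = 34


def parse_ptp_header(payload: bytes) -> Optional[Dict[str, Any]]:
    """Parse the common PTP header from a UDP payload (ports 319/320)."""
    if len(payload) < PTP_HEADER_LEN:
        return None
    message_type = payload[0] & 0x0F       # low nibble of byte 0
    version = payload[1] & 0x0F            # low nibble of byte 1
    (message_length,) = struct.unpack_from("!H", payload, 2)
    (correction_raw,) = struct.unpack_from("!q", payload, 8)
    (sequence_id,) = struct.unpack_from("!H", payload, 30)
    return {
        "message_type": message_type,
        "message_type_name": PTP_MESSAGE_TYPES.get(message_type, "Unknown"),
        "version": version,
        "message_length": message_length,
        "domain_number": payload[4],
        # correctionField is nanoseconds scaled by 2**16
        "correction_ns": correction_raw / 65536.0,
        "sequence_id": sequence_id,
    }
```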

IENA Dissector (IENADissector):

  • Airbus Improved Ethernet Network Architecture
  • Packet types: P-type, D-type, N-type, M-type, Q-type
  • Parameter ID extraction and delay analysis
  • Variable-length message parsing

4. Chapter 10 Packet System (chapter10_packet.py)

Core Packet Class

class Chapter10Packet:
    def __init__(self, packet, original_frame_num: Optional[int] = None): ...
    def _parse_ch10_header(self) -> Optional[Dict]: ...
    def get_data_payload(self) -> Optional[bytes]: ...
    def decode_data(self, tmats_scaling_dict, tmats_content) -> Optional[DecodedData]: ...

Data Decoders

  • AnalogDecoder: Multi-channel analog data with scaling
  • PCMDecoder: Pulse Code Modulation format
  • AnalogTimeFormatDecoder: Time-synchronized analog data with TMATS integration

TMATS Integration

  • Scaling parameter extraction from TMATS content
  • Channel configuration parsing
  • Engineering unit conversion
  • Gain/offset/full-scale calculations
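
Engineering unit conversion with a gain and offset can be sketched as a simple linear transform. ChannelScaling and to_engineering_units are hypothetical names for this example; the real decoders may apply additional full-scale or format-specific steps.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ChannelScaling:
    """Per-channel scaling, assumed to come from TMATS (hypothetical container)."""
    gain: float = 1.0
    offset: float = 0.0
    units: str = "counts"


def to_engineering_units(raw_samples: Iterable[int], scaling: ChannelScaling) -> List[float]:
    # Linear conversion: engineering value = raw * gain + offset
    return [sample * scaling.gain + scaling.offset for sample in raw_samples]


volts = to_engineering_units([0, 10, -10], ChannelScaling(gain=0.5, offset=1.0, units="V"))
print(volts)  # [1.0, 6.0, -4.0]
```

The defaults (gain 1.0, offset 0.0) leave samples as raw counts, which is a reasonable fallback when a channel has no scaling entry.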

5. Text User Interface (TUIInterface)

Three-Panel Layout

┌─ FLOWS (Left 60%) ─────┬─ DETAILS (Right 40%) ─┐
│ IP flows + frame       │ Flow info + table     │
│ type breakdowns        │ Frame type details    │
├────────────────────────┴───────────────────────┤
│ TIMING VISUALIZATION (Bottom 30%)              │
│ ASCII timeline with deviation plotting         │
└────────────────────────────────────────────────┘

Key Interface Features

  • Flow List: Hierarchical display of flows and frame types
  • Detail Panel:
    • Flow summary (packets, bytes, protocols)
    • Frame type table: Type | #Pkts | Bytes | Avg ΔT | 2σ Out
    • Timing statistics and threshold calculations
  • Timeline Panel: ASCII deviation chart with scaling
  • Navigation: Arrow keys, view switching (main/dissection)

Technical Implementation Details

Statistical Analysis Algorithm

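import statistics

def calculate_statistics(self, outlier_threshold_sigma: float = 3.0):
    for flow in self.flows.values():
        # statistics.stdev() needs at least two inter-arrival samples
        if len(flow.inter_arrival_times) < 2:
            continue

        # Mean and sample standard deviation of the inter-arrival times
        flow.avg_inter_arrival = statistics.mean(flow.inter_arrival_times)
        flow.std_inter_arrival = statistics.stdev(flow.inter_arrival_times)

        # Detect outliers (> mean + N·σ; N defaults to 3.0, see --outlier-threshold)
        threshold = flow.avg_inter_arrival + outlier_threshold_sigma * flow.std_inter_arrival
        flow.outlier_frames.clear()   # avoid duplicates when recalculating
        flow.outlier_details.clear()
        for i, inter_time in enumerate(flow.inter_arrival_times):
            if inter_time > threshold:
                # inter_arrival_times[i] is the gap that ends at frame i + 1
                frame_number = flow.frame_numbers[i + 1]
                flow.outlier_frames.append(frame_number)
                flow.outlier_details.append((frame_number, inter_time))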

Protocol Classification Hierarchy

  1. Specialized Protocols: Chapter 10, PTP, IENA (highest priority)
  2. Port-based Detection: DNS (53), HTTP (80), NTP (123), etc.
  3. Transport Layer: TCP, UDP, ICMP, IGMP
  4. Fallback: Generic classification
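
The priority order above can be sketched as a chain of checks that returns at the first match. The function name, argument shapes, and returned labels are assumptions for illustration; only the ports listed above are included.

```python
from typing import Dict, Optional, Set

SPECIALIZED = {"chapter10": "Chapter10", "ptp": "PTP", "iena": "IENA"}
WELL_KNOWN_PORTS: Dict[int, str] = {53: "DNS", 80: "HTTP", 123: "NTP"}
TRANSPORTS = {"TCP", "UDP", "ICMP", "IGMP"}


def classify_protocol(layers: Set[str], src_port: Optional[int],
                      dst_port: Optional[int], transport: Optional[str]) -> str:
    # 1. Specialized protocols win whenever a dissector recognized them
    for key, label in SPECIALIZED.items():
        if key in layers:
            return label
    # 2. Port-based detection (destination port checked first)
    for port in (dst_port, src_port):
        if port in WELL_KNOWN_PORTS:
            return WELL_KNOWN_PORTS[port]
    # 3. Transport layer
    if transport in TRANSPORTS:
        return transport
    # 4. Fallback
    return "Other"


print(classify_protocol({"ptp"}, 319, 319, "UDP"))  # PTP
print(classify_protocol(set(), 40000, 53, "UDP"))   # DNS
```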

Frame Type Classification Logic

def _classify_frame_type(self, packet, dissection):
    layers = dissection.get('layers', {})
    
    # Chapter 10 takes precedence
    if 'chapter10' in layers:
        if self._is_tmats_frame(packet, layers['chapter10']):
            return 'TMATS'
        else:
            return 'CH10-Data'
    
    # PTP message types
    if 'ptp' in layers:
        msg_type = layers['ptp'].get('message_type_name', 'Unknown')
        return f'PTP-{msg_type}'
    
    # IENA packet types  
    if 'iena' in layers:
        packet_type = layers['iena'].get('packet_type_name', 'Unknown')
        return f'IENA-{packet_type}'
    
    # Fallback to transport/application protocols
    return self._classify_standard_protocols(packet)

Timeline Visualization Algorithm

  • Deviation Calculation: deviation = inter_arrival_time - average_inter_arrival
  • Scaling: Vertical position based on deviation magnitude
  • Character Selection:
    • · for normal timing (within 10% of average)
    • / for moderate deviations (within 50%)
    • / for significant outliers
  • Time Mapping: Horizontal position maps to actual timestamps
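
The deviation bands and vertical scaling can be sketched as follows. The band names, function names, and the choice to draw positive deviations above the center row are assumptions for this example; the glyph used for each band is left to the caller.

```python
def classify_deviation(inter_arrival: float, average: float) -> str:
    """Bucket a sample by how far it is from the average, as a fraction of the average."""
    deviation = inter_arrival - average
    if average <= 0:
        return "normal" if deviation == 0 else "significant"
    ratio = abs(deviation) / average
    if ratio <= 0.10:
        return "normal"       # within 10% of average
    if ratio <= 0.50:
        return "moderate"     # within 50% of average
    return "significant"


def deviation_row(deviation: float, max_abs_deviation: float, height: int) -> int:
    """Map a deviation to a row index; row 0 is the top of the panel."""
    center = height // 2
    if max_abs_deviation <= 0:
        return center
    # Dynamic scaling: the largest deviation reaches the panel edge
    row = center - round((deviation / max_abs_deviation) * center)
    return min(max(row, 0), height - 1)


print(classify_deviation(0.013, 0.010))   # moderate
print(deviation_row(0.002, 0.002, 11))    # 0 (top row)
```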

Dependencies and Requirements

Core Libraries

  • scapy: Packet capture and parsing (pip install scapy)
  • numpy: Numerical computations (pip install numpy)
  • matplotlib: Signal visualization and plotting (pip install matplotlib)
  • PySide6: Modern Qt-based GUI framework (pip install PySide6)
  • tkinter: GUI backend for matplotlib (usually included with Python)
  • curses: Terminal UI framework (built-in on Unix systems)
  • statistics: Statistical calculations (built-in)
  • struct: Binary data parsing (built-in)
  • threading: Live capture support (built-in)

Modern Modular File Structure

streamlens/
├── streamlens.py                    # Main entry point
├── analyzer/                        # Core analysis package
│   ├── __init__.py                  # Package initialization
│   ├── main.py                     # CLI handling and main application logic
│   ├── analysis/                   # Analysis engine modules
│   │   ├── __init__.py            # Analysis package init
│   │   ├── core.py                # EthernetAnalyzer main class
│   │   ├── flow_manager.py        # FlowManager for packet processing
│   │   └── statistics.py          # StatisticsEngine with sigma calculations
│   ├── models/                     # Data structures and models
│   │   ├── __init__.py            # Models package init
│   │   ├── flow_stats.py          # FlowStats and FrameTypeStats dataclasses
│   │   └── analysis_results.py    # AnalysisResult containers
│   ├── protocols/                  # Protocol dissector system
│   │   ├── __init__.py            # Protocols package init
│   │   ├── base.py                # ProtocolDissector base classes
│   │   ├── chapter10.py           # Chapter10Dissector (IRIG106)
│   │   ├── ptp.py                 # PTPDissector (IEEE 1588)
│   │   ├── iena.py                # IENADissector (Airbus)
│   │   └── standard.py            # StandardProtocolDissector
│   ├── gui/                        # Modern GUI Interface system (NEW!)
│   │   ├── __init__.py            # GUI package init
│   │   └── main_window.py         # StreamLensMainWindow with PySide6 and matplotlib
│   ├── tui/                        # Text User Interface system
│   │   ├── __init__.py            # TUI package init
│   │   ├── interface.py           # TUIInterface main controller
│   │   ├── navigation.py          # NavigationHandler for input handling
│   │   └── panels/                # UI panel components
│   │       ├── __init__.py        # Panels package init
│   │       ├── flow_list.py       # FlowListPanel for flow display
│   │       ├── detail_panel.py    # DetailPanel for flow details
│   │       └── timeline.py        # TimelinePanel for visualization
│   └── utils/                      # Utility modules
│       ├── __init__.py            # Utils package init
│       ├── pcap_loader.py         # PCAPLoader for file handling
│       ├── live_capture.py        # LiveCapture for network monitoring
│       └── signal_visualizer.py   # Chapter 10 signal visualization (thread-safe)
├── *.pcapng                        # Sample capture files for testing
├── README.md                       # User guide and quick start
└── ai.comprehensive_replay.md      # This comprehensive development guide

Command Line Interface

# Launch modern GUI with interactive plots (RECOMMENDED)
python streamlens.py --gui --pcap file.pcap

# GUI mode only (then open file via File menu)
python streamlens.py --gui

# Analyze PCAP file with TUI (flows sorted by largest sigma outliers)
python streamlens.py --pcap file.pcap

# Console output mode with sigma deviation display
python streamlens.py --pcap file.pcap --no-tui

# Generate comprehensive outlier report
python streamlens.py --pcap file.pcap --report

# Get PCAP file information only
python streamlens.py --pcap file.pcap --info

# Live capture with real-time statistics
python streamlens.py --live --interface eth0

# Configure outlier threshold (default: 3.0 sigma)
python streamlens.py --pcap file.pcap --outlier-threshold 2.0

# With BPF filtering for targeted capture
python streamlens.py --live --filter "port 319 or port 320"
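
The flags above could be declared with argparse roughly as follows. This is a sketch under the assumption that the CLI uses argparse; the help strings and the build_parser name are illustrative, and only the flag names and the 3.0 default come from the examples above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="streamlens.py",
        description="Network traffic analysis for telemetry and avionics protocols",
    )
    parser.add_argument("--pcap", metavar="FILE", help="PCAP file to analyze")
    parser.add_argument("--live", action="store_true", help="capture live traffic")
    parser.add_argument("--interface", help="network interface for live capture")
    parser.add_argument("--filter", help="BPF filter expression for live capture")
    parser.add_argument("--gui", action="store_true", help="launch the PySide6 GUI")
    parser.add_argument("--no-tui", action="store_true", help="print to console instead of the TUI")
    parser.add_argument("--report", action="store_true", help="generate an outlier report")
    parser.add_argument("--info", action="store_true", help="show PCAP file information only")
    parser.add_argument("--outlier-threshold", type=float, default=3.0,
                        help="outlier threshold in sigma (default: 3.0)")
    return parser


args = build_parser().parse_args(["--pcap", "file.pcap", "--outlier-threshold", "2.0"])
print(args.pcap, args.outlier_threshold, args.no_tui)
```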

Key Algorithms and Techniques

1. Inter-arrival Time Calculation

  • Track timestamps for each packet in a flow
  • Calculate time differences between consecutive packets
  • Handle both flow-level and frame-type-level statistics
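
A minimal sketch of these steps, assuming timestamps arrive in capture order. The helper names and the (timestamp, frame_type) tuple shape are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def inter_arrival_times(timestamps: List[float]) -> List[float]:
    """Differences between consecutive timestamps (one fewer than the input)."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def inter_arrival_by_frame_type(
    packets: Iterable[Tuple[float, str]]
) -> Dict[str, List[float]]:
    """Group timestamps by frame type, then compute deltas within each group."""
    by_type: Dict[str, List[float]] = defaultdict(list)
    for timestamp, frame_type in packets:
        by_type[frame_type].append(timestamp)
    return {ftype: inter_arrival_times(ts) for ftype, ts in by_type.items()}


flow_packets = [(0.0, "PTP-Sync"), (0.25, "CH10-Data"), (0.75, "CH10-Data"), (1.0, "PTP-Sync")]
print(inter_arrival_times([t for t, _ in flow_packets]))  # [0.25, 0.5, 0.25]
print(inter_arrival_by_frame_type(flow_packets))
```

Computing deltas per frame type matters because a flow that mixes a 1 Hz message with a fast data stream looks noisy at flow level even when each stream is perfectly periodic.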

2. Advanced Outlier Detection and Flow Prioritization

  • Use configurable sigma rule: outliers are packets with inter-arrival times > (mean + N×std_dev)
  • Default threshold: 3σ (configurable via --outlier-threshold parameter)
  • Calculate maximum sigma deviation for each flow across all outliers
  • Store both frame numbers and actual time deltas for detailed analysis
  • Apply at both flow and frame-type granularity
  • Key Innovation: Sort flows by largest sigma deviation to prioritize most problematic flows
def get_max_sigma_deviation(self, flow: FlowStats) -> float:
    """Get the maximum sigma deviation for any outlier in this flow"""
    max_sigma = 0.0
    
    # Check flow-level outliers
    if flow.outlier_details and flow.std_inter_arrival > 0:
        for frame_num, inter_arrival_time in flow.outlier_details:
            sigma_deviation = (inter_arrival_time - flow.avg_inter_arrival) / flow.std_inter_arrival
            max_sigma = max(max_sigma, sigma_deviation)
    
    # Check frame-type-level outliers
    for frame_type, ft_stats in flow.frame_types.items():
        if ft_stats.outlier_details and ft_stats.std_inter_arrival > 0:
            for frame_num, inter_arrival_time in ft_stats.outlier_details:
                sigma_deviation = (inter_arrival_time - ft_stats.avg_inter_arrival) / ft_stats.std_inter_arrival
                max_sigma = max(max_sigma, sigma_deviation)
    
    return max_sigma
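
Given a per-flow maximum sigma value, prioritization reduces to a descending sort. The sketch below uses MiniFlow as a hypothetical, flow-level-only stand-in for FlowStats so that it runs on its own.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MiniFlow:
    """Hypothetical stand-in for FlowStats with only the fields needed here."""
    avg_inter_arrival: float
    std_inter_arrival: float
    outlier_details: List[Tuple[int, float]] = field(default_factory=list)


def max_sigma(flow: MiniFlow) -> float:
    if flow.std_inter_arrival <= 0:
        return 0.0
    sigmas = [(delta - flow.avg_inter_arrival) / flow.std_inter_arrival
              for _frame, delta in flow.outlier_details]
    return max([0.0] + sigmas)


def prioritize(flows: Dict[str, MiniFlow]) -> List[str]:
    """Return flow keys ordered from largest to smallest sigma deviation."""
    return sorted(flows, key=lambda key: max_sigma(flows[key]), reverse=True)


flows = {
    "a": MiniFlow(0.010, 0.001, [(7, 0.014)]),   # about 4 sigma
    "b": MiniFlow(0.010, 0.001, [(3, 0.020)]),   # about 10 sigma
    "c": MiniFlow(0.010, 0.0),                   # no variation, no outliers
}
print(prioritize(flows))  # ['b', 'a', 'c']
```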

3. Protocol Dissection Pipeline

  • Layer-by-layer parsing starting with Ethernet/IP/UDP
  • Specialized protocol detection based on ports, patterns, payload analysis
  • Error handling and graceful degradation for malformed packets

4. TMATS Parameter Parsing

  • Line-by-line parsing of TMATS configuration data
  • Key-value pair extraction with hierarchical channel mapping
  • Parameter validation and default value assignment
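
The parsing steps can be sketched as below, assuming one `CODE:value;` attribute per line and channel attributes of the form `R-1\NAME-n`. The sample attribute codes are for illustration only, and both function names are hypothetical.

```python
from typing import Dict

SAMPLE_TMATS = r"""
G\PN:DEMO;
R-1\DSI-1:Analog1;
R-1\CDT-1:ANAIN;
R-1\DSI-2:Pcm1;
"""


def parse_tmats(text: str) -> Dict[str, str]:
    """Line-by-line extraction of key:value; pairs."""
    attributes: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition(":")
        attributes[key.strip()] = value.strip().rstrip(";").strip()
    return attributes


def channel_attributes(attributes: Dict[str, str], group: str = "R-1") -> Dict[int, Dict[str, str]]:
    """Map indexed attributes (NAME-n) in one group to {n: {NAME: value}}."""
    channels: Dict[int, Dict[str, str]] = {}
    for key, value in attributes.items():
        key_group, sep, attr = key.partition("\\")
        if not sep or key_group != group:
            continue
        name, dash, index = attr.rpartition("-")
        if not dash or not index.isdigit():
            continue
        channels.setdefault(int(index), {})[name] = value
    return channels


attrs = parse_tmats(SAMPLE_TMATS)
print(channel_attributes(attrs))
```

Default values for missing parameters can then be filled in per channel with `dict.get(name, default)` at lookup time.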

5. Live Capture Architecture with Real-time Statistics

  • Threaded packet capture using scapy.sniff()
  • Real-time running statistical updates during capture
  • Thread-safe data structures and stop conditions
  • Running averages calculated incrementally for performance
  • Live outlier detection with immediate flagging
  • TUI updates every 0.5 seconds during live capture
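
One common way to keep running statistics without storing every sample is Welford's online algorithm. The sketch below is an assumption about how the incremental update could be done, not a description of the StreamLens implementation; RunningStats is a hypothetical name.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Incremental mean and sample standard deviation (Welford's algorithm)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def stdev(self) -> float:
        return math.sqrt(self.m2 / (self.count - 1)) if self.count > 1 else 0.0

    def is_outlier(self, value: float, threshold_sigma: float = 3.0) -> bool:
        # Check against the statistics gathered so far, before adding the sample
        return self.stdev > 0 and value > self.mean + threshold_sigma * self.stdev


stats = RunningStats()
for sample in [1.0, 2.0, 3.0, 4.0, 5.0]:
    stats.update(sample)
print(stats.mean, stats.stdev, stats.is_outlier(100.0))
```

Each update is O(1) in time and memory, so it can run in the capture callback; if the capture and UI threads share one RunningStats instance, access should be guarded with a lock.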

6. Chapter 10 Signal Visualization System

  • TMATS Parser: Extracts channel metadata from Telemetry Attributes Transfer Standard frames
  • Signal Decoders: Support for analog and PCM format data with proper scaling
  • Matplotlib Integration: External plotting windows with interactive capabilities
  • Real-time Visualization: Works for both PCAP analysis and live capture modes
  • Multi-channel Display: Simultaneous plotting of multiple signal channels with engineering units
class SignalVisualizer:
    def visualize_flow_signals(self, flow: FlowStats, packets: List[Packet]) -> None:
        # Extract TMATS metadata for channel configurations
        tmats_metadata = self._extract_tmats_from_flow(packets)
        
        # Decode signal data using Chapter 10 decoders
        signal_data = self._extract_signals_from_flow(packets, tmats_metadata)
        
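        # Launch matplotlib window in background thread
        # flow_key format is an assumption; flows are keyed by source-destination IP pair
        flow_key = f"{flow.src_ip} -> {flow.dst_ip}"
        self._create_signal_window(flow_key, signal_data, flow)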

7. PySide6 GUI Architecture with Threading Safety

  • Professional Qt Interface: Cross-platform GUI built with PySide6 for native look and feel
  • Embedded Matplotlib Integration: Interactive plots with zoom, pan, and navigation toolbar
  • Background Processing: Threading for PCAP loading with progress bar and non-blocking UI
  • Flow List Widget: Sortable table with sigma deviations, protocols, and frame types
  • Signal Visualization: Click-to-visualize Chapter 10 flows with embedded matplotlib widgets
  • Threading Safety: Proper Qt integration prevents matplotlib segmentation faults
class StreamLensMainWindow(QMainWindow):
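    def __init__(self):
        super().__init__()  # Required before using any QMainWindow functionality
        # Create main interface with flow list and plot area
        self.flows_table = QTableWidget()  # Sortable flow list
        self.plot_widget = PlotWidget()    # Embedded matplotlib
        self.progress_bar = QProgressBar() # Updated while a PCAP loads
        
    def load_pcap_file(self, file_path: str):
        # Background loading with progress bar
        # (start() assumes PCAPLoadThread is a QThread subclass)
        self.loading_thread = PCAPLoadThread(file_path)
        self.loading_thread.progress_updated.connect(self.progress_bar.setValue)
        self.loading_thread.loading_finished.connect(self.on_pcap_loaded)
        self.loading_thread.start()
        
    def visualize_selected_flow(self):
        # Interactive signal visualization
        # flow, flow_key, packets, and tmats come from the current table
        # selection; that lookup is omitted in this outline
        signal_data = signal_visualizer._extract_signals_from_flow(packets, tmats)
        self.plot_widget.plot_flow_signals(flow, signal_data, flow_key)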

8. Modular Architecture Design

  • Separation of Concerns: Clean boundaries between analysis, UI, protocols, and utilities
  • Package Structure: Logical grouping of related functionality
  • Dependency Injection: Components receive dependencies through constructors
  • Interface-based Design: Protocol dissectors implement common interfaces
  • Error Handling: Graceful degradation and comprehensive error reporting

Extensibility Points

Adding New Protocol Dissectors

  1. Implement dissector class with dissect(packet) method
  2. Return DissectionResult with protocol type and parsed fields
  3. Register in EnhancedFrameDissector.dissectors dictionary
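
The three steps can be sketched end to end as below. To stay self-contained, the example uses raw bytes in place of a scapy packet, and both the DissectionResult shape and the "heartbeat" protocol are hypothetical. The EnhancedFrameDissector shown here is a simplified stand-in, not the real class.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DissectionResult:
    """Hypothetical result shape: protocol type plus parsed fields."""
    protocol: str
    fields: Dict[str, Any] = field(default_factory=dict)


class HeartbeatDissector:
    """Step 1: a dissector for an invented protocol with a 2-byte magic prefix."""
    MAGIC = b"HB"

    def dissect(self, packet: bytes) -> Optional[DissectionResult]:
        if len(packet) < 4 or not packet.startswith(self.MAGIC):
            return None
        # Step 2: return the protocol type and parsed fields
        counter = int.from_bytes(packet[2:4], "big")
        return DissectionResult("heartbeat", {"counter": counter})


class EnhancedFrameDissector:
    """Simplified stand-in that runs every registered dissector on a frame."""

    def __init__(self) -> None:
        self.dissectors: Dict[str, Any] = {}

    def dissect_frame(self, packet: bytes, frame_num: int) -> Dict[str, Any]:
        layers: Dict[str, Any] = {}
        for name, dissector in self.dissectors.items():
            result = dissector.dissect(packet)
            if result is not None:
                layers[name] = result.fields
        return {"frame_number": frame_num, "layers": layers}


frame_dissector = EnhancedFrameDissector()
frame_dissector.dissectors["heartbeat"] = HeartbeatDissector()  # Step 3: register
print(frame_dissector.dissect_frame(b"HB\x00\x07", 1))
```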

Custom Frame Type Classification

  • Override _classify_frame_type() method
  • Add custom logic based on dissection results
  • Return descriptive frame type strings

Statistical Analysis Extensions

  • Extend FrameTypeStats and FlowStats dataclasses
  • Add custom statistical calculations in calculate_statistics()
  • Implement additional outlier detection algorithms

Performance Considerations

  • Memory Management: Use generators for large PCAP files
  • Threading: Separate capture and analysis threads for live mode
  • Statistical Efficiency: Incremental statistics calculation
  • TUI Optimization: Efficient screen drawing with curses error handling
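
Separating capture from analysis is commonly done with a bounded queue between two threads. The sketch below uses plain timestamps in place of packets; the worker names and the sentinel-based shutdown are assumptions for illustration. In a live setup the producer would be fed by the capture callback instead of a list.

```python
import queue
import threading
from typing import Iterable, List

_STOP = object()  # sentinel telling the analysis thread to finish


def capture_worker(source: Iterable[float], packets: "queue.Queue") -> None:
    """Producer: push each captured timestamp, then signal completion."""
    for timestamp in source:
        packets.put(timestamp)  # blocks when the queue is full (backpressure)
    packets.put(_STOP)


def analysis_worker(packets: "queue.Queue", deltas: List[float]) -> None:
    """Consumer: compute inter-arrival times as items arrive."""
    previous = None
    while True:
        item = packets.get()
        if item is _STOP:
            break
        if previous is not None:
            deltas.append(item - previous)
        previous = item


def run_pipeline(timestamps: Iterable[float]) -> List[float]:
    packets: "queue.Queue" = queue.Queue(maxsize=1024)
    deltas: List[float] = []
    consumer = threading.Thread(target=analysis_worker, args=(packets, deltas))
    producer = threading.Thread(target=capture_worker, args=(timestamps, packets))
    consumer.start()
    producer.start()
    producer.join()
    consumer.join()
    return deltas


print(run_pipeline([0.0, 0.5, 1.5]))  # [0.5, 1.0]
```

The bounded queue keeps memory use flat if analysis falls behind capture, at the cost of the producer blocking until space frees up.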

Testing and Validation

The project includes comprehensive test suites:

  • Protocol Dissection Tests: Validate parsing accuracy
  • Statistical Analysis Tests: Verify timing calculations
  • TUI Layout Tests: Interface rendering validation
  • Integration Tests: End-to-end workflow verification

This guide describes the scope and technical design of StreamLens in enough detail to recreate the tool.