first working
pyshark_poc/README.md (146 lines, new file)
@@ -0,0 +1,146 @@
# PyShark Proof of Concept for Airstream

This is an alternative implementation of Airstream using PyShark instead of Scapy. PyShark leverages Wireshark's powerful dissector engine for comprehensive protocol support.

## Key Advantages

### 1. **Full Wireshark Protocol Support**

- Automatically uses ALL installed Wireshark dissectors
- Supports 2000+ protocols out of the box
- Better decoding for complex protocols such as PTP, IENA, and Chapter 10 (see the sketch below)
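For example, every layer Wireshark can dissect is exposed directly as a packet attribute. A minimal sketch, assuming a PCAP containing PTPv2 traffic (the file name is a placeholder):

```python
import pyshark

# Hypothetical capture containing PTPv2 traffic.
capture = pyshark.FileCapture("ptp_traffic.pcap", display_filter="ptp")

for packet in capture:
    if hasattr(packet, "ptp"):
        # Layer attributes mirror Wireshark's field names
        # (e.g. the ptp.v2.messagetype field becomes v2_messagetype).
        msg_type = getattr(packet.ptp, "v2_messagetype", "unknown")
        print(packet.highest_layer, msg_type)

capture.close()
```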
### 2. **Custom Dissector Support**

- Any Lua dissector installed in Wireshark works automatically
- See `lua_dissectors/` for examples
- No code changes needed to support new protocols (see the sketch below)
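A quick way to confirm that a newly installed Lua dissector is being picked up is to look at the layer names PyShark reports. A minimal sketch (the file name is a placeholder and the custom protocol is hypothetical):

```python
import pyshark

capture = pyshark.FileCapture("custom_protocol.pcap")

for packet in capture:
    # Each dissected layer appears under its Wireshark protocol name,
    # so a custom Lua dissector shows up here with no code changes.
    print([layer.layer_name for layer in packet.layers])

capture.close()
```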
### 3. **Advanced Filtering**

- Full Wireshark display filter syntax
- BPF capture filters for performance
- Protocol-specific field access (see the sketch below)
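Both filter styles map directly onto PyShark's capture classes. A minimal sketch using the same filters as the Usage examples below (interface and file names are placeholders):

```python
import pyshark

# Wireshark display filter, applied while reading a capture file.
file_cap = pyshark.FileCapture("capture.pcap", display_filter="tcp.port == 443")

# BPF capture filter, applied at capture time on a live interface for efficiency.
live_cap = pyshark.LiveCapture(interface="eth0", bpf_filter="port 80 or port 443")
# live_cap.sniff(packet_count=100)  # start the live capture when needed

# Protocol-specific field access on each dissected packet.
for packet in file_cap:
    if hasattr(packet, "tcp"):
        print(packet.ip.src, packet.tcp.dstport, packet.highest_layer)

file_cap.close()
```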
## Installation

```bash
# Install PyShark (requires tshark/Wireshark)
pip install pyshark

# On macOS
brew install wireshark

# On Ubuntu/Debian
sudo apt-get install tshark

# On RHEL/CentOS
sudo yum install wireshark
```

## Usage

```bash
# Basic PCAP analysis
./airstream_pyshark.py -p capture.pcap

# Live capture with filter
./airstream_pyshark.py -i eth0 -c 1000 --filter "tcp.port==443"

# Use BPF filter for efficient capture
./airstream_pyshark.py -i eth0 --bpf "port 80 or port 443"

# Export results to CSV
./airstream_pyshark.py -p capture.pcap -o results.csv

# Use PTP-specific statistics
./airstream_pyshark.py -p ptp_traffic.pcap -s ptp
```
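The CLI wraps the `PySharkAnalyzer` class in this package, so the same analyses can be run programmatically. A minimal sketch (the capture file name is a placeholder, and the mapping to the CLI flags is assumed):

```python
from pyshark_poc import PySharkAnalyzer, STATS_TYPES

# Collect the generic overview plus PTP-specific statistics
# (roughly what the `-s ptp` CLI option is expected to select).
analyzer = PySharkAnalyzer(stats_classes=[STATS_TYPES["overview"], STATS_TYPES["ptp"]])

# Roughly: ./airstream_pyshark.py -p capture.pcap --filter "tcp.port==443"
analyzer.analyze_pcap("capture.pcap", display_filter="tcp.port==443")

analyzer.print_summary()                # formatted per-flow table
df = analyzer.summary()                 # pandas DataFrame for further processing
print(analyzer.get_protocol_summary())  # packet/byte totals grouped by protocol
```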
## Architecture

```
airstream_pyshark.py        # Main entry point (CLI)
pyshark_poc/
├── __init__.py             # Package initialization
├── analyzer.py             # PySharkAnalyzer class
├── models.py               # Data models (FlowKey)
├── stats.py                # Statistics classes
└── README.md               # This file

lua_dissectors/             # Custom Wireshark dissectors
├── example_custom_protocol.lua
└── README.md
```

## Performance Comparison

| Aspect | Scapy | PyShark |
|--------|-------|---------|
| Packet Parsing Speed | Faster | Slower (XML overhead) |
| Protocol Support | Limited | Comprehensive |
| Custom Dissectors | Python only | Lua + C |
| Memory Usage | Lower | Higher |
| Dependencies | Python only | Requires tshark |
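The parsing-speed gap is easy to quantify on your own captures. A rough timing sketch, not a rigorous benchmark (the capture file name is a placeholder):

```python
import time

import pyshark
from scapy.all import rdpcap

PCAP = "capture.pcap"  # placeholder

# PyShark: packets are dissected by tshark and streamed back.
start = time.perf_counter()
cap = pyshark.FileCapture(PCAP, use_json=True, include_raw=False)
n_pyshark = sum(1 for _ in cap)
cap.close()
print(f"pyshark: {n_pyshark} packets in {time.perf_counter() - start:.2f}s")

# Scapy: packets are parsed natively in Python.
start = time.perf_counter()
n_scapy = len(rdpcap(PCAP))
print(f"scapy:   {n_scapy} packets in {time.perf_counter() - start:.2f}s")
```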
## When to Use PyShark

✅ **Use PyShark when:**
- You need comprehensive protocol decoding
- You are working with proprietary protocols
- You need Wireshark's advanced filtering
- Protocol accuracy is critical

❌ **Use Scapy when:**
- Performance is critical
- You need packet crafting/modification
- Minimal dependencies are required
- You only need simple protocol analysis

## Custom Protocol Support

To add custom protocol support:

1. Create a Lua dissector (see `lua_dissectors/example_custom_protocol.lua`)
2. Install it in Wireshark's plugins directory
3. PyShark automatically uses it

Example accessing custom fields:
```python
import pyshark

# After installing the custom dissector in Wireshark's plugins directory
capture = pyshark.FileCapture('custom_protocol.pcap')
for packet in capture:
    if hasattr(packet, 'custom'):
        print(f"Message type: {packet.custom.msg_type}")
        print(f"Sequence: {packet.custom.sequence}")
capture.close()
```
## Limitations

1. **Performance**: Slower than Scapy due to XML parsing overhead
2. **Dependencies**: Requires Wireshark/tshark installation
3. **Read-only**: Cannot modify or craft packets
4. **Platform-specific**: tshark paths may vary

## Future Enhancements

- [ ] Parallel packet processing
- [ ] Caching for improved performance
- [ ] Integration with existing frametypes
- [ ] Protocol-specific analyzers
- [ ] Real-time streaming analysis
- [ ] Custom field extractors

## Testing

```bash
# Test with sample PCAP
./airstream_pyshark.py -p "1 PTPGM.pcapng"

# List available interfaces
./airstream_pyshark.py -l

# Verbose mode for debugging
./airstream_pyshark.py -p capture.pcap -v
```

## Conclusion

This PyShark implementation provides a powerful alternative when comprehensive protocol support is needed. While it trades performance for functionality, it enables analysis of complex protocols that would be difficult to implement in pure Python.
pyshark_poc/__init__.py (20 lines, new file)
@@ -0,0 +1,20 @@
"""
|
||||
PyShark-based proof of concept for Airstream packet analyzer.
|
||||
|
||||
This module provides an alternative implementation using PyShark
|
||||
to leverage Wireshark's dissector capabilities.
|
||||
"""
|
||||
|
||||
from .analyzer import PySharkAnalyzer
|
||||
from .models import FlowKey
|
||||
from .stats import MultiStats, BaseStats, OverviewStats, PTPStats, STATS_TYPES
|
||||
|
||||
__all__ = [
|
||||
'PySharkAnalyzer',
|
||||
'FlowKey',
|
||||
'MultiStats',
|
||||
'BaseStats',
|
||||
'OverviewStats',
|
||||
'PTPStats',
|
||||
'STATS_TYPES'
|
||||
]
|
||||
BIN  pyshark_poc/__pycache__/__init__.cpython-313.pyc (new file; binary file not shown)
BIN  pyshark_poc/__pycache__/analyzer.cpython-313.pyc (new file; binary file not shown)
BIN  pyshark_poc/__pycache__/models.cpython-313.pyc (new file; binary file not shown)
BIN  pyshark_poc/__pycache__/stats.cpython-313.pyc (new file; binary file not shown)
pyshark_poc/analyzer.py (187 lines, new file)
@@ -0,0 +1,187 @@
import pyshark
from collections import defaultdict
from typing import Optional, List, Type, Union
import pandas as pd
from tabulate import tabulate

from .models import FlowKey
from .stats import MultiStats, BaseStats, STATS_TYPES


class PySharkAnalyzer:
    """Packet flow analyzer using PyShark for Wireshark dissector support."""

    def __init__(self, stats_classes: Optional[List[Type[BaseStats]]] = None):
        if stats_classes is None:
            stats_classes = [STATS_TYPES['overview']]
        self.stats_classes = stats_classes
        self.flows = defaultdict(lambda: MultiStats(stats_classes))
        self.packet_count = 0

    def _get_flow_key(self, packet) -> Optional[FlowKey]:
        """Extract flow key from PyShark packet."""
        try:
            # Check for IP layer
            if not hasattr(packet, 'ip'):
                return None

            src_ip = packet.ip.src
            dst_ip = packet.ip.dst
            protocol = packet.transport_layer if hasattr(packet, 'transport_layer') else 'IP'

            # Get ports based on protocol
            src_port = 0
            dst_port = 0

            if hasattr(packet, 'tcp'):
                src_port = int(packet.tcp.srcport)
                dst_port = int(packet.tcp.dstport)
                protocol = 'TCP'
            elif hasattr(packet, 'udp'):
                src_port = int(packet.udp.srcport)
                dst_port = int(packet.udp.dstport)
                protocol = 'UDP'

            # Check for extended protocol types
            extended_type = None
            if hasattr(packet, 'ptp'):
                extended_type = 'PTP'
            # Add more protocol detection here as needed

            return FlowKey(src_ip, src_port, dst_ip, dst_port, protocol, extended_type)

        except AttributeError:
            return None

    def _process_packet(self, packet):
        """Process a single packet."""
        key = self._get_flow_key(packet)
        if key:
            # Get timestamp and size
            timestamp = float(packet.sniff_timestamp) if hasattr(packet, 'sniff_timestamp') else 0
            size = int(packet.length) if hasattr(packet, 'length') else 0

            self.flows[key].add(timestamp, size, packet)
            self.packet_count += 1

    def analyze_pcap(self, file: str, display_filter: Optional[str] = None):
        """Analyze packets from a PCAP file."""
        print(f"Analyzing: {file}")
        if display_filter:
            print(f"Filter: {display_filter}")

        try:
            # Use FileCapture for PCAP files
            capture = pyshark.FileCapture(
                file,
                display_filter=display_filter,
                use_json=True,     # Use JSON output for better performance
                include_raw=False  # Don't include raw packet data
            )

            # Process packets
            for packet in capture:
                self._process_packet(packet)
                # Show progress every 1000 packets
                if self.packet_count % 1000 == 0:
                    print(f" Processed {self.packet_count} packets...")

            capture.close()
            print(f"Found {len(self.flows)} flows from {self.packet_count} packets")

        except Exception as e:
            print(f"Error analyzing PCAP: {e}")

    def analyze_live(self, interface: str, count: int = 100,
                     display_filter: Optional[str] = None,
                     bpf_filter: Optional[str] = None):
        """Capture and analyze packets from a live interface."""
        print(f"Capturing {count} packets on {interface}")
        if display_filter:
            print(f"Display filter: {display_filter}")
        if bpf_filter:
            print(f"BPF filter: {bpf_filter}")

        try:
            # Use LiveCapture for live capture
            capture = pyshark.LiveCapture(
                interface=interface,
                display_filter=display_filter,
                bpf_filter=bpf_filter,
                use_json=True,
                include_raw=False
            )

            # Capture packets
            capture.sniff(packet_count=count)

            # Process captured packets
            for packet in capture:
                self._process_packet(packet)

            capture.close()
            print(f"Found {len(self.flows)} flows from {self.packet_count} packets")

        except Exception as e:
            print(f"Error during live capture: {e}")

    def summary(self) -> pd.DataFrame:
        """Generate summary DataFrame of all flows."""
        rows = []
        for key, multi_stats in self.flows.items():
            row = {
                'Src IP': key.src_ip,
                'Src Port': key.src_port,
                'Dst IP': key.dst_ip,
                'Dst Port': key.dst_port,
                'Proto': key.protocol
            }
            if key.extended_type:
                row['Type'] = key.extended_type
            row.update(multi_stats.get_combined_summary())
            rows.append(row)

        # Sort by packet count descending
        df = pd.DataFrame(rows)
        if not df.empty and 'Pkts' in df.columns:
            df = df.sort_values('Pkts', ascending=False)
        return df

    def print_summary(self):
        """Print formatted summary of flows."""
        df = self.summary()
        if df.empty:
            print("No flows detected")
            return

        print(f"\n{len(df)} flows:")
        print(tabulate(df, headers='keys', tablefmt='plain', showindex=False))

        if 'Pkts' in df.columns and 'Bytes' in df.columns:
            print(f"\nTotals: {df['Pkts'].sum()} packets, {df['Bytes'].sum()} bytes")

    def get_protocol_summary(self) -> pd.DataFrame:
        """Get summary grouped by protocol."""
        df = self.summary()
        if df.empty:
            return df

        # Group by protocol
        protocol_summary = df.groupby('Proto').agg({
            'Pkts': 'sum',
            'Bytes': 'sum'
        }).reset_index()

        return protocol_summary

    def apply_wireshark_filter(self, display_filter: str):
        """
        Apply a Wireshark display filter to the analysis.
        This demonstrates PyShark's ability to use Wireshark's filtering.
        """
        filtered_flows = defaultdict(lambda: MultiStats(self.stats_classes))

        # This would require re-processing with the filter;
        # shown here as an example of the capability.
        print("Note: To apply Wireshark filters, re-analyze with the display_filter parameter")
        return filtered_flows
pyshark_poc/models.py (13 lines, new file)
@@ -0,0 +1,13 @@
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowKey:
    """Flow identifier for network traffic analysis."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str
    extended_type: Optional[str] = None  # For extended frame types like IENA, Chapter 10, etc.
pyshark_poc/stats.py (145 lines, new file)
@@ -0,0 +1,145 @@
from typing import Dict, List, Any, Type, Optional


class BaseStats:
    """Base statistics class for packet flow analysis."""

    def __init__(self):
        self.packet_count = 0
        self.byte_count = 0
        self.first_timestamp = None
        self.last_timestamp = None
        self.packet_sizes = []
        self.inter_arrival_times = []
        self.last_packet_time = None

    def add(self, timestamp: float, size: int, packet: Any):
        """Add a packet to statistics."""
        self.packet_count += 1
        self.byte_count += size
        self.packet_sizes.append(size)

        if self.first_timestamp is None:
            self.first_timestamp = timestamp
        self.last_timestamp = timestamp

        if self.last_packet_time is not None:
            delta = timestamp - self.last_packet_time
            self.inter_arrival_times.append(delta)
        self.last_packet_time = timestamp

        # Call protocol-specific handler
        self._process_packet(packet)

    def _process_packet(self, packet: Any):
        """Override in subclasses for protocol-specific processing."""
        pass

    def get_summary_dict(self) -> Dict[str, Any]:
        """Get summary statistics as dictionary."""
        duration = (self.last_timestamp - self.first_timestamp) if self.first_timestamp and self.last_timestamp else 0

        summary = {
            'Pkts': self.packet_count,
            'Bytes': self.byte_count,
            'Duration': round(duration, 3) if duration else 0,
            'Avg Size': round(sum(self.packet_sizes) / len(self.packet_sizes), 1) if self.packet_sizes else 0,
        }

        if self.inter_arrival_times:
            avg_delta = sum(self.inter_arrival_times) / len(self.inter_arrival_times)
            summary['Avg TimeΔ'] = round(avg_delta, 6)

            # Calculate standard deviation
            if len(self.inter_arrival_times) > 1:
                mean = avg_delta
                variance = sum((x - mean) ** 2 for x in self.inter_arrival_times) / len(self.inter_arrival_times)
                std_dev = variance ** 0.5
                summary['Time 1σ'] = round(std_dev, 6)
            else:
                summary['Time 1σ'] = 0
        else:
            summary['Avg TimeΔ'] = 0
            summary['Time 1σ'] = 0

        if duration > 0:
            summary['Pkt/s'] = round(self.packet_count / duration, 1)
            summary['B/s'] = round(self.byte_count / duration, 1)
        else:
            summary['Pkt/s'] = 0
            summary['B/s'] = 0

        return summary


class OverviewStats(BaseStats):
    """Overview statistics for general packet analysis."""
    pass


class PTPStats(BaseStats):
    """PTP-specific statistics."""

    def __init__(self):
        super().__init__()
        self.ptp_message_types = {}

    def _process_packet(self, packet: Any):
        """Process PTP-specific packet information."""
        # Check if packet has PTP layer
        if hasattr(packet, 'ptp'):
            try:
                msg_type = packet.ptp.v2_messagetype if hasattr(packet.ptp, 'v2_messagetype') else 'unknown'
                self.ptp_message_types[msg_type] = self.ptp_message_types.get(msg_type, 0) + 1
            except Exception:
                pass

    def get_summary_dict(self) -> Dict[str, Any]:
        summary = super().get_summary_dict()
        # Add PTP-specific metrics
        if self.ptp_message_types:
            summary['PTP Types'] = str(self.ptp_message_types)
        return summary


class MultiStats:
    """Container for multiple stats instances."""

    def __init__(self, stats_classes: Optional[List[Type[BaseStats]]] = None):
        if stats_classes is None:
            stats_classes = [OverviewStats]
        self.stats_instances = [cls() for cls in stats_classes]
        self.stats_classes = stats_classes

    def add(self, timestamp: float, size: int, packet: Any):
        """Add packet to all stats instances."""
        for stats in self.stats_instances:
            stats.add(timestamp, size, packet)

    def get_combined_summary(self) -> Dict[str, Any]:
        """Combine summaries from all stats instances."""
        combined = {}
        for i, stats in enumerate(self.stats_instances):
            summary = stats.get_summary_dict()
            class_name = self.stats_classes[i].__name__.replace('Stats', '')

            # Add prefix to avoid column name conflicts
            for key, value in summary.items():
                if key in ['Pkts', 'Bytes', 'Duration']:  # Common columns
                    if i == 0:  # Only include once
                        combined[key] = value
                else:
                    # Only add prefix if we have multiple stats classes
                    if len(self.stats_classes) > 1:
                        combined[f"{class_name}:{key}"] = value
                    else:
                        combined[key] = value
        return combined


# Stats registry for easy lookup
STATS_TYPES = {
    'overview': OverviewStats,
    'ptp': PTPStats,
}