nDPI

sunshout 2015. 1. 7. 09:32

Why network analysis is needed

- The characterization of network protocols is required not only for creating accurate network traffic reports, but increasingly, for overall network security needs.


Why DPI is needed: L4 analysis is not enough

- Today, traffic inspection facilities are available on every modern network security device, because the fixed binding between port and protocol no longer holds.


Statistical methods have their limits

- This difficulty triggered the development of approaches based on statistical analysis, often using machine-learning algorithms, instead of direct payload inspection. Although some authors claim these algorithms provide high detection accuracy, real-life tests demonstrated that:

 . Such algorithms are able to classify only a few traffic categories (an order of magnitude fewer than DPI libraries) and are thus less suitable for applications that need fine-grained protocol detection.

 . Some tests show a significant rate of inaccuracy, suggesting that such methods may be useful in passive traffic analysis but are unlikely to be used for mission-critical applications such as traffic blocking.


Problems with commercial DPI

- Commercial DPI libraries are very expensive, both in terms of the one-time license fee and yearly maintenance costs. Sometimes their price is set based on the customer's yearly revenue rather than on a fixed per-license fee, which further complicates the pricing scheme.

- Closed-source DPI toolkits are often not extensible by end users. This means that developers who want to add support for new or custom protocols need to request these changes from the toolkit's manufacturer. In essence, users are at the mercy of DPI library vendors in terms of cost and schedule.

- Open-source tools cannot incorporate commercial DPI libraries, as these are subject to a Non-Disclosure Agreement (NDA) that makes them unsuitable to be mixed with open-source software or included in the operating system kernel.


Goals of nDPI

- High-reliability protocol detection for inline, per application, protocol policy enforcement.

- Library extensibility is needed for new protocols and for defining sub-protocols at runtime. This feature is required because new protocols appear from time to time or evolve (e.g., the Skype protocol changed significantly after the MS acquisition). Permanent library maintenance is, therefore, required.

- Ability to integrate under an open-source license, for use by existing open-source applications and for embedding into an operating system's kernel. As already discussed, full source code availability is essential to safeguard privacy.

- Extraction of basic network metrics (e.g., network and application latency) and metadata (e.g., DNS query/response) that can be used within monitoring applications, thus avoiding duplicate packet decoding (once in the DPI library and again in the monitoring application).


Design & Implementation



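How many packets are needed for detection?

- For UDP-based protocols such as SNMP, DNS, and NetFlow, analyzing a single packet is enough.

- For protocols such as BitTorrent, around 8 packets must be analyzed before detection is possible.

-> So at most 8 packets are analyzed.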
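
The per-flow packet budget described above can be sketched roughly as follows. This is only an illustration in Python, not nDPI's actual code: `Flow`, `process_packet`, and `toy_bittorrent` are hypothetical names, and the toy dissector only checks for the well-known BitTorrent handshake prefix (a length byte of 19 followed by "BitTorrent protocol").

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

MAX_PACKETS_PER_FLOW = 8  # budget taken from the notes above


@dataclass
class Flow:
    """Hypothetical per-flow state; not an nDPI structure."""
    packets_seen: int = 0
    protocol: Optional[str] = None  # None while still undecided
    gave_up: bool = False


def toy_bittorrent(payload: bytes) -> Optional[str]:
    """Toy dissector: match the BitTorrent handshake prefix."""
    if payload.startswith(b"\x13BitTorrent protocol"):
        return "bittorrent"
    return None


def process_packet(flow: Flow, payload: bytes,
                   dissectors: List[Callable[[bytes], Optional[str]]]) -> str:
    """Feed one packet to the flow and return its current label."""
    # Once a flow is labelled (or abandoned), skip all further inspection.
    if flow.protocol is not None:
        return flow.protocol
    if flow.gave_up:
        return "unknown"

    flow.packets_seen += 1
    for dissect in dissectors:
        label = dissect(payload)
        if label is not None:
            flow.protocol = label
            return label

    # Stop trying after the packet budget is used up.
    if flow.packets_seen >= MAX_PACKETS_PER_FLOW:
        flow.gave_up = True
        return "unknown"
    return "undecided"
```

Capping the number of inspected packets keeps the per-flow cost bounded: a flow that no dissector recognizes within the budget is simply labelled unknown and no longer inspected.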


Encrypted Traffic

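- For SSL traffic, the data itself cannot be analyzed.

- Instead, the meta information exchanged in the initial phase of establishing the SSL connection is analyzed (e.g., host information: api.twitter.com).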
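
As a rough illustration of what "meta information from the initial phase" can mean, the sketch below reads the host name from the server_name (SNI) extension of a TLS ClientHello. It is a simplified Python sketch under several assumptions, not nDPI's implementation: it assumes the whole ClientHello sits in one record, and both `extract_sni` and the `build_client_hello` helper (used only to produce sample input) are hypothetical names.

```python
import struct
from typing import Optional


def extract_sni(record: bytes) -> Optional[str]:
    """Return the host name in a ClientHello's SNI extension, or None."""
    try:
        if record[0] != 0x16:          # record type 22 = handshake
            return None
        pos = 5                        # skip type(1) + version(2) + length(2)
        if record[pos] != 0x01:        # handshake type 1 = ClientHello
            return None
        pos += 4                       # handshake type(1) + length(3)
        pos += 2 + 32                  # client version + random
        pos += 1 + record[pos]         # session id
        (cs_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + cs_len              # cipher suites
        pos += 1 + record[pos]         # compression methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = pos + ext_total
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:          # extension 0 = server_name
                # list length(2), name type(1), name length(2), name
                if record[pos + 2] != 0:   # name type 0 = host_name
                    return None
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                name = record[pos + 5:pos + 5 + name_len]
                if len(name) != name_len:
                    return None
                return name.decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None                    # truncated or malformed input
    return None


def build_client_hello(host: str) -> bytes:
    """Hypothetical helper: build a minimal ClientHello for testing."""
    name = host.encode("ascii")
    entry = b"\x00" + struct.pack("!H", len(name)) + name
    sni_list = struct.pack("!H", len(entry)) + entry
    ext = struct.pack("!HH", 0, len(sni_list)) + sni_list
    body = (b"\x03\x03" + bytes(32) + b"\x00"      # version, random, no session id
            + struct.pack("!H", 2) + b"\x00\x2f"   # one cipher suite
            + b"\x01\x00"                          # null compression only
            + struct.pack("!H", len(ext)) + ext)
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake


print(extract_sni(build_client_hello("api.twitter.com")))
```

The host name travels in clear text in the ClientHello, which is why it can be used for classification even though the payload that follows is encrypted.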


Pattern Matching

- Uses the Multifast library, which implements the Aho-Corasick algorithm.

- The earlier nDPI did not use anything like the Aho-Corasick algorithm (it relied on if/else-based binary comparisons).
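
To make the difference concrete, here is a small pure-Python sketch of the Aho-Corasick idea: all patterns are compiled into one automaton, and the input is scanned once regardless of how many patterns there are. This is not the Multifast API and not nDPI code; the `AhoCorasick` class and the pattern-to-label mapping are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List


class AhoCorasick:
    """Minimal multi-pattern matcher over bytes (illustrative only)."""

    def __init__(self, patterns: Dict[bytes, str]):
        self.goto: List[Dict[int, int]] = [{}]  # node -> {byte: next node}
        self.fail: List[int] = [0]              # failure links
        self.out: List[List[str]] = [[]]        # labels matched at each node

        # 1. Build a trie from all patterns.
        for pattern, label in patterns.items():
            node = 0
            for byte in pattern:
                nxt = self.goto[node].get(byte)
                if nxt is None:
                    nxt = len(self.goto)
                    self.goto.append({})
                    self.fail.append(0)
                    self.out.append([])
                    self.goto[node][byte] = nxt
                node = nxt
            self.out[node].append(label)

        # 2. Compute failure links breadth-first.
        queue = deque(self.goto[0].values())
        while queue:
            node = queue.popleft()
            for byte, child in self.goto[node].items():
                f = self.fail[node]
                while f and byte not in self.goto[f]:
                    f = self.fail[f]
                self.fail[child] = self.goto[f].get(byte, 0)
                # Inherit matches that end at the failure target.
                self.out[child] = self.out[child] + self.out[self.fail[child]]
                queue.append(child)

    def search(self, data: bytes) -> List[str]:
        """Return labels of all patterns found in data, in scan order."""
        node = 0
        found: List[str] = []
        for byte in data:
            while node and byte not in self.goto[node]:
                node = self.fail[node]
            node = self.goto[node].get(byte, 0)
            found.extend(self.out[node])
        return found


matcher = AhoCorasick({b"twitter.com": "Twitter", b"facebook.com": "Facebook"})
print(matcher.search(b"api.twitter.com"))
```

With a chain of if/else comparisons, each additional pattern adds another pass over the data; with the automaton, adding patterns grows the trie but the scan stays a single pass over the input.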