Source:
http://www.gray-world.net
================================================================================
This paper was originally released in Hitchhiker's World Issue #10 (have a look
at http://www.infosecwriters.com/hhworld/).
================================================================================
================================================================================
Copyright (c) 2005, Gray World Team <team [at] gray-world.net>.
Permission is granted to copy, distribute and/or modify this document under the
terms of the GNU Free Documentation License, Version 1.2 or any later version
published by the Free Software Foundation; with the Invariant Sections being
LIST THEIR TITLES, with the Front-Cover Texts being LIST, and with the
Back-Cover Texts being LIST.
You should have received a copy of the license with this document and it should
be present in the fdl.txt file.
If you did not receive this file or if you don't think this fdl.txt license is
correct, have a look at the official license file at
http://www.fsf.org/licenses/fdl.txt
================================================================================
=======
SUMMARY
=======
INTRODUCTION
1. COVERT CHANNELS THEORETICAL CONCEPTS
1.1 Academic definition
1.2 Covert channels types
1.3 Covert channels parameters
1.4 Covert channels and steganography
1.5 Network covert channels
2. NETWORK COVERT CHANNELS IN A GRAY WORLD: WHY AND WHO ?
3. COMMUNICATION CONCEPTS FOR NETWORK COVERT CHANNELS IMPLEMENTATIONS
3.1 Control / Data Channels
3.2 Multiplexing / Demultiplexing
3.3 Communication architectures
3.4 Communication models
4. PLAYING THE GAME VS THE DETECTION TEAM
4.1 Confusing the analyst with multiple sources and destination
4.2 Keeping a low profile
4.3 Live but let die
5. PRACTICAL IMPLEMENTATIONS
6. PRACTICAL IMPLEMENTATIONS: ACTIVE PORT FORWARDER
7. PRACTICAL IMPLEMENTATIONS: SKEEVE
8. PRACTICAL IMPLEMENTATIONS: MSNSHELL
9. PRACTICAL IMPLEMENTATIONS: SOCKSTER & TRAPSTER
WEBOGRAPHIE
================================================================================
============
INTRODUCTION
============
"Soon her eye fell on a little glass box that was lying under the table: she
opened it, and found in it a very small cake, on which the words "EAT ME" were
beautifully marked in currants. "Well, I'll eat it," said Alice, "and if it
makes me grow larger, I can reach the key; and if it makes me grow smaller, I
can creep under the door: so either way I'll get into the garden, and I don't
care which happens!" - Lewis Carroll, Alice In Wonderland.
We started co-writing a short paper about network covert channels and you are
now reading the result. Parts 1 to 4 are concepts, ideas, food for the mind; the
following parts describe toys we published, because the mind has to play
sometimes. Enjoy.
================================================================================
=======================================
1. COVERT CHANNELS THEORETICAL CONCEPTS
=======================================
A covert channel is a communication channel that is neither designed nor
intended to exist and that can be used to transfer information in a manner that
violates the existing security policy. It can be characterized by several
parameters such as noise, bandwidth and stealthiness.
However, the concept of "covert channel" is difficult to define precisely and
often leads to pedantic discussions about the meaning of "covert" (is it covert,
subliminal, hidden, stealth or do_we_really_care_at_all?). So let's review
various academic definitions and concepts so that readers become familiar
with the theoretical concepts behind covert channels and can decide for
themselves.
1.1 Academic definition
-----------------------
According to [NCSC_1993], the covert channel concept seems to first appear in
[Lampson1973]: "a communication channel is covert if it is neither designed nor
intended to transfer information at all."
A more common definition states that "a covert channel is a communication
channel that allows a process to transfer information in a manner that violates
the system's security policy" [DOD_1985], [NCSC_1993], [Rowland_1996].
//------------------------------------------------------------------------\\
[NCSC_1993] also gives specific definitions such as:
Definition 2 - A communication channel is covert (e.g., indirect) if it is
based on "transmission by storage into variables that describe resource
states."
Definition 3 - Covert channels "will be defined as those channels that
are a result of resource allocation policies and resource management
implementation."
Definition 4 - Covert channels are those that "use entities not normally
viewed as data objects to transfer information from one subject to
another."
and explains that "given that discretionary models cannot prevent the
release of sensitive information through legitimate program activity, it is
not meaningful to consider how these programs might release information
illicitly by using covert channels." and states that "The dependency of
covert channels on the (nondiscretionary) security policy models does not
imply one can eliminate covert channels merely by changing the policy model.
Certain covert channels will exist regardless of the type of
nondiscretionary access control policy used."
On a lighter note, look for the "hopefully all" string in [NCSC_1993].
\\------------------------------------------------------------------------//
The [WIKIPEDIA_CC] definition says that a covert channel "is a communication
channel that does a writing-between-the-lines form of communication [... and] is
parasitic to its host channel; it reduces bandwidth of the host channel by
reducing the signal-to-noise ratio in the host channel. [...] A covert channel
could be defined as a communication channel that transfers some kind of
information using a method originally not intended to transfer this kind of
information. [And finally] The term is used in the TCSEC specifically to refer
to ways of transferring information from a higher classification compartment to
a lower classification."
1.2 Covert channels types
-------------------------
[DOD_1985] presents two types of covert channels: covert storage channels and
covert timing channels. Moreover, it introduces the notion of measuring the
threat level of a covert channel by looking for covert channel mechanisms whose
bandwidth may exceed a certain number of bits per second.
//------------------------------------------------------------------------\\
"Covert storage channels include all vehicles that would allow the direct or
indirect writing of a storage location by one process and the direct or
indirect reading of it by another." [DOD_1985]
"Covert timing channels include all vehicles that would allow one process to
signal information to another process by modulating its own use of system
resources in such a way that the change in response time observed by the
second process would provide information." [DOD_1985]
On a lighter note, [NCSC_1993] states that "In practice, when covert
channel scenarios of use are constructed, a distinction between covert
storage and timing channels [...] is made even though theoretically no
fundamental distinction exists between them. [...] In this guide, we retain
the distinction between storage and timing channels exclusively for
consistency with the TCSEC."
[CC_Here2Stay_1994] describes that "a storage channel is a covert channel
where the output alphabet consists of different responses all taking the
same time to be transmitted. A timing channel is a covert channel where the
output alphabet is made up of different time values corresponding to the
same response. A mixed channel is a combination of the two."
\\------------------------------------------------------------------------//
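The [CC_Here2Stay_1994] distinction quoted above can be illustrated with a minimal in-memory sketch. It assumes a receiver that observes (response, duration) pairs; the symbol names "R0"/"R1", the durations and the decoding threshold are hypothetical values chosen only for illustration.

```python
from typing import List, Tuple

# What the receiver observes for each symbol: (response, duration in seconds).
Observation = Tuple[str, float]


def storage_encode(bits: List[int]) -> List[Observation]:
    """Storage channel: the response varies, the duration is constant."""
    return [("R1" if bit else "R0", 1.0) for bit in bits]


def storage_decode(observations: List[Observation]) -> List[int]:
    """Read the bit from the response and ignore the timing."""
    return [1 if response == "R1" else 0 for response, _ in observations]


def timing_encode(bits: List[int], short: float = 1.0,
                  long: float = 2.0) -> List[Observation]:
    """Timing channel: the response is constant, the duration varies."""
    return [("R0", long if bit else short) for bit in bits]


def timing_decode(observations: List[Observation],
                  threshold: float = 1.5) -> List[int]:
    """Read the bit from the duration and ignore the response."""
    return [1 if duration > threshold else 0 for _, duration in observations]


message = [1, 0, 1, 1, 0]
print(storage_decode(storage_encode(message)))
print(timing_decode(timing_encode(message)))
```

A mixed channel, in the sense of the same quote, would vary both fields of the observation at once.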
1.3 Covert channels parameters
------------------------------
[NCSC_1993] introduces various distinct parameters to characterize covert
channels: Noise, Bandwidth/Capacity, Synchronization and Aggregation.
"A covert channel is noiseless if any bit transmitted by a sender is decoded
correctly by the receiver with probability 1." [NCSC_1993]
Bandwidth is used "to denote the rate at which information is transmitted
through a channel." "In a covert channel context, bandwidth is given in
bits/second" and "is also related to the notion of 'capacity' [...] maximum
possible error-free information rate in bits per second."
Note that these two parameters are linked because "error-correcting codes help
change a noisy channel into a noiseless one" but "the resulting channel will
have a lower bandwidth than the similar noise-free channel". [NCSC_1993]
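The link between noise and bandwidth can be sketched with the simplest error-correcting code, a repetition code. This is only an illustration under assumed conditions (at most one flipped bit per block of three, a made-up raw rate); [NCSC_1993] does not prescribe any particular code.

```python
from typing import List


def repetition_encode(bits: List[int], n: int = 3) -> List[int]:
    """Send every bit n times."""
    return [bit for bit in bits for _ in range(n)]


def repetition_decode(coded: List[int], n: int = 3) -> List[int]:
    """Majority vote over each block of n received bits."""
    blocks = [coded[i:i + n] for i in range(0, len(coded), n)]
    return [1 if sum(block) * 2 > n else 0 for block in blocks]


def flip_one_bit_per_block(coded: List[int], n: int = 3) -> List[int]:
    """Deterministic 'noise' for the demo: corrupt the first bit of each block."""
    noisy = list(coded)
    for start in range(0, len(noisy), n):
        noisy[start] ^= 1
    return noisy


def effective_rate(raw_bits_per_second: float, n: int = 3) -> float:
    """Useful rate once every bit is repeated n times."""
    return raw_bits_per_second / n


message = [1, 0, 1, 1]
received = flip_one_bit_per_block(repetition_encode(message))
print(repetition_decode(received) == message)  # the single errors are corrected
print(effective_rate(9.0))                     # a third of the raw rate remains
```

The decoded message is error-free, but only one transmitted bit in three carries information, which is the bandwidth cost the quote refers to.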
//------------------------------------------------------------------------\\
Also take care because "the bandwidth is a separate characteristic of a
continuous channel and the capacity is in fact a function of the bandwidth !
We should not reinvent the wheel and use the standard terminology that
already exists." [CC_Here2Stay_1994]. Ok ok, no problem :)
Another concept worth mentioning (for history?) is the Small Message
Criterion concept stating that "If one has a very sensitive but short
message then the capacity is not a sufficient measure of security. [...]
When a covert channel exists in a system, the SMC will give guidelines for
what will be tolerated in terms of covertly leaking a short covert message
of length n bits in time 'tau' with fidelity of transmission p%. The SMC
must be used in conjunction with capacity for a full security
analysis/validation of a system". [CC_Here2Stay_1994]
\\------------------------------------------------------------------------//
The synchronization relationship between the two ends of the communication
channel allows one end to notify the other that it has completed reading or
writing data. If sender and receiver exchange synchronization messages in
both directions, synchronization and data messages may be indistinguishable.
[NCSC_1993]
Messages sent or received by either end of the communication channel may use
multiple data variables, which can be used as a group to amortize the cost of
synchronization. Communication "channels may thus be aggregated serially, in
parallel, or in combinations of serial and parallel aggregation to yield optimal
(maximum) bandwidth." [NCSC_1993]
Other, less academic (?) parameters may be used to characterize covert
channels: latency and stealthiness.
The Webster's Revised Unabridged Dictionary (1913) states "stealthiness - the
state, quality, or character of being stealthy; stealth." and the Free On-line
Dictionary of Computing states "latency - the time it takes for a packet to
cross a network connection, from sender to receiver." [2]
Latency and stealthiness obviously depend on the previous parameters, and
adjusting their level seems more empirical (how to achieve the best
latency/stealthiness trade-off) than a matter of exact figures (latency level:
x%, stealthiness level: y%, luck level: z%).
1.4 Covert channels and steganography
-------------------------------------
[InternetSteg_ActWard_2002] "[...] describes a subliminal channel as one
where hidden data piggybacks on an innocuous-looking legitimate communication.
By definition, steganographic carriers are subliminal channels since the
communication appears to be innocent but really has ulterior information
embedded below the threshold of perception".
//------------------------------------------------------------------------\\
[InternetSteg_ActWard_2002] also presents models and concepts for restricted
environments to use active wardens in order to block the creation of
subliminal channels allowing relatively high-bandwidth leakage of
information. It introduces a concept named "Minimal Requisite Fidelity"
(MRF) that "defines the degree of signal fidelity that is both acceptable
to end users and destructive to covert communications" and classifies two
kind of carriers: unstructured and structured (respectively for images,
audio or anything needing human interpretation and any "well-defined syntax
and semantics" instances).
It states that "while there are several techniques currently in use that
reactively attempt to detect steganography in images, this is
understandably an impossible task to complete" and adds that for
"unstructured carriers, the limits to what can be changed [in order to
remove opportunity to build covert channels] are defined by fuzzy notions
such as perception".
It also notes that digital watermarking sometimes puts more emphasis on
robustness than on secrecy, so that a watermark would be proportionally
easier to detect. Good news, isn't it? The warden will detect a watermark, but
will it try to detect whether this watermark is itself a carrier?
\\------------------------------------------------------------------------//
[Embed_CC_TCPIP_2005] summarizes that "steganography can only be prevented by
detection, not by attempting to remove any hidden information [...]" ("passive
warden threat model") because being active costs a warden too many resources
in many scenarios. A warden may be active at some low OSI layers, but this is
hardly practical at high OSI layers.
1.5 Network covert channels
---------------------------
For a little bit of history, [NCSC_1987] extends [DOD_1985] to "trusted
network systems and components" and cites Padlipsky (1978) and Girling (1987)
for literature references.
Resources for this topic are available in [Rowland_1996] which "details
various weaknesses in the TCP/IP protocol suite" that "allow an attacker to
leverage techniques in the form of covert channels to surreptitiously pass data
in otherwise benign packets.". [PractDH_2002] discusses methods to hide data in
the TCP/IP protocol suite and the [CC_TCPIP_Hdr_2002] presentation deals with
covert channels related to TCP and IP Headers.
[Embed_CC_TCPIP_2005] studies "a number of previously proposed schemes for
embedding data within the TCP and IP protocol headers, thus creating a
steganographic covert channel". The paper shows "how the use of these schemes
can easily be detected by a passive warden" and proposes an "alternative method
for embedding data" inside the TCP initial sequence number (ISN) header field so
that "a passive warden cannot detect the use of this method without knowledge of
a secret key, subject to some realistic constraints."
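To make the idea of a header-field storage channel concrete, here is a minimal in-memory sketch that packs payload bytes into 32-bit values, the width of a TCP sequence number. It is not the keyed method of [Embed_CC_TCPIP_2005]: there is no secret key and no attempt to mimic the values a real TCP stack would generate, so it presumably belongs to the naive family that the paper describes as easily detected. The function names are hypothetical and nothing is sent on the network.

```python
from typing import List

FIELD_BYTES = 4  # a TCP sequence number is 32 bits wide


def embed(payload: bytes) -> List[int]:
    """Split the payload into 32-bit values, one per (simulated) header field."""
    values = []
    for start in range(0, len(payload), FIELD_BYTES):
        # Pad the last chunk with zero bytes so every value is exactly 32 bits.
        chunk = payload[start:start + FIELD_BYTES].ljust(FIELD_BYTES, b"\x00")
        values.append(int.from_bytes(chunk, "big"))
    return values


def extract(values: List[int], length: int) -> bytes:
    """Rebuild the payload from the field values and drop the padding."""
    data = b"".join(value.to_bytes(FIELD_BYTES, "big") for value in values)
    return data[:length]


fields = embed(b"hello")
print(len(fields))
print(extract(fields, 5))
```

Each field carries at most four bytes, so the capacity of such a channel is bounded by the number of new connections, one reason why bandwidth and stealthiness have to be traded against each other.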
================================================================================
=========================================================
2. NETWORK COVERT CHANNELS IN A GRAY WORLD: WHY AND WHO ?
=========================================================
The nature of a covert channel obviously depends on what it is to be used for.
For example, there is little point in using a high-bandwidth, low-latency covert
channel if we only need to send 1024 bytes in the next 24 hours, is there? In
other words, before selecting any communication channel for covert traffic, we
need to review our goals, look for the available communication channels and
decide which ones offer the best trade-off in terms of
bandwidth/latency/stealthiness.
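The 1024-bytes-in-24-hours example can be turned into a quick back-of-the-envelope calculation. The sketch below assumes three hypothetical candidate channels with made-up rate, latency and stealth scores; only the required-rate arithmetic comes from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    rate_bps: float   # sustainable rate in bits per second
    latency_s: float  # typical delivery delay in seconds
    stealth: int      # subjective score, higher means stealthier


def required_rate_bps(payload_bytes: int, deadline_s: float) -> float:
    """Average rate needed to move the payload before the deadline."""
    return payload_bytes * 8 / deadline_s


def pick(candidates: List[Candidate], needed_bps: float,
         max_latency_s: float) -> Optional[Candidate]:
    """Among the channels that are fast enough, prefer the stealthiest one."""
    usable = [c for c in candidates
              if c.rate_bps >= needed_bps and c.latency_s <= max_latency_s]
    return max(usable, key=lambda c: c.stealth, default=None)


# Hypothetical figures, for illustration only.
CANDIDATES = [
    Candidate("channel_a", rate_bps=1_000_000.0, latency_s=0.1, stealth=1),
    Candidate("channel_b", rate_bps=1.0, latency_s=600.0, stealth=4),
    Candidate("channel_c", rate_bps=0.01, latency_s=60.0, stealth=5),
]

needed = required_rate_bps(1024, 24 * 3600)
print(round(needed, 4))  # less than one bit every ten seconds
print(pick(CANDIDATES, needed, max_latency_s=3600.0).name)
```

With such a small requirement the slow but stealthier channel wins; the fast channel would only be selected if the required rate were several orders of magnitude higher.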
Reaching an efficient trade-off while designing how to embed our channel
means answering what we want to do (i.e. who we are and why we want to
achieve this goal) and how we want to do it.
Answering the "who?" question isn't as obvious as it seems at first. There
may be as many different 'who' as there are people willing to design their own
covert channel implementation. You may be a legitimate insider who wants to
access external services without bothering with your local network security
policy (for a short period or longer), you may be some not so legitimate person
(thanks for patching my systems so that you're the only one with access ;)) who
wants to keep stealthy remote access to one or several corporate or personal
systems, you may be some fully legitimate person willing to implement stealth
communication methods between systems, or you may be anyone who is tainted and
not so sharp about definitions.
Do we need to answer the "why?" question? You may want to tunnel some protocol,
to continuously download a very large amount of data (db snapshots? ;)), or to
protect legitimate communication streams (honeynet? or "Mr director, I admit
this system wasn't part of our honeynet but it was still a good idea to set up
that communication channel"). You may also answer that "why" by figuring out
some "what" ideas of your own (we didn't say play, did we?).
//------------------------------------------------------------------------\\
Stealth Commander
Suppose we have already compromised a host within a remote network. Depending
on our goals, we may want to use a uni-directional communication channel or
bi-directional communication channels. We can use a source -> destination
uni-directional channel if we only need to send commands to the compromised
host (Are you up?, Attack, Stop attack). We can use a destination -> source
uni-directional channel if we only need to get data from the compromised
host (such as retrieving sniffer traces each day). Or we can decide we want
full control over the compromised host and then use a bi-directional
channel.
For a concrete example of uni-directional 'Are You Up' covert channels,
check the [WLAN_STEALTH_2005] paper, which presents an application of the
port knocking concept to WLAN environments.
Depending on these aspects, we can choose the communication channel best
suited to carry our covert channel and focus on the best trade-off between
the bandwidth, latency and stealthiness parameters.
Battlefield preparation
Suppose we have already compromised a host within a remote network and that
the host is now compromising its neighbours. We don't really need to use any
communication channel while the battlefield is automatically prepared
(remember all these worms crawling the Internet ?). Once this battlefield is
ready, we directly benefit from a multiple sources advantage - that
advantage being usable between the compromised hosts themselves, between
each compromised host and the operator and between the set of compromised
hosts and the operator.
Enrolling unwitting soldiers
[Unwitting_2003] describes a solution to enrol unwitting end users in
covert channel communications. Each time someone browses an Internet
website, the remote server can use various fields of the HTTP protocol to
carry specific information that the user will forward to another server
without knowing it. The presented model claims to provide unobservability
(i.e. "that an observer cannot tell if messages are being sent or received
at all"). Remember all these vulnerabilities in client-side applications and
all these "bad" remote servers on the wild W3? Looks like the tip of the
iceberg, doesn't it?
Automated exit
Suppose we have compromised a host within a remote network. It is quite
difficult to know in advance which communication channels will be the best
to use. So let our host learn about its environment and then decide for itself
which covert channels to use.
\\------------------------------------------------------------------------//
The IT world is evolving day after day and, whatever one may say, it is
possible to [build|rent|buy] wide-scale networks of resources these days.
Preserving relative anonymity being quite easy too, the main problem is to
define the communication methods that will link the network components
to each other.
The usual definition of a covert channel given in 1.1 states that
"a covert channel is a communication channel that allows a process to transfer
information in a manner that violates the system's security policy". Applying
that academic definition is sometimes complicated. One can ask, for
example, "what's the system's security policy for an international botnet?"
<sarcasm>Being pedantic, maybe building stealth communication channels to link
several botnets is not building covert channels because no communication
channel exists? Because there's no system security policy to deal with?
Maybe network covert channels didn't exist between 1973 and 1978, and
perhaps we should thank the NCSC for having extended the DoD TCSEC criteria
related to CC to networks in 1987, because no one was able to discuss network
covert channel concepts from the 1985 definitions.</sarcasm>
You know what? That would seem perfect, because if we can use stealth channels
(ok, we admit it, they're not covert channels..) it means that academic research
on covert channel detection will not bother us, will it?
But wait a minute, the botnet owner has his own security policy and his own
communication channels, doesn't he?
If the owner gets caught because of [wiki your favorite version: storage,
DDOS, CC, ISN, suggestion?], would the communication channels be analyzed for
covert channels? After all, covert channels are not about technical means but
about information and about what that information means... Always funnier to
look at the puppet master, no? ;)
================================================================================
=====================================================================
3. COMMUNICATION CONCEPTS FOR NETWORK COVERT CHANNELS IMPLEMENTATIONS
=====================================================================
So, if covert channels are communication channels that are neither designed nor
intended to exist, then we must design a way to embed our communication
streams inside authorized channels. The main question is how to do this.
Covert channels may be based on nearly all existing protocols, from low OSI
layer ones such as IP, TCP, UDP and ICMP to high OSI layer ones such as HTTP,
SMTP, etc. However, we can only use protocols authorized by the NACS (Network
Access Control System) and we first have to decide on the trade-off we will
accept regarding reliability and stealthiness.
3.1 Control / Data Channels
---------------------------
There is no academic definition of what a control channel has to be. We may
state that control channels carry the information required to handle the data
flows from one point to another: establishing communication flows and keeping
them up while taking care of the bandwidth, latency and stealthiness parameters.
The control operations themselves are relatively short pieces of information
such as: open/close the data channel, increase/decrease bandwidth on a specific
channel, interrupt data communication, switch to another control channel type,
etc. A specific initialization handshake procedure may sometimes be sufficient
to handle data channels for a certain amount of time without having to send
information over the control channel. These procedures may include various
parameters such as the type of compression/ciphering per data channel or
advanced parameters such as (de)multiplexing on the fly or what to do when
the bandwidth/latency/stealthiness parameters reach a certain threshold.
The control channel(s) may be based on unidirectional tunnel(s) (only sending
packets from outside to inside or vice versa) or on more sophisticated
configurations that use asynchronous methods and various distinct protocols,
long sleeping delays, environment learning methods, etc.
As the control channel(s) have to be as stealthy as possible, any activity that
deviates from standard behaviour (permanent sessions, generation of huge
amounts of data or packets, etc.) should be avoided. And because we will never
send large amounts of data through the control channel, we may forget about its
bandwidth parameter and focus on the stealthiness and latency ones.
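Since control operations are short, they fit in a few bytes. The following sketch shows one possible fixed-size encoding; the opcode values, the 4-byte layout and the function names are assumptions made for illustration and do not describe any of the tools presented later.

```python
import struct
from enum import IntEnum
from typing import Tuple


class Op(IntEnum):
    """Control operations listed in the text, with arbitrary numeric codes."""
    OPEN_DATA = 1
    CLOSE_DATA = 2
    INCREASE_BANDWIDTH = 3
    DECREASE_BANDWIDTH = 4
    INTERRUPT = 5
    SWITCH_CONTROL = 6


# opcode (1 byte), channel id (1 byte), argument (2 bytes), network byte order
_FORMAT = "!BBH"


def pack_control(op: Op, channel: int, argument: int = 0) -> bytes:
    """Serialize one control message into exactly four bytes."""
    return struct.pack(_FORMAT, op, channel, argument)


def unpack_control(raw: bytes) -> Tuple[Op, int, int]:
    """Parse a four-byte control message back into its fields."""
    op, channel, argument = struct.unpack(_FORMAT, raw)
    return Op(op), channel, argument


message = pack_control(Op.INCREASE_BANDWIDTH, channel=2, argument=512)
print(len(message))
print(unpack_control(message))
```

Messages this small are what makes it reasonable to ignore the bandwidth of the control channel and optimize for stealthiness and latency instead.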
--
The data channels are reliable communication channels that can be used to
transfer information from one side to another. The design of the data channels
may also focus on stealthiness if the bandwidth requirements are not high, but
if we know these channels will carry huge data traffic peaks or flows, we know
we will have to choose between a longer transfer time and a higher risk of
being detected.
3.2 Multiplexing / Demultiplexing
---------------------------------
We know that covert channels are based on communication channels that are
legitimate for the NACS. Therefore, it is (quite) often possible to use several
types of communication channels simultaneously to carry our covert channels
(each channel type having its own requirements in terms of
bandwidth/latency/stealthiness). We presented this notion as "Demultiplexing"
in [2] while it is presented as "Aggregation" in [NCSC_1993] (do we really care
whether to call this concept potato or potado? after all, we're discussing
hiding topics, no? ;)).
//------------------------------------------------------------------------\\
        local network              |NACS|              Internet
                                     ||
                      ------------CC_type1------------
                     /               ||               \
      Application   /-------------CC_type2-------------\   Application
            \      /                 ||                 \      /
            Client <------------Data Channel------------> Server
                   \                 ||                 /
                    \-------------CC_type3-------------/
                                     ||
\\------------------------------------------------------------------------//
[2]
For example, several communication channels over the HTTP, ICMP and SMTP
protocols may be used simultaneously in order to improve the stealthiness of
the control channel (the communication methods may change from time to time,
randomly, after an environment learning period, etc.).
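The demultiplexing idea can be sketched in memory: one stream is cut into sequence-numbered chunks that are spread over several carriers and put back in order on the other side. The round-robin policy, the chunk size and the carrier labels are assumptions for illustration; the carriers here are plain Python lists, not network connections.

```python
from typing import Dict, List, Tuple

Chunk = Tuple[int, bytes]  # (sequence number, payload)


def spread(data: bytes, carriers: List[str],
           chunk_size: int = 4) -> Dict[str, List[Chunk]]:
    """Distribute numbered chunks over the carriers in round-robin order."""
    queues: Dict[str, List[Chunk]] = {name: [] for name in carriers}
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    for seq, chunk in enumerate(chunks):
        queues[carriers[seq % len(carriers)]].append((seq, chunk))
    return queues


def reassemble(queues: Dict[str, List[Chunk]]) -> bytes:
    """Merge what arrived on every carrier and restore the original order."""
    received = [item for queue in queues.values() for item in queue]
    return b"".join(chunk for _, chunk in sorted(received))


queues = spread(b"control message", ["HTTP", "ICMP", "SMTP"])
print({name: len(queue) for name, queue in queues.items()})
print(reassemble(queues))
```

The sequence numbers are what allow the carriers to deliver at different speeds, or to be swapped from time to time, without corrupting the stream.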
--
But sometimes, multiple connection channels may puzzle NACS administrators.
In this situation, we consider multiplexing several Data and/or Control
channels over a single communication channel. Note that doing so has to be
considered case by case. We can lower the latency of our channels if we use a
high-bandwidth channel, but we would need to increase the "permanent" parameter
if we multiplex control and data channels, or we would risk being uncovered
because of abnormally high bandwidth usage.
//------------------------------------------------------------------------\\
        local network              |NACS|              Internet
                                     ||
   Appl. 1                           ||                           Appl. 1
           \                         ||                         /
             Client <-------Multiplexed Channel--------> Server
           /                         ||                    |    \
   Appl. 2                           ||                    |      Appl. 2
                                     ||                     \
                                     ||                       Appl. 3
                                     ||
\\------------------------------------------------------------------------//
[2]
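Multiplexing several application streams over one channel needs some framing so that the far end can separate them again. The sketch below uses a hypothetical 3-byte header (stream id and payload length); it works on byte strings in memory and is not taken from any of the tools described in this paper.

```python
import struct
from typing import Dict, List, Tuple

_HEADER = "!BH"  # stream id (1 byte), payload length (2 bytes)
_HEADER_SIZE = struct.calcsize(_HEADER)


def mux(frames: List[Tuple[int, bytes]]) -> bytes:
    """Concatenate (stream id, payload) frames into a single byte stream."""
    return b"".join(struct.pack(_HEADER, stream_id, len(payload)) + payload
                    for stream_id, payload in frames)


def demux(stream: bytes) -> Dict[int, bytes]:
    """Walk the byte stream frame by frame and rebuild each logical stream."""
    streams: Dict[int, bytes] = {}
    offset = 0
    while offset < len(stream):
        stream_id, length = struct.unpack_from(_HEADER, stream, offset)
        offset += _HEADER_SIZE
        payload = stream[offset:offset + length]
        streams[stream_id] = streams.get(stream_id, b"") + payload
        offset += length
    return streams


wire = mux([(1, b"ls "), (2, b"GET /"), (1, b"-l")])
print(demux(wire))
```

Every frame adds three bytes of overhead, which is part of the extra bandwidth usage that has to be weighed against the simplicity of a single channel.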
3.3 Communication architectures
-------------------------------
Covert channel communication architectures should not be thought of as
usual ones because their primary goal is to remain covert long enough for us
to accomplish our task. To do so, we may use standard client/server and
peer-to-peer architectures or design multi-level architectures (with levels of
legitimate/noise-generating/non-existent/etc. intermediaries and/or
destinations/sources - see 4. PLAYING THE GAME VS THE DETECTION TEAM).
A client/server architecture separates clients from servers. Their setup and
operating modes are quite different. Generally, the client(s) send requests to
the server(s), which compute and return a result. Such an architecture is
scalable as long as the client(s) (that is: a user, a process, a station, a set
of stations) have access to at least one server.
In a P2P architecture, all participants operate as both client and server,
whether these two modes are used simultaneously, one after another, only after
synchronization or after/before a specific event. Such an architecture lets us
design covert channel communication modes between more than just two parties
without having to bother with dedicated servers.
Participants in the architecture (be it client/server or P2P) may, of
course, know nothing more than how to learn how to contact other
participants. The external communication core may thus be "transient" and
dynamically change its topology and access points.
3.4 Communication models
------------------------
Various communication models may be used for covert channels. The simplest
model is based on a single point-to-point connection. This "Direct" model
assumes that a server component is running on the external network and that the
client component opens a communication channel through the NACS.
//------------------------------------------------------------------------\\
                          1                2
            CC Client <-------> NACS <----------> CC Server
     <___internal_network___> Internet <___external_networks___>
\\------------------------------------------------------------------------//
Implementation of this model is simple but the possibilities are quite
restricted. Server and client can only execute what they were designed for. It
is, however, sometimes quite enough (see [AckCmd] for an example of a remote
shell).
--
In order to use the covert channel for different types of data, it is
possible to use a "Proxy" model. Proxy components accept data streams from
clients and servers and act as intermediaries without caring about the kind of
data they transmit. This model makes it possible to handle various distinct
applications while using one (or multiple) types of covert channels:
//------------------------------------------------------------------------\\
 SSH client   1                2           2                3    SSH server
 Web browser<---> CC Client <------ NACS -----> CC proxy <-----> HTTP Daemon
 IM client                                                       IM login
 <__________internal_network___________> Internet <___external_networks___>
\\------------------------------------------------------------------------//
This "Proxy" model is the most popular, and almost every program implementing
tunnels or covert channels uses that scheme.
--
But sometimes the previous models are not satisfactory because we want the
NACS-protected services to open the communication channels themselves. This
"Reverse communication" model implies that the server components themselves
initiate the communication channels from the protected network to the external
one and then wait for requests. This model applies to the client/server model:
//------------------------------------------------------------------------\\
                        2                     1
     CC Client <------------------> NACS <---------> Reverse CC Server
     <_external_networks_> Internet <________internal_network________>
\\------------------------------------------------------------------------//
and also applies to the proxy model:
//------------------------------------------------------------------------\\
             2           3           1      1                  4
 SSH Client                                                      SSH server
 Web browser<->CC_Client<->CC_server<->NACS<--Reverse_CC_proxy-->HTTP daemon
 IM client                                                       IM login
 <____external_networks____> Internet <__________internal_network__________>
\\------------------------------------------------------------------------//
================================================================================
=========================================
4. PLAYING THE GAME VS THE DETECTION TEAM
=========================================
The communication channel is up and ready to be used to forward data streams.
We may now focus on playing the game vs the detection team and their ability
to detect and/or interrupt our communication.
The rules of engagement for the detection team are theoretically quite simple.
They may try to detect exceeded thresholds in the network or transport
layers (see [tcpstatflow]) or detect specific signatures of the tools used to
build the communication channel (see [any signature-based IDS]). They may try
to detect protocol anomalies generated by the tools (see [WebTap]) or try to
learn the "network behaviour" and then use statistical methods to determine
whether the observed data streams look more or less suspicious (see [WebTap]
and the [3] implementation).
The detection team may also use a Network Security Monitoring (NSM) model
along with standard IDS technologies. [Integ_NSM] and [NetSec_OpenSrc] describe
NSM which "involves to collect, analyze and increase indications and warnings
to detect and respond to intrusions". [NetSec_OpenSrc] states that "Within the
context of NSM indicators are outputs from products which are created by IDS
and are usually referred to as alerts. Trained people who may be referred to as
analysts should be engaged in interpreting intrusions. The interpretation of
indicators results in warnings. Warnings are human conclusions which indicate
to decision makers that a network may have been compromised."
Basically, the detection team will try to automatically detect anomalies they
can conceive and model, or they will try to store "enough" data for
a human being to search for and detect suspicious activities.
--
Understanding that the detection team needs rules is the key to this game. They
need to limit false positives (that is, alerts raised on suspicious-looking
data streams when there is nothing but legitimate traffic) and they need to
keep a low ratio of false negatives (no alert raised for suspicious traffic).
So if the detection team has no rule to detect our communication channel, or
has no way to set up the related rule (because it would be too expensive in
terms of system resources, money, false positives, ...), then our communication
channel may be safe.
Several methods make it harder to detect that we are using a communication
channel to embed our data streams. The first strategy is to confuse the analyst
with multiple sources and destinations. A second type of method relies on
learning and/or using the behaviour of the environment in order to keep a
standard profile. And finally, what if we do not care at all whether someone
detects something suspicious? ;)
//------------------------------------------------------------------------\\
"I am getting rather tired of "everything over port 80" and calling
everything a firewall this or firewall that. Getting into a world where you
have a so called "firewall" for every type of service that goes over port
80 or you have to somehow try and manage to block it in your proxy while
still trying to allow the rest is insane." [80_insane]
\\------------------------------------------------------------------------//
4.1 Confusing the analyst with multiple sources and destination
---------------------------------------------------------------
Several distinct models to multiply the number of destinations are presented
in [1]. Some involve multiple transit servers that accept packets of data
before sending them on to the final destinations, while another introduces the
notion of sending traffic to legitimate destinations each time the
communication channel has to be used. A last model uses legitimate third-party
components to store and retrieve information (see [ErrnoJones] for an
application of this model to the HTTP protocol and [DNSCC_UnpubPhrack] for an
application to the DNS protocol).
It is also possible to multiply the number of sources. Using the same models,
the distinct sources can be used alternately when sending data through the
NACS, and/or some of them may send only legitimate data in order to fool the
detection engine or to increase the volume of data it has to inspect or store
for further analysis.
It is finally possible to use nonexistent sources and/or destinations if the
recipient is known to be on the path, if the goal is to increase the amount of
confusing traffic, or even if the destinations themselves are considered to
represent the information to transmit (the most obvious scheme being that a
destination represents a bit of information).
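The "one destination, one bit" scheme can be sketched as follows. This is only
an illustration, not code from [1] or from any tool discussed here: the two
addresses (taken from the documentation range) and the function names are
assumptions, and nothing is sent on the wire; only the mapping between a
message and a sequence of destinations is shown.

```python
from typing import List

# Hypothetical destinations: contacting DEST_ZERO signals a 0 bit,
# contacting DEST_ONE signals a 1 bit.
DEST_ZERO = "192.0.2.10"
DEST_ONE = "192.0.2.11"


def encode(message: bytes) -> List[str]:
    """Map each bit of the message (most significant first) to a destination."""
    dests = []
    for byte in message:
        for shift in range(7, -1, -1):
            bit = (byte >> shift) & 1
            dests.append(DEST_ONE if bit else DEST_ZERO)
    return dests


def decode(dests: List[str]) -> bytes:
    """Rebuild the message from the observed sequence of destinations."""
    out = bytearray()
    usable = len(dests) - len(dests) % 8  # ignore a trailing partial byte
    for i in range(0, usable, 8):
        byte = 0
        for dest in dests[i:i + 8]:
            byte = (byte << 1) | (1 if dest == DEST_ONE else 0)
        out.append(byte)
    return bytes(out)
```

An observer located on the path only needs the ordered list of destinations
to call decode(); the packets themselves may carry perfectly ordinary content.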
4.2 Keeping a low profile
-------------------------
This model is presented in [2] as a method to learn which communication
channel to use given its environment. As we know the detection team is
playing the behavioural learning game (see [WebTap], [3]), there is no reason
for us not to play the same game. The right communication channel may thus be
based on various distinct protocols that are used depending on what the
source is authorized to do.
Another way to keep a low profile is not to send any superfluous traffic at
all and only surreptitiously alter part of legitimate data streams in order
to send and receive information (see [PassiveCClinux]). This solution is, of
course, more interesting if you own the intermediary devices between clients
and servers, so that your proprietary binary client code can send you what it
is not supposed to send at all (an example? The famous mail synchronisation
from wherever you are with your PDA ;)). This concept is well known in the
cryptography research area: [RSA_CC], [BH_BkDoor_2005].
4.3 Live but let die
--------------------
This concept is presented in [2]. It is based on the fact that the covert
channel may be built upon control and data channels, each type of channel
having distinct needs in terms of bandwidth, latency and stealthiness. As the
data channel's bandwidth and latency requirements may push it above specific
detection thresholds, it is quite feasible to think about a solution that will
try its best to cover and keep the control channel alive (even if it means
keeping it silent and waiting for better times) while not caring at all about
losing data channels. Another application is the case where it does not matter
that someone detects the communication channel, because by then it is too late
(a keylogger sending passwords, see [IcmpKeylog], for example).
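As a rough sketch of the first policy (protect the control channel, sacrifice
the data channels), assuming a purely hypothetical Channel record and a
function that is called whenever detection is suspected; none of these names
come from [2]:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Channel:
    name: str
    is_control: bool
    alive: bool = True
    silent: bool = False


def apply_live_but_let_die(channels: List[Channel]) -> List[Channel]:
    """React to a suspected detection: protect control, sacrifice data."""
    survivors = []
    for channel in channels:
        if channel.is_control:
            # Keep the control channel, but stay quiet until better times.
            channel.silent = True
            survivors.append(channel)
        else:
            # Data channels are expendable: let them die.
            channel.alive = False
    return survivors
```

How detection is suspected, and when a silent control channel may speak again,
is deliberately left out of this sketch.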
//------------------------------------------------------------------------\\
Note that [NCSC_1993] states that "Transient covert channels are those
which transfer a fixed amount of data and then cease to exist. Normally,
bandwidth and capacity calculations apply only to channels that are
sustainable indefinitely. Thus, it would seem transient channels are an
irrelevant threat."
\\------------------------------------------------------------------------//
================================================================================
============================
5. PRACTICAL IMPLEMENTATIONS
============================
Designing your covert channel implementation is the most interesting part.
Just free your mind before starting, 'cause now you may do anything you want and
nobody will stop you (and hopefully not the NACS ;)).
You may use various protocol types like IP, ICMP, TCP, HTTP, IRC, DNS, RTSP,
etc. and various program types like full client/server or p2p models, kernel
modules, CGI programs, self-replicating applications, injectors into running
processes or installed applications, etc.
And should you ask "Why should I implement anything?", we may quote J.B., who
so rightly said about rootkit detection: we "had already defeated this
detection mechanism before your released [the detection engine]. See I knew you
or someone was going to do this [...] It is kinda non-climactic to create
"solutions" for problems that don't exist yet [...]" [Rootkits_discussion].
================================================================================
==========================================================
6. PRACTICAL IMPLEMENTATIONS: ACTIVE PORT FORWARDER [apf]
==========================================================
6.1 Description
---------------
Active port forwarder is a software tool that implements several reverse
tunneling techniques (RTT). It is designed for people without an external IP who
want to make some services available on the Internet.
The application is divided into two parts: afserver is placed on the machine
with a publicly accessible address, and afclient is placed on the machine
behind a firewall or masquerade.
When the tunnel between the two APF parts is established, all the connections
received by afserver are forwarded via afclient to the proper destination. The
whole communication is secured by the use of SSL. Larger chunks of data are
compressed with zlib.
However, APF is not intended to hide its presence. The priority is to achieve
high bandwidth and reasonably low latency. Moreover, users are not starved:
the bandwidth is distributed quite fairly between them.
6.2 Implemented Techniques
--------------------------
6.2.1 Direct tcp connections
A direct tcp connection is used to create a permanent data/control channel
which, with flow control and packet buffering, provides good performance and
reasonably low latency. This type of channel is rather easy to detect, because
long-lived connections are rare.
Suppose we want to make our sshd server publicly available and the default
behaviour of APF suits us. The whole procedure is very simple.
On the remote host we have to type:
//------------------------------------------------------------------------\\
user@remotehost> afserver
\\------------------------------------------------------------------------//
And on the local machine:
//------------------------------------------------------------------------\\
user@localmachine> afclient -n remotehost -p 22
\\------------------------------------------------------------------------//
After this, all the connections to remotehost:50127 will be forwarded to
localmachine:22.
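The forwarding step itself can be pictured as a byte pump between two
connected sockets. The sketch below is not APF's code: it assumes plain,
already-connected sockets, leaves out the SSL, compression and flow control
described above, and the names pump and relay are hypothetical.

```python
import socket
import threading


def pump(src: socket.socket, dst: socket.socket, bufsize: int = 4096) -> None:
    """Copy bytes from src to dst until src reaches end of stream."""
    while True:
        data = src.recv(bufsize)
        if not data:
            break
        dst.sendall(data)
    # Propagate the end of stream so the peer sees the close as well.
    try:
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass


def relay(a: socket.socket, b: socket.socket) -> None:
    """Forward traffic in both directions between two connected sockets."""
    worker = threading.Thread(target=pump, args=(a, b), daemon=True)
    worker.start()
    pump(b, a)
    worker.join()
```

In a reverse tunnel, one of the two sockets would be the connection accepted
on the public side and the other the tunnel towards the hidden service; relay()
returns once both directions have been closed.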
6.2.2 HTTP/HTTPS proxies
When we can't use direct tcp connections due to the local network security
policy, we can try to work around the limitations by using HTTP/HTTPS proxies.
Active port forwarder can encapsulate the messages into valid proxy queries
and HTTP server answers. Moreover, an afserver waiting for HTTP packets can
still accept direct tcp connections.
Suppose we want to make our sshd server publicly available through an HTTP
proxy (located at httpproxy:8080). The default behaviour of APF once again
suits us. The whole procedure is only slightly more complicated.
On the remote host we have to type:
//------------------------------------------------------------------------\\
user@remotehost> afserver -P
\\------------------------------------------------------------------------//
And on the local machine:
//------------------------------------------------------------------------\\
user@localmachine> afclient -n remotehost -p 22 -P httpproxy
\\------------------------------------------------------------------------//
After this, all the connections to remotehost:50127 will be forwarded to
localmachine:22.
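To give an idea of what "encapsulating the messages into valid proxy queries"
can mean, here is a minimal sketch that wraps a tunnel payload in an HTTP/1.1
POST request and extracts it again. The actual APF wire format is not
described in this paper, so the choice of POST, the headers and the function
names are all assumptions made for illustration.

```python
def wrap_in_http_request(payload: bytes, host: str, path: str = "/") -> bytes:
    """Wrap a tunnel payload in a syntactically valid HTTP/1.1 POST request."""
    head = (
        # Absolute-form request target, as sent to an HTTP proxy.
        f"POST http://{host}{path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/octet-stream\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + payload


def unwrap_http_message(message: bytes) -> bytes:
    """Extract the body of an HTTP message using its Content-Length header."""
    head, _, rest = message.partition(b"\r\n\r\n")
    length = None
    for line in head.split(b"\r\n")[1:]:  # skip the request/status line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("no Content-Length header")
    return rest[:length]
```

The answer path would use the same idea with an HTTP response in place of the
request.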
================================================================================
==============================================
7. PRACTICAL IMPLEMENTATIONS: SKEEVE [skeeve]
==============================================
7.1 Description
---------------
Skeeve is a software tool that can easily create an ICMP tunnel between two
computers, which may be located in different networks and separated by a
firewall. The tunnel is based on the use of a Bounce Server (the method relies
upon the basic IP address spoofing methodology).
7.2 Implemented techniques
--------------------------
Skeeve Client accepts TCP connections and works as a converter for the IP
header (changing the protocol field from TCP to ICMP echo_request|reply and
making some other slight modifications). Skeeve Server does the reverse and
restores the original IP header settings. Both parts are implemented in a
single C program as a loadable kernel module.
The same scheme is used for data in the reverse direction. There are only a
few conditions:
- Bounce Server must be able to communicate with both the client and server
- Bounce Server has to accept IP packets with a spoofed IP address
- Bounce Server has to accept ICMP echo_request|reply packets
//------------------------------------------------------------------------\\
 TCP Client(s)                                         TCP Server(s)
       |           (1)                      |  (2)           |
+--------------+ ------> +----------------+ | -----> +----------------+
|Skeeve Client |         | Bounce Server  | |        | Skeeve Server  |
+--------------+ <------ +----------------+ | <----- +----------------+
Internal network                DMZ         |         External network
                                           NACS
1) Client sends:
     IP_SRC  - IP of Skeeve Server (spoofed address)
     IP_DEST - IP of Bounce Server
     ICMP->ECHO_REQUEST
2) Bounce Server catches the ICMP ECHO_REQUEST message and answers with:
     IP_SRC  - IP of Bounce Server
     IP_DEST - IP of Skeeve Server
     ICMP->ECHO_REPLY
\\------------------------------------------------------------------------//
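The two steps of the diagram can be modelled in user space as follows. This is
a sketch and not Skeeve's kernel code: it only builds the ICMP part of the
packets (with the standard Internet checksum) and represents the IP addresses
as plain strings; the function names are assumptions.

```python
import struct

ICMP_ECHO_REQUEST = 8
ICMP_ECHO_REPLY = 0


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by ICMP (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_icmp_echo(icmp_type: int, ident: int, seq: int, payload: bytes) -> bytes:
    """Build an ICMP echo request or reply carrying payload as its data."""
    header = struct.pack("!BBHHH", icmp_type, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", icmp_type, 0, checksum, ident, seq)
    return header + payload


def bounce(src_ip: str, dst_ip: str, icmp_packet: bytes):
    """What the Bounce Server does with an echo request: swap addresses, reply."""
    icmp_type, _code, _csum, ident, seq = struct.unpack("!BBHHH", icmp_packet[:8])
    if icmp_type != ICMP_ECHO_REQUEST:
        return None
    reply = build_icmp_echo(ICMP_ECHO_REPLY, ident, seq, icmp_packet[8:])
    # The reply goes to the claimed (spoofed) source, i.e. the Skeeve Server.
    return dst_ip, src_ip, reply
```

Because the request claims the Skeeve Server as its source, the echo reply,
which carries the same data, is delivered to the Skeeve Server rather than
back to the client.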
7.3 Usage
---------
Skeeve is easy to use. Suppose, for example, that we want to get access to an
external WWW server.
First of all, we need to define some parameters in the skeeve.c file:
//------------------------------------------------------------------------\\
...
#define PORT 80
#define CLIENT_IP "192.168.1.55"
#define BOUNCE_IP "192.168.1.1"
#define TARGET_IP "192.168.1.251"
...
PORT      - the port we will listen on
CLIENT_IP - our own IP
BOUNCE_IP - IP of the Bounce Server
TARGET_IP - IP of our server
\\------------------------------------------------------------------------//
When the parameters are set, we compile Skeeve as an LKM:
//------------------------------------------------------------------------\\
gcc -c skeeve.c -I /usr/src/linux/include
\\------------------------------------------------------------------------//
Then we just load the module:
//------------------------------------------------------------------------\\
On the Client side: insmod skeeve.o type=client dev={eth0 | ...}
On the Server side: insmod skeeve.o type=server dev={eth0 | ...}
\\------------------------------------------------------------------------//
Look at the kernel messages in /var/log/messages and that's all. Then just
connect to the server on port 80 and do anything you want :)
================================================================================
==================================================
8. PRACTICAL IMPLEMENTATIONS: MSNSHELL [msnshell]
==================================================
8.1 Description
---------------
MsnShell is a covert channel tunneling tool. With it, you can remotely
control a Linux computer behind a firewall. It consists of a single executable
file, the MsnShell server daemon, which encapsulates shell commands in the MSN
protocol. MsnShell can not only work through a firewall, but also pierce an
HTTP proxy.
Computers are often located behind firewalls which deny many malicious
connections. These computers are therefore expected to be relatively safe from
the external network. But MSN Messenger connections from the internal network
are usually allowed, and are made through a gateway or an HTTP proxy which
allows internal computers to access the Internet via HTTP.
The MsnShell key features are:
1. Provide SSH/FTP from any box located in the internal network to external
boxes;
2. Encapsulate SSH/FTP commands or results in the MSN protocol;
3. Can also work with an HTTP proxy;
4. Multiple accesses at the same time.
================================================================================
===================================================================
9. PRACTICAL IMPLEMENTATIONS: SOCKSTER & TRAPSTER [sock|trap/ster]
===================================================================
9.1 Description
---------------
Sockster and Trapster are two components of a tunneling framework which can be
used to bypass NACS or for building a pentester environment. In the current
implementation nearly all tcp based protocols like smtp, pop, vnc, rdp and ssh
can be tunneled by the system.
The system uses tunneling plugins for different connections so the
'administrator' can choose the best channel available for stealthiness or
throughput for each tunnel endpoint. Currently it is possible to use http, ftp
and dns (udp) as tunnel protocols. Each tunneling plugin presents the same
functionality, so the end user will not notice if the connection goes over a
dns tunnel one time and over an ftp tunnel the next. The tunneling plugins take
care of encryption and stealthiness, so Trapster and Sockster do not need to
implement such functions themselves. This is very useful for pentesting
because the engineer can create their own plugins with or without encryption.
It is also a big advantage when building detection engines, because it makes it
possible to emulate other tunneling tools or normal unencrypted traffic too.
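The idea that every tunneling plugin "presents the same functionality" can be
sketched as a common interface. The real plugin API of Sockster and Trapster
is not described in this paper, so everything below is hypothetical: the class
and method names are invented, the transport is an in-memory queue instead of
http, ftp or dns, and the XOR step is a toy obfuscation, not encryption.

```python
from abc import ABC, abstractmethod
from collections import deque


class TunnelPlugin(ABC):
    """Hypothetical common interface every tunneling plugin would expose."""

    @abstractmethod
    def send(self, data: bytes) -> None:
        """Push data into the tunnel."""

    @abstractmethod
    def receive(self) -> bytes:
        """Pull the next chunk of data out of the tunnel."""


class LoopbackPlugin(TunnelPlugin):
    """Stand-in transport: frames are queued in memory, nothing is sent."""

    def __init__(self) -> None:
        self._frames = deque()

    def encode(self, data: bytes) -> bytes:
        return data

    def decode(self, frame: bytes) -> bytes:
        return frame

    def send(self, data: bytes) -> None:
        self._frames.append(self.encode(data))

    def receive(self) -> bytes:
        return self.decode(self._frames.popleft())


class XorLoopbackPlugin(LoopbackPlugin):
    """Same transport with a toy XOR obfuscation (NOT real encryption)."""

    def __init__(self, key: int = 0x5A) -> None:
        super().__init__()
        self._key = key

    def encode(self, data: bytes) -> bytes:
        return bytes(b ^ self._key for b in data)

    def decode(self, frame: bytes) -> bytes:
        return bytes(b ^ self._key for b in frame)


def forward(plugin: TunnelPlugin, data: bytes) -> bytes:
    """The core only sees send/receive, whatever plugin is plugged in."""
    plugin.send(data)
    return plugin.receive()
```

Swapping LoopbackPlugin for XorLoopbackPlugin changes what travels through the
tunnel but not what forward() returns, which is the property the end user
relies on.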
The most important advantage over other tools is that it is not necessary to
create a static configuration for each connection to different endpoints
through the tunnel, because the system can act as a transparent proxy even for
protocols which are not designed for proxying. Other tools need a static
mapping of client ip:port to destination:port to build a tunnel. Mostly they
are only used by one internal endpoint (the pc the internal tunneling endpoint
runs on). With the framework presented here the internal tunneling endpoint
can be used by all clients in the local subnet. The clients do not have to be
configured to use this 'proxy', which in most cases is not possible anyway
(e.g. for pop3). In big environments it is not desirable either, because of the
manpower needed for updates.
To fulfil this functionality the tunneling framework emulates a tcp stack,
grabs all configured network traffic and sends it over the tunnel to the
desired endpoint. The reverse traffic is then mapped correctly so that the
client application does not notice anything. This functionality is implemented
as a plugin too (called 'IO plugin'), so the 'administrator' can build his own
plugins for the framework. A Socks proxy plugin and an http proxy plugin are
currently under development. These plugins will emulate a Socks and an http
proxy, so every http-proxy-ready application can use the tunnel directly
without the need for the generic IO plugin, which currently cannot be used for
localhost. This kind of plugin is the entry point for one side of the tunnel,
mainly the side which is not reachable from another network.
To get the current state of Sockster and Trapster it is possible to dump the
internal session structures as html via two cgi applications. With these CGIs
the 'administrator' can look at the sessions currently active or finished and
get some additional information about the services themselves.
9.2 Sockster
------------
In collaboration with Trapster, Sockster builds a protocol independent proxy.
Sockster is one part of the 'tunnel' and is mostly located on the 'free'
internet (network 'I'), but it can also run in a secured network (network 'A')
where direct connections from the network where Trapster runs (network 'B') are
denied for the desired protocols. One Sockster can be the 'free' tunneling
endpoint for more than one Trapster, so many Trapsters can use the 'service'.
The main application Sockster was designed for is connections from network 'B'
into network 'A' or 'I'.
This can be used for 'bad' things like reading and sending private mails or
using instant messaging from an internal network.
The second application for Sockster is to receive session requests for
connections from systems in network 'A' or 'I' to systems in network 'B'.
Sockster will send these requests to the correct Trapster and receive a
request from Trapster for a listening socket for the client application after
a few seconds if all goes well. The port number of the listener is sent to the
requester application (TunOpenConn.pl for example). The user can then connect
to the listening port with his client application. The Sockster will then
forward all data from the client socket to the associated Trapster. In this
way the client can access all ips and services on network 'B' from network 'A'
or 'I'.
This can be used for bypassing NACS and firewalls which normally prevent
connections from an external ip. It is also possible to run a port scan on an
internal ip with this functionality.
The tunneling plugins for Sockster are normally daemons which communicate
with Sockster via shared memory, pipes or sockets. This makes it possible to
use many different plugins at the same time, and thus to connect to the same
Sockster with different protocols at the same time.
Currently there is no IO plugin structure for Sockster, but there is a cgi
which can be used for requesting connections from 'A' to 'B'. This CGI sends
the request to Sockster and gets the answers back from Sockster via IPC. This
is a very useful and fast way to request and establish a new connection to a
system inside a firewalled network.
9.3 Trapster
------------
Trapster is the other part necessary for the tunneling framework. Trapster
builds the tunneling endpoint in network 'B' and manages all connections in the
local network. To get the data from the local clients it uses IO plugins like
the libpcap/libnet plugin. The IO and tunneling plugins for Trapster are mostly
perl modules. They communicate with Trapster via direct requests without the
need for IPC. Currently it is only possible to use one IO and one tunneling
plugin at the same time.
In relation to Sockster, Trapster has two main functions: creating sessions
and multiplexing data from local clients into the tunnel, and establishing
sessions and proxying data from remote clients via Sockster. For the second
function, Trapster receives requests from Sockster for new connections from
network 'A' or 'I' to 'B'. Trapster will then establish the connection to the
desired endpoint (internal ip:port) and create a session for the new tunneled
connection.
9.4 "Framework"
---------------
Neither central service is designed to be stealthy or covert. This is the
task of the plugins. Sockster and Trapster queue the data so that the plugins
can use some 'magic' sending algorithm to be stealthy or noisy. For now, the
'administrator' has to choose the right plugin for the environment. But it is
also possible to write new plugins which learn from the local environment and
choose the right algorithm themselves.
In summary, this framework is aimed at people who want to tailor a very
personal tunnel rather than at people who want production-ready software with
a graphical setup. In addition to the framework there are some tools for
testing purposes, like a tcp stress tester with which the 'administrator' can
test a network link. This graphical application lets the 'administrator'
choose what bandwidth to use. With this tool it is very easy to check how well
a tunneling solution scales with more traffic and when the sending algorithm
is no longer stealthy because of high load.
================================================================================
===========
WEBOGRAPHIE
===========
[1]:
http://gray-world.net/projects/papers/covert_paper.txt
[2]:
http://gray-world.net/projects/papers/rtt.txt
[3]:
http://tunnel.gotdns.org (Tdetect)
[apf]:
http://gray-world.net/pr_af.shtml
[skeeve]:
http://gray-world.net/poc_skeeve.shtml
[msnshell]:
http://gray-world.net/pr_msnshell.shtml
[sock|trap/ster]:
http://tunnel.gotdns.org
---
[80_insane]:
-
http://archives.neohapsis.com/ar ... e/2005-q3/0050.html
[AckCmd]:
-
http://gray-world.net/tools/ackcmd.zip
[BH_BkDoor_2005]: Building Robust Backdoors in Secret Symmetric Ciphers (2005)
- A.L. Young
-
http://www.blackhat.com/presenta ... 05-young-update.pdf
[CC_Here2Stay_1994]: Covert Channels - Here to Stay? (1994)
- Ira S. Moskowitz, Myong H. Kang
-
http://gray-world.net/papers/moskowitz94covert.pdf
[CC_TCPIP_Hdr_2002]: Covert Channels in TCP/IP Headers (2002)
- Drew Hintz
-
http://guh.nu/projects/cc/covertchan_files/frame.htm
[DNSCC_UnpubPhrack]: DNS Covert Channels and Bouncing Techniques (2005?)
- Anonymous
-
http://gray-world.net/board/index?PID=2192
[DOD_1985]: Department of Defense Trusted Computer System Evaluation Criteria
5200.28-STD (1985)
- DoD standard
-
http://gray-world.net/papers/5200.28-STD.html
[Embed_CC_TCPIP_2005]: Embedding Covert Channels into TCP/IP (2005)
- S.J. Murdoch, S. Lewis
-
http://gray-world.net/papers/ih05coverttcp.pdf
[ErrnoJones]: Legitimate Sites as Covert Channels - An Extension to the
Concept of Reverse HTTP Tunnels (?)
- Errno Jones
-
http://www.gray-world.net/papers/lsacc.txt
[IcmpKeylog]: Remote Windows Kernel Exploitation - Step Into the Ring 0 (2005)
- Barnaby Jack
-
http://www.eeye.com/~data/publis ... OT20050205.FILE.pdf
[Integ_NSM]: Integrating the Network Security Monitoring Model (2004)
- Richard Bejtlich
-
http://www.taosecurity.com/sysadmin_apr_04.pdf
[InternetSteg_ActWard_2002]: Eliminating Steganography in Internet Traffic with
Active Wardens (2002)
- G. Fisk, M. Fisk, C. Papadopoulos, J. Neil
-
http://gray-world.net/papers/ih02.pdf
[Lampson_1973]: A Note on the Confinement Problem (1973)
- Butler W. Lampson
-
http://gray-world.net/papers/lampson73note.pdf
[NCSC_1987]: Extension to 5200.28-STD to trusted network systems and components.
(1987)
- National Computer Security Center
-
http://gray-world.net/papers/NCSC-TG-005.html.gz
[NCSC_1993]: A Guide to Understanding Covert Channel Analysis of Trusted
Systems (1993)
- National Computer Security Center
-
http://gray-world.net/papers/aguidetocc.txt
[NetSec_OpenSrc]: Network Security- An Open-Source Approach (2005)
- Blain R. Jones
-
http://www.infosecwriters.com/texts.php?op=display&id=321
[PassiveCClinux]: The Implementation of Passive Covert Channels in the Linux
Kernel (2004)
- Joanna Rutkowska
-
http://gray-world.net/papers/passive-covert-channels-linux.pdf
[PractDH_2002]: Practical Data Hiding in TCP/IP (2002)
- K. Ahsan, D. Kundur
-
http://gray-world.net/papers/acm02.pdf
[Rootkits_discussion]:
-
http://archives.neohapsis.com/ar ... e/2005-q2/0291.html
[Rowland_1996]: Covert Channels in the TCP/IP Protocol Suite (1996)
- Craig H. Rowland
-
http://gray-world.net/papers/ccintcpip.txt
[RSA_CC]: What are covert channels?
-
http://www.rsasecurity.com/rsalabs/node.asp?id=2351
[Unwitting_2003]: New covert channels in HTTP: adding unwitting Web browsers to
anonymity sets (2003)
- M. Bauer
-
http://google.that.paper
[WIKIPEDIA_CC]: Covert channel Wikipedia definition
-
http://en.wikipedia.org/wiki/Covert_channel
[WLAN_STEALTH_2005]: WLAN and Stealth Issues (2005)
- L. Oudot
-
http://www.blackhat.com/presenta ... /BH_EU_05-Oudot.pdf