Please note that the following was written some time ago, when the described flaw was only freshly fixed. I found the document recently, and have cleaned up the text in minor places; other than that it is as written in early 2004.
MSN Messenger inconsistent UTF-8 handling
(Or, How I Learned To Start Worrying and Hate UTF-8)
UTF-8 is a godsend. It allows encoding of foreign (including non-Latin)
characters into a file that’s still compatible with US-ASCII for all
English-only use. Editors that don’t understand it just show garbage
where there’s a foreign character, and handle the “normal” A-Z as they
always have. You just have to remember when you’re using it and when
you’re not.
I’d like to point out that this write-up is about vulnerabilities
eliminated almost a year ago, and is thus entirely uncheckable now.
UTF-8 in Messenger
UTF-8 is used in most places in Messenger, which is great. What you
have to remember is what you consider UTF-8 text and what you don’t,
and to remember to treat UTF-8 text consistently. The problem with
Messenger’s UTF-8 support was in its handling of display names. MSN 4.x
(and probably 5.x) were never affected, since they didn’t seem to
support UTF-8 display names (only messages). MSN 6 was another issue.
MSN 6 supported UTF-8 display names. In a couple of functions, it even
checked that the display name was valid. One of those functions was the
callback for when a “RNG” is received. RNGs are sent by the server to
the client as an invitation to a chat. If MSN6 thought the display name
was empty, the “ring” was never answered, and so the client never
joined the chat. With the advent of UTF-8, this left open a small
loophole – the two-byte display name “\300\200” (the bytes 0xC0 0x80,
written as octal escapes) is an invalid, overlong UTF-8 encoding of the
NUL character, which Microsoft’s software – against the UTF-8
specification – decodes to NUL anyway. Invalid UTF-8 is supposed to be
rejected, and for good reason.
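
To make the difference concrete, here is a minimal Python sketch
comparing a strict decoder with a lenient one. Python's built-in codec
stands in for a spec-compliant decoder; lenient_decode is a
hypothetical helper written for this illustration and is only a guess
at how a permissive decoder might behave, not Messenger's actual code.

```python
from typing import Optional

OVERLONG_NUL = b"\xc0\x80"  # the same two bytes as the octal escapes \300\200


def strict_decode(raw: bytes) -> Optional[str]:
    """Decode with Python's spec-compliant codec; None means invalid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def lenient_decode(raw: bytes) -> str:
    """Hypothetical permissive decoder (one- and two-byte sequences only).

    It never checks that a two-byte sequence encodes a code point of
    at least U+0080, so overlong forms slip through.
    """
    out = []
    i = 0
    while i < len(raw):
        b = raw[i]
        if b < 0x80:
            # Plain ASCII byte.
            out.append(chr(b))
            i += 1
        elif 0xC0 <= b <= 0xDF and i + 1 < len(raw) and (raw[i + 1] & 0xC0) == 0x80:
            # Two-byte sequence: 5 payload bits from the lead, 6 from the trail.
            out.append(chr(((b & 0x1F) << 6) | (raw[i + 1] & 0x3F)))
            i += 2
        else:
            # Anything else is replaced rather than decoded.
            out.append("\ufffd")
            i += 1
    return "".join(out)
```

The strict decoder refuses the overlong sequence outright, while the
lenient one quietly turns it into a real NUL character.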
Server-side, when an empty display name was set, the server would not
accept it. However, the server wasn’t using a UTF-8-understanding
string comparison function, so when it saw “\300\200” it thought that
the name was non-empty, and allowed it. This has now been partially
fixed, resolving the problem this document describes.
When MSN6 sees “\300\200” as the display name, it gets decoded from
UTF-8 into standard ASCII and compared to “” (a blank string). It is
considered a “match”, despite the fact that they are technically
different. Thus, “\300\200” as a display name would stop MSN6 from
answering a RNG.
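
The mismatch between the two checks can be sketched in a few lines of
Python. Both functions are hypothetical models, not the real server or
client code. In particular, the idea that the client ends up with a
NUL-terminated string (so that a leading NUL reads as "empty") is my
assumption about why the comparison matched.

```python
OVERLONG_NUL = b"\xc0\x80"  # the overlong, invalid encoding of NUL


def server_accepts(raw_name: bytes) -> bool:
    """Model of a byte-level check: any name with at least one byte passes."""
    return len(raw_name) > 0


def client_sees_empty(raw_name: bytes) -> bool:
    """Model of the client: lenient decode, then a C-style string compare."""
    # Assumption: the lenient decoder maps the overlong form to a real NUL.
    decoded = raw_name.replace(OVERLONG_NUL, b"\x00")
    # Assumption: the result is treated as NUL-terminated, so everything
    # from the first NUL onwards is ignored.
    c_string = decoded.split(b"\x00", 1)[0]
    return c_string == b""


name = OVERLONG_NUL
print(server_accepts(name), client_sees_empty(name))
```

For the two-byte name the server model says "non-empty" and the client
model says "empty", which is exactly the disagreement described above.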
The mishandling of a small piece of UTF-8 text, limited to a single
line, seems unimportant. However, due to MSN6’s erroneous assertion
that the name is empty – something that is supposed to be impossible,
due to the assumption that the server would filter out empty names –
there is suddenly a serious problem.
The problem is that the RNG is never answered. Any method to force a
client to ignore a RNG is very dangerous, because it exposes a flaw in
another part of the MSN spec – the CKI challenges.
In order to join a chat, the client must supply two things – an email
address, and a “CKI” key, which is an automatically generated
“password” to allow entry into the switchboard session. The CKI is
(currently) just the current time in unix-epoch-style format, and what
appears to be a random unsigned 15-bit number. Because the time is easy
to guess, the only real “security” to stop one user joining a room as
another user is the random number.
15 bits cannot be brute-forced in the time it normally takes a client
to respond to a RNG. However, if the client can be stopped from
joining, the invite lasts a lot longer than the usual 2-3 seconds (I
tested it up to around five minutes). All it would take to join a chat
as another online user is to set the display name to “\300\200”, invite
the user into a switchboard, then connect yourself and authenticate as
that user, brute-forcing the random part of the CKI.
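
A little arithmetic shows why the length of the window matters. The
sketch below only counts guesses; it assumes the random part really is
a uniform 15-bit number, and the clock_slack_seconds parameter (how
many candidate timestamps have to be covered) is a hypothetical knob
added for illustration.

```python
RANDOM_BITS = 15
CANDIDATES = 2 ** RANDOM_BITS  # 32768 possible values for the random part


def guesses_per_second_needed(window_seconds: float,
                              clock_slack_seconds: int = 1) -> float:
    """Rate required to try every candidate before the invite expires."""
    return CANDIDATES * clock_slack_seconds / window_seconds


# Usual window of roughly 3 seconds versus the roughly 5 minutes I observed.
print(round(guesses_per_second_needed(3)))
print(round(guesses_per_second_needed(5 * 60)))
```

Exhausting the space in a 3-second window needs on the order of ten
thousand attempts per second, whereas a five-minute window brings that
down to roughly a hundred per second.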
Microsoft already solved enough of the problem so that this exact
attack is no longer possible – the server now rejects UTF-8 that
resolves to an empty string. However, more could still be done. For a
start, the CKIs – which are considered opaque strings by clients –
should be changed or extended so that it is not feasible to brute-force
them even given days of attempts.
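
As a rough illustration of what "extended" could mean, here is a sketch
that generates an opaque key from a cryptographically secure random
source. The function names and the 128-bit size are my own choices for
the example, not anything the MSN protocol specifies.

```python
import secrets


def new_session_key(num_bytes: int = 16) -> str:
    """Return an opaque hex key with num_bytes * 8 bits of randomness."""
    return secrets.token_hex(num_bytes)


def expected_guesses(bits: int) -> float:
    """Average number of guesses needed to hit a uniformly random key."""
    return 2 ** bits / 2


print(new_session_key())
print(expected_guesses(15))   # 15-bit random part
print(expected_guesses(128))  # 128-bit key
```

Since clients treat the CKI as an opaque string, a longer key like this
should not require any change in how they handle it.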
The idea of using invalid UTF-8 wasn’t mine; when looking at MSN security
I found some old Windows Messenger-era clients that could be “invisible” in a
groupchat by using invalid names. Although no longer functional, further
experimentation with the idea brought this problem to light. I also found that
at least some MSN versions would log out if they saw a user come online with
an invalid UTF-8 string as their display name.
The level of simplicity with which one could masquerade as another user
in this case brings to light trust issues with Microsoft’s
implementation of instant messaging. The length of time that this
exploit would have been available is also concerning – it could have
affected users from the release of MSN 6.0 to its resolution over six
months later. What else is possible, and how long
will people be unknowingly at risk before that is fixed? Can you trust a
homogeneous environment that you can’t see into?