Solution to this 'l33t' rules problem?

Discussion:

Minga Minga

2010-10-19 21:52:05 UTC

So here's something that I can't figure out. Take the word:

neglected

And place it into a wordlist.

and run a command such as:

# ./john -w:neglected.dic --rules:korelogicrulesl33t -stdout | grep -i ^n3gl3

You get words such as :

n3gl3ct3d N3gl3ct3d n3gl3c+3d N3gl3c+3d

But how would you go about cracking the passwords:

N3gl3cted n3gl3cted Negl3cted Negl3ct3d

Notice that _NOT_ all of the e's are turned into 3s. I've started to see a few
of these passwords that I've missed previously, and I totally should have been
able to crack them.

Any ideas? The problem obviously isn't with just 'e's but _all_
"l33t" translations.

What about 'mississippi'? The 'l33t' rules should be able to generate
passes like:
mis$iss1ppi (Notice how only one of the s's is changed, and only one of
the i's is changed as well).
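
A minimal sketch of the behaviour being asked for, written as a stand-alone C++ helper rather than a JtR rule. The function name partial_substitutions is hypothetical. It treats every occurrence of one letter as an independent keep-or-replace choice, so 'neglected' with e->3 yields 2^3 = 8 candidates, including the partially substituted ones listed above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Return every variant of `word` in which any subset of the occurrences of
// `from` is replaced by `to`. The unchanged word comes first.
// Hypothetical helper for illustration; not part of John the Ripper.
std::vector<std::string> partial_substitutions(const std::string& word,
                                               char from, char to)
{
    // Record where the target letter occurs.
    std::vector<std::size_t> positions;
    for (std::size_t i = 0; i < word.size(); ++i)
        if (word[i] == from)
            positions.push_back(i);

    // Each bit of `mask` decides whether one occurrence is replaced.
    // Assumes fewer than 64 occurrences of `from` in the word.
    std::vector<std::string> out;
    const unsigned long long total = 1ULL << positions.size();
    for (unsigned long long mask = 0; mask < total; ++mask) {
        std::string candidate = word;
        for (std::size_t bit = 0; bit < positions.size(); ++bit)
            if (mask & (1ULL << bit))
                candidate[positions[bit]] = to;
        out.push_back(candidate);
    }
    return out;
}
```

Feeding each result through the same function again with a second pair (for example s->$ and then i->1 on 'mississippi') covers mixed cases such as mis$iss1ppi, at the cost of 16 * 16 = 256 candidates for that one word.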

I got the idea for this from the list of NTLM hashes not cracked from the DEFCON
contest.

-Rick / Minga
KoreLogic

Rich Rumble

2010-10-19 22:51:48 UTC

N3gl3cted n3gl3cted Negl3cted Negl3ct3d
Notice that _NOT_ all of the e's are turned into 3s. I've started to see a few
of these passwords that Ive missed previously, and I totally should have been
able to crack them.

I suppose I need to test the l337 rules again. They would substitute all the e's
in my users' passwords, and I wanted them to try all the combinations, to which
Solar responded: http://www.openwall.com/lists/john-users/2010/07/31/1
RememberMe was a pass I have encountered our helpdesk setting the passes
to, and users were just replacing some of the e's with 3's and not all.
I've created some additional rules along the lines of Solar's response for even
more letters and combinations of letters. I'm sure there are better
ways to do it; I just look at the examples, get a cursory understanding of them,
and rinse/repeat.

/e op3
%2e op3 /e op[e3]
%3e op3 %2e op[e3] /e op[e3]
%4e op3 %3e op[e3] %2e op[e3] /e op[e3]

# Add 0-9 to those same rules
/e op3$[0-9]
%2e op3 /e op[e3]$[0-9]
%3e op3 %2e op[e3] /e op[e3]$[0-9]
%4e op3 %3e op[e3] %2e op[e3] /e op[e3]$[0-9]

/o op0
%2o op0 /o op[o0]
%3o op0 %2o op[o0] /o op[o0]
%4o op0 %3o op[o0] %2o op[o0] /o op[o0]

# Add 0-9 to those same rules
/o op0$[0-9]
%2o op0 /o op[o0]$[0-9]
%3o op0 %2o op[o0] /o op[o0]$[0-9]
%4o op0 %3o op[o0] %2o op[o0] /o op[o0]$[0-9]

I wonder if there is a better way to try 0-9 at the beginning and end of
these same words so they don't have to go through all the iterations again.
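
Since these rule lines follow a regular pattern, they can be generated instead of typed. The sketch below is an illustration only: make_rules and max_count are hypothetical names, and it assumes the usual reading of the rule commands, where /X and %NX reject words lacking the character and set p to the position found, and opX overstrikes that position. Because %N takes a single-character count, this sketch is only meant for max_count up to 9, and it does no escaping of characters that are special to the rule preprocessor.

```cpp
#include <string>
#include <vector>

// Build rule lines in the style shown above for one letter/replacement pair.
// Line n forces the n-th occurrence of `from` to become `to`, then uses a
// preprocessor list [from to] to try both choices for each earlier occurrence.
// Hypothetical generator; intended for max_count between 1 and 9.
std::vector<std::string> make_rules(char from, char to, int max_count)
{
    std::vector<std::string> rules;
    for (int n = 1; n <= max_count; ++n) {
        std::string line;
        // Highest occurrence: "/e" for the first, "%Ne" for the N-th.
        if (n == 1)
            line = std::string("/") + from;
        else
            line = "%" + std::to_string(n) + from;
        line += std::string(" op") + to;

        // Remaining occurrences, from n-1 down to 1, each kept or replaced.
        for (int k = n - 1; k >= 1; --k) {
            line += ' ';
            if (k == 1)
                line += std::string("/") + from;
            else
                line += "%" + std::to_string(k) + from;
            line += std::string(" op[") + from + to + "]";
        }
        rules.push_back(line);
    }
    return rules;
}
```

Printing the result of make_rules('e', '3', 4) one line at a time reproduces the e->3 block; appending "$[0-9]" to each line gives the digit-suffixed variant.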

Sorry, different topic... So to your question, I've seen the opposite using JtR's
default l337 rules, not your korelogic ones (I have not tested KL's ruleset), JtR's
being /ese3. I typically look at the log file to see what JtR is converting rules
"into" and I use those; typically it only removes spaces.
- Rule #1378: '-c T1 Q M T0 Q' accepted as 'T1QMT0Q'
-rich

Brad Tilley

2010-10-19 23:04:52 UTC

Post by Minga Minga
neglected
And place it into a wordlist.
# ./john -w:neglected.dic --rules:korelogicrulesl33t -stdout | grep -i ^n3gl3
n3gl3ct3d N3gl3ct3d n3gl3c+3d N3gl3c+3d
N3gl3cted n3gl3cted Negl3cted Negl3ct3d

Seems you would need a Cartesian product to cover all possibilities
(what about NegL3ctEd):

1 = nN
2 = eE3
3 = gG6
4 = lL17|
5 = eE3
6 = cC[
7 = tT+7
8 = eE3
9 = dD

Depending on your definition of leet, the sets may be bigger than what I
listed above, but you would want a CP of those sets to fully enumerate
the word "neglected", I think. I'm not sure JTR does this.
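
A minimal sketch of such a Cartesian product, assuming one string of allowed characters per position as in the table above. The function name cartesian_product is hypothetical, and the result is collected in memory for simplicity; a generator meant for large wordlists would print each candidate instead of storing it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Enumerate the Cartesian product of per-position character sets.
// sets[i] holds every character allowed at position i.
// Hypothetical helper for illustration.
std::vector<std::string> cartesian_product(const std::vector<std::string>& sets)
{
    std::vector<std::string> out;
    for (const std::string& s : sets)
        if (s.empty())
            return out;  // an empty set means no candidate exists

    // idx works like an odometer: idx[i] selects a character from sets[i].
    std::vector<std::size_t> idx(sets.size(), 0);
    while (true) {
        std::string candidate;
        for (std::size_t i = 0; i < sets.size(); ++i)
            candidate.push_back(sets[i][idx[i]]);
        out.push_back(candidate);

        // Advance the rightmost wheel, carrying to the left on overflow.
        std::size_t pos = sets.size();
        while (pos > 0) {
            --pos;
            if (++idx[pos] < sets[pos].size())
                break;
            idx[pos] = 0;
            if (pos == 0)
                return out;  // every wheel wrapped: enumeration is complete
        }
        if (sets.empty())
            return out;  // zero positions yield only the empty string
    }
}
```

With the nine sets listed above for "neglected" this produces 2 * 3 * 3 * 5 * 3 * 3 * 4 * 3 * 2 = 19440 candidates, mixed-case forms such as NegL3ctEd included.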

Brad

Charles Weir

2010-10-25 14:36:53 UTC

This is a really interesting problem since it's certainly a needed
attack type to target more advanced passwords. As Brad pointed out,
the posted suggestion by Solar and Rich to generate replacement rules
can get really nasty when multiple replace types are implemented at
the same time, such as 'p@s$word'. I spent a little bit of time
trying to write a script to auto-generate a JtR config file that would
contain those thousands, (if not more), of potential replacement rules
but I eventually ended up dropping that approach and instead decided
to make use of JtR's -stdin option.

On a side note, JtR's -stdin option is one of the main reasons I use
John. If you can think of a way to generate guesses, even if JtR's
built-in mangling rules don't support it, you can always write your
own program/script to generate the guesses and still use JtR on the
back-end to handle all of the hashing.

To that end I wrote the program 'noobify', which is available at the
following link:

http://sites.google.com/site/reusablesec/Home/password-cracking-tools/noobify

Yes the name is a play on l33tify ;)

I tried to make it as customizable as possible. Below is an
explanation of a couple of the design choices:

1) As the name implies, it will apply every possible replacement rule
to an input word and then output it for JtR to hash. For example, the word
'noobify' could generate n0obify, n00bify, n00b1fy, n0ob1fy, no0bify,
no0b1fy, noob1fy ...

2) It reads in the replacements to use from a tab separated text file,
(the default file is replacements.txt). This allows the user to
specify the specific replacements they want to use, vs hard-coding
them in. For example, you can also do things like uppercase vowels if
you want by replacing 'a' with 'A', etc. This also allows the user to
have multiple replacement profiles for use in different cracking
sessions.

3) It replaces substrings instead of characters. This allows for
multi-character replacements. The most common 'l33t' example would be
replacing 'f' with 'ph', but it actually opens up a bunch of other
options as well. For example you can replace '2009' with '2010', or
'monday' with 'tuesday'. The replacements do not have to be the same
size.

4) Currently it works a lot like middlechild,
http://sites.google.com/site/reusablesec/Home/password-cracking-tools/middle-child
in that it will take input from stdin, and then output it to stdout.
Probably the most common use of it would be:

cat <input_dictionary> | ./noobify | ./john -stdin -format=<hashtype> <target_hashes>

5) Currently it doesn't apply any other mangling rules. If you want to
apply additional mangling rules you do have a couple of different
options available to you. The first is to use noobify to generate a
custom input dictionary instead of piping the output directly into
John. Then you can use that input dictionary with all of your existing
JtR mangling rules. The second option is to chain the output into a
second mangling program like middlechild. For example, the below
command would also capitalize the first letter and append two digits
followed by one special character:

cat <input_dictionary> | ./noobify | ./middlechild -capFirst -append D2S1 | ./john -stdin -format=<hashtype> <target_hashes>

The advantage of this approach is that it works for much larger input
dictionaries, since you don't have to save the mangled results to
file. The downside though is that it applies EVERY mangling rule to
each word before moving on to the next one. When you use John's rules
you can optimize it a lot more to try high probability rules first,
which will usually result in you cracking passwords much sooner in
your cracking session.

6) I practice "agile development", which is a nicer way of saying that
I try to get a proof of concept working before I focus on making it
'good'. Currently there are a lot of inefficiencies in the code that
make it run fairly slowly compared to most other mangling rules, (the
biggest probably being in how I currently identify replacements in the
actual word). I should be able to fix some of them fairly soon, but
even in its current state, at least you can make guesses which were
difficult to generate beforehand. Also as stated above, you can always
save the results to file, making the running time of this tool only a
one time cost.

7) I just realized I forgot to test it, but theoretically the tool
supports deletions as well, (in the replacement file just have the
value you want to delete followed by a tab and then a return). This
way you can do things like remove spaces and punctuation from a
passphrase input dictionary. If it crashes horribly when you try to do
this I should have a fix soon ;)

8) Yes the code is C++ so you will have to compile it. I realize I
should have written it as a script to make it easier to use, but I
like writing in C++. The code is GPL'd so feel free to modify it.

9) The default replacement file I included is just a proof of concept.
That is, I got tired after entering the different replacements for
'january' (january->february, etc.), so I didn't do any of the other
months ;)

10) As an addendum to the previous point, be careful when you are
copying and pasting replacement rules, since many editors will paste
spaces instead of tabs.
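
To make points 1) to 3) and 7) concrete, here is a minimal sketch of substring replacement driven by "from<TAB>to" lines. It is not noobify's actual source; parse_replacement, expand and all_variants are hypothetical names. An empty "to" is treated as a deletion, and duplicate outputs are not filtered.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One replacement: an occurrence of `first` may become `second`.
typedef std::pair<std::string, std::string> Replacement;

// Parse one "from<TAB>to" line; returns false if there is no tab or no `from`.
bool parse_replacement(const std::string& line, Replacement& out)
{
    const std::size_t tab = line.find('\t');
    if (tab == std::string::npos || tab == 0)
        return false;
    out.first = line.substr(0, tab);
    out.second = line.substr(tab + 1);  // may be empty, meaning deletion
    return true;
}

// Recursive helper: `done` is the prefix built so far, `pos` the read position.
void expand(const std::string& word, std::size_t pos, const std::string& done,
            const std::vector<Replacement>& table, std::vector<std::string>& out)
{
    if (pos == word.size()) {
        out.push_back(done);
        return;
    }
    // Option 1: copy the current character unchanged.
    expand(word, pos + 1, done + word[pos], table, out);
    // Option 2: apply any replacement whose source text starts here.
    for (std::size_t i = 0; i < table.size(); ++i) {
        const std::string& from = table[i].first;
        if (from.empty())
            continue;  // an empty source would never advance
        if (word.compare(pos, from.size(), from) == 0)
            expand(word, pos + from.size(), done + table[i].second, table, out);
    }
}

// Every variant of `word` under the table, starting with the unchanged word.
std::vector<std::string> all_variants(const std::string& word,
                                      const std::vector<Replacement>& table)
{
    std::vector<std::string> out;
    expand(word, 0, "", table, out);
    return out;
}
```

With the table o->0 and i->1, the word 'noobify' gives eight results: the original plus the seven variants listed in point 1). Writing each result to stdout, one per line, would let the output be piped into ./john -stdin as in point 4).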

As always, if you have any questions/comments/suggestions, please let me know.

Matt Weir
http://reusablesec.blogspot.com

Brad Tilley

2010-10-25 21:14:55 UTC

That's interesting Matt. I hard-coded a Cartesian product solution using
nested for loops. It's fast, but it does not scale as well as your
mangleGuess function would. Here's an example with 2 char passwords...
nest one more level for each additional char:

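-------------
#include <iostream>
#include <string>

int main()
{
    const std::string one = "nN";    // candidates for the first character
    const std::string two = "oO0@";  // candidates for the second character

    std::string attempt;

    for (std::string::const_iterator it1 = one.begin(); it1 != one.end(); ++it1)
    {
        attempt.push_back(*it1);

        for (std::string::const_iterator it2 = two.begin(); it2 != two.end(); ++it2)
        {
            attempt.push_back(*it2);
            std::cout << attempt << '\n';  // one guess per line
            attempt.resize(1);             // drop the second character again
        }
        attempt.clear();
    }
    return 0;
}
-------------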

I'm guessing this would work with JTR's stdin option. I'll try it.

Brad

Brad Tilley

2010-10-25 23:30:11 UTC

Following Matt's example, I quickly wrote my own leet generator. I just
did all the possibilities rather than make it customizable or flexible.
Again, these sets depend on your idea of what leet is:

b = "bB86"
r = "rR2"
a = "aA4@"
d = "dD"

4 * 3 * 4 * 2 = 96

Here are my leetified strings (I'm pretty sure this is a complete set of
possibilities):

echo brad | ./possibilities
brad
braD
brAd
brAD
br4d
br4D
br@d
br@D
bRad
bRaD
bRAd
bRAD
bR4d
bR4D
bR@d
bR@D
b2ad
b2aD
b2Ad
b2AD
b24d
b24D
b2@d
b2@D
Brad
BraD
BrAd
BrAD
Br4d
Br4D
Br@d
Br@D
BRad
BRaD
BRAd
BRAD
BR4d
BR4D
BR@d
BR@D
B2ad
B2aD
B2Ad
B2AD
B24d
B24D
B2@d
B2@D
8rad
8raD
8rAd
8rAD
8r4d
8r4D
8r@d
8r@D
8Rad
8RaD
8RAd
8RAD
8R4d
8R4D
8R@d
8R@D
82ad
82aD
82Ad
82AD
824d
824D
82@d
82@D
6rad
6raD
6rAd
6rAD
6r4d
6r4D
6r@d
6r@D
6Rad
6RaD
6RAd
6RAD
6R4d
6R4D
6R@d
6R@D
62ad
62aD
62Ad
62AD
624d
624D
62@d
62@D

There are: 96 leet possibilities for the word: brad

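Keep in mind that the total number of possibilities is the product of the
sizes of the individual sets. If I add one character to a set, the total
grows by a multiplicative factor: for example, a third choice for 'd' would
raise the count for 'brad' from 96 to 144.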
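
The count can be checked without enumerating anything. A small sketch, with the hypothetical name count_candidates, that multiplies the set sizes:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Number of candidates a full Cartesian product would produce:
// the product of the sizes of the per-position sets.
// Hypothetical helper; the multiplication can overflow for very long words.
unsigned long long count_candidates(const std::vector<std::string>& sets)
{
    unsigned long long total = 1;
    for (std::size_t i = 0; i < sets.size(); ++i)
        total *= sets[i].size();
    return total;
}
```

Appending extra positions works the same way: two trailing digit positions, each with the set "0123456789", multiply any total by 100.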

Brad

Brad Tilley

2010-10-25 23:50:08 UTC

One last note on this... Getting back to the OP's original word, here
are my results. Being thorough may be a bit too much especially if you
have a lot of words to leetify:

There are: 19440 possibilities for the word: neglected

Brad

Charles Weir

2010-10-26 14:19:25 UTC

Post by Brad Tilley
Being thorough may be a bit too much especially if you
have a lot of words to leetify
There are: 19440 possibilities for the word: neglected

I fully agree with you Brad. This might be about time to spin off a
second thread, but now that we have some ways of generating full
replacement guesses, the next question of course is what replacements
are the best to use. This is especially true since there are other
mangling rules to consider as well. For example, if you wanted to add
two digits to the end of a guess in addition to doing full mangling,
in the case above the word 'neglected' would generate 1,944,000 unique
guesses. With a small to medium sized dictionary and a quick hash like
MD5, that's still doable, but we might want a smaller subset of
replacements to use in other cases.

I did some research a while ago trying to measure the frequency of
different replacements and identify new replacements using
edit-distance calculations, (if you're REALLY bored I have a short
write-up of what I did in chapter 3.3 of my dissertation). That
research desperately needs to be updated on some of the new datasets
I've collected. I also need to spend some time improving my analysis
tool so I can give it to other people to run on non-public datasets,
(and so it catches more mangling rules). Whether I actually get around
to doing that in the near future is iffy though, (especially after my
main computer suffered an unfortunate accident. Thank god for
backups).

In my limited testing, by far the most common replacements, (in
frequency order), were:
i->1
e->3
o->0
a->@
s->$
l->1
t->+

I need to go back and manually look for some of the less common replacements.
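
One way to keep the guess count down is to use only the first few entries of that list. The sketch below is an illustration under that assumption; LeetPair, kByFrequency and frequent_variants are hypothetical names, and the table simply copies the seven pairs above in the same order.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Replacement pairs in the frequency order given above (most common first).
struct LeetPair { char from; char to; };
static const LeetPair kByFrequency[] = {
    {'i', '1'}, {'e', '3'}, {'o', '0'}, {'a', '@'},
    {'s', '$'}, {'l', '1'}, {'t', '+'}
};
static const std::size_t kPairCount =
    sizeof(kByFrequency) / sizeof(kByFrequency[0]);

// Generate variants of `word` using only the `top_n` most frequent pairs.
// Every occurrence of a covered letter is independently kept or replaced.
// Hypothetical helper for illustration.
std::vector<std::string> frequent_variants(const std::string& word,
                                           std::size_t top_n)
{
    if (top_n > kPairCount)
        top_n = kPairCount;

    std::vector<std::string> out(1, std::string());  // start from an empty prefix
    for (std::size_t i = 0; i < word.size(); ++i) {
        // Look up a replacement for this letter among the allowed pairs.
        char alt = 0;
        for (std::size_t p = 0; p < top_n; ++p)
            if (kByFrequency[p].from == word[i])
                alt = kByFrequency[p].to;

        const std::size_t existing = out.size();
        for (std::size_t k = 0; k < existing; ++k) {
            if (alt != 0)
                out.push_back(out[k] + alt);  // branch: replaced copy
            out[k] += word[i];                // original copy keeps the letter
        }
    }
    return out;
}
```

For 'neglected', the top two pairs give 8 candidates and all seven give 32, compared with 19440 for the full per-character sets.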

The other use of replacements though would be to mangle a dataset of
previously cracked passwords when targeting new password hashes. This
is actually what I've been working on recently and prompted some of my
previous posts to this list such as:

http://article.gmane.org/gmane.comp.security.openwall.john.user/3157/

This was actually inspired by a great paper presented in CCS by
Yinqian Zhang, Fabian Monrose and Michael Reiter titled: "The Security
of Modern Password Expiration: An Algorithmic Framework and Empirical
Analysis", which is available for download here:

http://www.cs.unc.edu/~yinqian/password.html

That's also why I've been looking at string replacements such as
replacing 2009 with 2010. Where this is also useful though is for
updating wordlists of previously cracked passwords, such as the
RockYou list. It would be nice to change all of those '2009's into
'2010' and soon '2011'. It also lends itself to targeted
cracking sessions. If you know your target likes to use certain l33t
replacements, you really want to include those specific replacements
in future cracking sessions as well.
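
As a small illustration of the wordlist-updating idea, the sketch below replaces every occurrence of one substring with another. The name replace_all is hypothetical; applying it to each line of a cracked-password list, and keeping both the old and the new form, is one possible way to use it.

```cpp
#include <cstddef>
#include <string>

// Return a copy of `word` with every occurrence of `from` replaced by `to`.
// Hypothetical helper for illustration.
std::string replace_all(std::string word, const std::string& from,
                        const std::string& to)
{
    if (from.empty())
        return word;  // nothing sensible to search for
    std::size_t pos = 0;
    while ((pos = word.find(from, pos)) != std::string::npos) {
        word.replace(pos, from.size(), to);
        pos += to.size();  // continue after the inserted text
    }
    return word;
}
```

For example, replace_all("summer2009", "2009", "2010") yields "summer2010", and words without the substring come back unchanged.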

Matt