[john-users] Is john really using Narayanan and Shmatikov whitepaper?

Discussion:

Matlink

2018-03-16 13:23:04 UTC

Hello John users,

I was wondering, for research purposes, how John was applying Markov
chains for its markov mode.

Before reading the source code, I would have some enlightenment about it.

After having read this whitepaper
http://www.cs.utexas.edu/~shmat/shmat_ccs05pwd.pdf, I found some strange
things about their index function. Accordingly to their 'get_key*'
functions, they are iterating over the alphabet in increasing order. I
guess then it means they are doing bruteforce excepted that they apply a
threshold on password candidate probabilities. Intuitively, Markov
chains-based enumerators would iterate over order of decreasing
probabilities of ngrams.

However, while running John I found that it's not iterating the alphabet
in any kind of order. So it doesn't look like a smart bruteforce. I
guess John authors have made many improvements since.

Thanks for any kind of suggestions,

--
Matlink - Sysadmin matlink.fr
Sortez couverts, chiffrez vos mails : https://café-vie-privée.fr/
XMPP/Jabber : ***@matlink.fr
Clé publique PGP : 0x186BB3CA
Empreinte Off-the-record : 572174BF 6983EA74 91417CA7 705ED899 DE9D05B2

Solar Designer

2018-03-16 14:35:12 UTC

Permalink

Hi,

Post by Matlink
I was wondering, for research purposes, how John was applying Markov
chains for its markov mode.

I think the closest to documentation of Markov mode's internals is this
presentation, by Simon who contributed this mode to jumbo:

http://www.openwall.com/presentations/Passwords12-Probabilistic-Models/

I also want to say once again that if you do any research involving
JtR jumbo's Markov mode, please also include in your research JtR's
incremental mode, which pre-dates Markov mode and isn't named Markov but
is also closely related.

Unfortunately, the Markov vs. incremental mode comparison in the above
presentation uses JtR 1.7's incremental mode, whereas 1.8's should
perform better.

Alexander

Matt Weir

2018-03-16 17:32:29 UTC

Permalink

As a note of caution, I'm 99% sure that Simon's presentation at
Passwords12 was about advancements to JtR's Markov model that could be
made vs. what's actually implemented, though the sections about
nbparts, (slides 7-14), should apply.

A good reference that I had mentioned to Matlink privately, but I
figure I should post here for everyone else, is the JtR Wiki write-up
on Markov mode: http://openwall.info/wiki/john/markov

Side note, I had completely forgot about Simon's work to provide an
accurate index function for JtR Markov that he detailed in his
presentation which would be useful to the previous thread Matlink had
brought up :)

Alexander, talking about incremental, are you aware of any write-up
for it behind it besides "look at the code?" Something like the JtR
Wiki article for Incremental might help other researchers. In practice
I've found Incremental is much more effective then Markov so I tend to
steer people to it as well, but there's not much out there to
understand what's going on under the hood. I know I've struggled with
that, (and made mistakes describing it).

Matt

Post by Solar Designer
Hi,

Post by Matlink
I was wondering, for research purposes, how John was applying Markov
chains for its markov mode.

I think the closest to documentation of Markov mode's internals is this
http://www.openwall.com/presentations/Passwords12-Probabilistic-Models/
I also want to say once again that if you do any research involving
JtR jumbo's Markov mode, please also include in your research JtR's
incremental mode, which pre-dates Markov mode and isn't named Markov but
is also closely related.
Unfortunately, the Markov vs. incremental mode comparison in the above
presentation uses JtR 1.7's incremental mode, whereas 1.8's should
perform better.
Alexander

Matlink

2018-03-16 17:47:20 UTC

Permalink

Post by Matt Weir
As a note of caution, I'm 99% sure that Simon's presentation at
Passwords12 was about advancements to JtR's Markov model that could be
made vs. what's actually implemented, though the sections about
nbparts, (slides 7-14), should apply.
A good reference that I had mentioned to Matlink privately, but I
figure I should post here for everyone else, is the JtR Wiki write-up
on Markov mode: http://openwall.info/wiki/john/markov
Side note, I had completely forgot about Simon's work to provide an
accurate index function for JtR Markov that he detailed in his
presentation which would be useful to the previous thread Matlink had
brought up :)
Alexander, talking about incremental, are you aware of any write-up
for it behind it besides "look at the code?" Something like the JtR
Wiki article for Incremental might help other researchers. In practice
I've found Incremental is much more effective then Markov so I tend to
steer people to it as well, but there's not much out there to
understand what's going on under the hood. I know I've struggled with
that, (and made mistakes describing it).
Matt

Post by Solar Designer
Hi,

Post by Matlink
I was wondering, for research purposes, how John was applying Markov
chains for its markov mode.

I think the closest to documentation of Markov mode's internals is

this
http://www.openwall.com/presentations/Passwords12-Probabilistic-Models/

Post by Solar Designer
I also want to say once again that if you do any research involving
JtR jumbo's Markov mode, please also include in your research JtR's
incremental mode, which pre-dates Markov mode and isn't named Markov

but

Post by Solar Designer
is also closely related.
Unfortunately, the Markov vs. incremental mode comparison in the

above

Post by Solar Designer
presentation uses JtR 1.7's incremental mode, whereas 1.8's should
perform better.
Alexander

Well I found the open wall wiki about Markov mode not detailed enough to be reproducible. Only the general concept is presented, alongside with how to use tools.
I would also be interested on some documentation about the incremental mode, to re-implement it.

--
Matlink

Solar Designer

2018-03-16 19:07:05 UTC

Permalink

Post by Matt Weir
Alexander, talking about incremental, are you aware of any write-up
for it behind it besides "look at the code?" Something like the JtR
Wiki article for Incremental might help other researchers.

I'm not aware of any documentation of incremental mode's internals.
Since I still have an unimplemented idea for improving it, maybe I
should revisit this if I have time in a few years from now (I doubt I'd
have time for this sooner than that), and give a talk on it at that
point (as I'd need to recall a few things anyway) - although it'd be a
bit weird to do that on something that is already 20 years old, except
for those improvements.

Alexander

Matlink

2018-03-20 16:08:22 UTC

Permalink

Post by Solar Designer
I'm not aware of any documentation of incremental mode's internals.
Since I still have an unimplemented idea for improving it, maybe I
should revisit this if I have time in a few years from now (I doubt I'd
have time for this sooner than that), and give a talk on it at that
point (as I'd need to recall a few things anyway) - although it'd be a
bit weird to do that on something that is already 20 years old, except
for those improvements.

The thing is that if researchers have no explanation on how this mode
works, they are barely constrained not to use it because they won't be
able to interpret or understand their results with it.

Moreover, an implementation is always a subjective manner to resolve a
problem. Therefore, shortcuts are often applied due to the used
language, where they cannot be used in another language. If we have no
generic (language independent) algorithm (i.e. pseudocode) it might be
complicated to understand what the coder wanted to do.

Having such an explanation/documentation can benefit since many
contributors could question the way it has been implemented.

Currently I don't see any difference between incremental mode and a raw
order 2 Markov-chains-based model (or maybe letters positions are also
considered).

Mat