Discussion:
[john-users] Data Sets
Emma Gadzama
2018-11-22 10:41:21 UTC
I am conducting research on improved password strength metrics. Could you
please point me to a free password dataset to support my research on
passwords?

Thank you

Emmanuel H. Gadzama
Solar Designer
2018-11-22 11:14:28 UTC
Hi,
Post by Emma Gadzama
I am conducting research on improved password strength metrics. Could you
please point me to a free password dataset to support my research on
passwords?
This isn't exactly a john-users topic. Please consider joining the
passwords mailing list and bringing this up in there:

https://www.openwall.com/lists/passwords/

Also, you don't appear to be subscribed to john-users - or at least not
under this address (I used my list admin powers to find out). Please
consider subscribing so that you don't miss replies and are able to
participate in discussions (without creating new threads for each reply,
which would make at least me angry).

As to your actual question, one of the commonly used password lists is
RockYou, downloadable from here:

https://wiki.skullsecurity.org/Passwords

There's also the newer and larger Pwned Passwords list, but it's in the
form of fast hashes that you need to (re-)crack, not plaintexts:

https://haveibeenpwned.com/Passwords

Of course, people already re-cracked nearly all of those hashes, but I'm
unaware of the results of such work being freely redistributed. You can
probably re-crack them quickly by using plaintext lists from:

https://hashes.org
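
A minimal sketch of that re-cracking step, assuming the downloaded file holds one uppercase SHA-1 hex digest per line followed by `:count` (the layout of the SHA-1 Pwned Passwords download) and that the candidate plaintexts come from a wordlist such as RockYou. The helper names `sha1_hex`, `parse_hash_lines`, and `recrack` are hypothetical, and the sample data is generated in place rather than taken from any real list. For the full data set a dedicated cracker is the practical choice; this only illustrates the matching logic.

```python
import hashlib


def sha1_hex(password):
    """Return the uppercase SHA-1 hex digest of a UTF-8 encoded password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def parse_hash_lines(lines):
    """Parse 'SHA1HEX:count' lines into a dict mapping digest -> count.

    Blank lines are skipped; a line without a count is stored with count 0.
    """
    counts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        digest, _, count = line.partition(":")
        counts[digest.upper()] = int(count) if count else 0
    return counts


def recrack(hash_counts, candidates):
    """Hash each candidate and keep those whose digest is in the hash list.

    Returns a dict mapping recovered plaintext -> occurrence count.
    """
    recovered = {}
    for candidate in candidates:
        digest = sha1_hex(candidate)
        if digest in hash_counts:
            recovered[candidate] = hash_counts[digest]
    return recovered


# Made-up stand-in for a hash list: two "leaked" passwords with counts.
sample_hash_lines = [
    sha1_hex("password") + ":3",
    sha1_hex("letmein") + ":2",
]

# Made-up stand-in for a plaintext wordlist.
sample_wordlist = ["123456", "password", "qwerty", "letmein"]

hash_counts = parse_hash_lines(sample_hash_lines)
for plaintext, count in recrack(hash_counts, sample_wordlist).items():
    print(f"{plaintext}\t{count}")
```

With real files, `parse_hash_lines` and `recrack` accept any iterable of lines, so open file objects can be passed directly, memory permitting.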

If/when you bring this up on the passwords list, I suggest you also
explain your research project - chances are people will tell you how
it's already been done and/or is inferior to what's already been done,
and then you could try to come up with something novel and/or improved
and re-focus your project accordingly. There's a lot of work in this
area, so innovating is non-trivial.

Alexander
Matt Weir
2018-11-22 15:19:19 UTC
To follow on from what Alexander said, one challenge is that many
researchers have restrictions on sharing password lists, even if the
lists are publicly available somewhere else, because of their sensitive
nature.

Some things that help if you can provide them in your request:
1) Naming the research institution (if applicable) that you are associated with
2) Stating that you are the lead professor with a link to your bio, or
naming the lead professor
3) Documentation that you have gone through an IRB (or an IRB-like
process if you are working outside academia)
4) As silly as it sounds, sending your request from a .edu e-mail address

Now admittedly, much of the above only applies if you are at a research
institution. If you are not, then the links that Alexander provided
above are a great starting point. Unfortunately (from a defender's
perspective), finding password lists online is generally pretty easy.

One benefit of having a discussion on the passwords list (which
Alexander mentioned) is that people can provide tips on the pluses and
minuses of each of the lists. Because of how most of them were
obtained, each list has its own peculiarities, and most of them are
fairly messy in their own unique way.

Good luck!
Matt Weir