<p>The U.S. Census Bureau has <a href="http://www.census.gov/topics/population/genealogy/data/1990_census/1990_census_namefiles.html" rel="noreferrer">three lists</a> generated from the 1990 census:</p>
<ul>
<li><a href="http://www2.census.gov/topics/genealogy/1990surnames/dist.all.last" rel="noreferrer">dist.all.last</a> [2MB; 88799 entries]</li>
<li><a href="http://www2.census.gov/topics/genealogy/1990surnames/dist.female.first" rel="noreferrer">dist.female.first</a> [146k; 4275 entries]</li>
<li><a href="http://www2.census.gov/topics/genealogy/1990surnames/dist.male.first" rel="noreferrer">dist.male.first</a> [41k; 1219 entries]</li>
</ul>
<p>(These have the same counts as those in another answer that links to deron.meranda.us.)</p>
<p>Quoting the link above:</p>
<blockquote>
<p>Each of the three files, (dist.all.last), (dist. male.first), and (dist female.first) contain four items of data. The four items are:</p>
<ol>
<li>A "Name"</li>
<li>Frequency in percent</li>
<li>Cumulative Frequency in percent</li>
<li>Rank</li>
</ol>
<p>In the file (dist.all.last) one entry appears as:</p>
<pre><code> MOORE 0.312 5.312 9 </code></pre>
<p>In our search area sample, MOORE ranks 9th in terms of frequency. 5.312 percent of the sample population is covered by MOORE and the 8 names occurring more frequently than MOORE.
The surname, MOORE, is possessed by 0.312 percent of our population sample.</p>
</blockquote>
<p>Googling around, it seems this data has been further refined into a single list of 5163 entries (<a href="http://answers.google.com/answers/threadview/id/107201.html" rel="noreferrer">link 1</a>, <a href="http://ant-village.googlecode.com/hg-history/f506cf9ea6969db1de76241ed96449d96c6427c8/src/main/java/org/antvillage/simulator/names.txt" rel="noreferrer">link 2</a>), in the <a href="http://answers.google.com/answers/threadview/id/107201.html" rel="noreferrer">format</a>:</p>
<blockquote>
<pre><code> &lt;namestyle&gt; &lt;first/last indicator&gt; &lt;name&gt; </code></pre>
</blockquote>
<p>Namestyle code:</p>
<ul>
<li>MF: used as male or female</li>
<li>MO: used as male only</li>
<li>FO: used as female only</li>
</ul>
<p>First/Last indicator:</p>
<ul>
<li>LY: used as a last name</li>
<li>LN: not used as a last name</li>
</ul>
<p>E.g.:</p>
<blockquote>
<pre><code>MF LY AARON
FO LY ABBEY
FO LN ABBIE
FO LY ABBY
</code></pre>
</blockquote>
<p><strong>UPDATE 1</strong>: Slightly off topic from the original post, but it may be of use to others finding this. If you are looking for something more involved (not just person names, but the gender of many nouns and phrases), you can look at the corpus created by Shane Bergsma and Dekang Lin.
<a href="http://conll.cemantix.org/2012/download/gender.data.gz" rel="noreferrer">The data is available as a single gzip file</a> from <a href="http://conll.cemantix.org/2012/introduction.html" rel="noreferrer">the CoNLL shared task</a>.</p>
<p><strong>UPDATE 2</strong>: www.census.gov restructured their website, so I updated the links to reflect the files' new locations.</p>
<p><strong>UPDATE 3</strong>: www.census.gov also has a <a href="http://www.census.gov/topics/population/genealogy/data/2000_surnames.html" rel="noreferrer">survey from 2000</a> for surnames occurring 100 or more times, containing a total of 151,671 names (<a href="http://www2.census.gov/topics/genealogy/2000surnames/names.zip" rel="noreferrer">direct link to zip</a>).</p>
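<p>The two line formats described in this answer could be parsed along the following lines. This is only a minimal sketch: it assumes every line is whitespace-separated and that names contain no internal spaces, and the names <code>CensusEntry</code>, <code>StyledName</code>, <code>parse_census_line</code> and <code>parse_styled_line</code> are made up for illustration and do not come from the Census Bureau or from the linked lists.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CensusEntry:
    """One row of a dist.* file (hypothetical container)."""
    name: str
    frequency: float   # percent of the sample with this name
    cumulative: float  # percent covered by this name and all more frequent ones
    rank: int


@dataclass(frozen=True)
class StyledName:
    """One row of the refined list (hypothetical container)."""
    name: str
    male: bool
    female: bool
    last_name: bool


# Namestyle code -> (used as male, used as female)
NAMESTYLES = {"MF": (True, True), "MO": (True, False), "FO": (False, True)}
# First/last indicator -> used as a last name
LAST_FLAGS = {"LY": True, "LN": False}


def parse_census_line(line: str) -> CensusEntry:
    # Assumption: exactly four whitespace-separated columns per line.
    name, freq, cum, rank = line.split()
    return CensusEntry(name, float(freq), float(cum), int(rank))


def parse_styled_line(line: str) -> StyledName:
    # Assumption: "<namestyle> <first/last indicator> <name>" per line.
    style, last_flag, name = line.split(maxsplit=2)
    male, female = NAMESTYLES[style]
    return StyledName(name.strip(), male, female, LAST_FLAGS[last_flag])


if __name__ == "__main__":
    print(parse_census_line("MOORE 0.312 5.312 9"))
    for raw in ("MF LY AARON", "FO LY ABBEY", "FO LN ABBIE", "FO LY ABBY"):
        print(parse_styled_line(raw))
```

<p>An unknown namestyle or indicator code raises a <code>KeyError</code> in this sketch, which makes malformed lines easy to spot; a more forgiving loader could skip such lines instead.</p>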