most common name data and script
andrewflowers committed Nov 20, 2014
1 parent 8450a9b commit bef7964
Showing 11 changed files with 969 additions and 0 deletions.
23 changes: 23 additions & 0 deletions most-common-name/README.md
@@ -0,0 +1,23 @@
### Most Common Name

This repo contains the code and data behind the story:

[Dear Mona, What’s The Most Common Name In America?](http://fivethirtyeight.com/features/whats-the-most-common-name-in-america/)

The main script file is `most-common-name.R`.

There are four input files:

* `state-pop.csv` - Total population and Hispanic population by state.
* `surnames.csv` - Data on surnames from the U.S. Census Bureau, including a breakdown by race/ethnicity.
* `aging-curve.csv` - Data from the Social Security Administration on the chances that someone born in the decade shown was still alive in 2013: http://www.ssa.gov/oact/NOTES/as120/LifeTables_Tbl_7.html
* `adjustments.csv` - Adjustments for first-name/surname pairs, taken directly from Lee Hartman's article: http://mypage.siu.edu/lhartman/johnsmith.html

And five output files:

* `adjusted-name-combinations-list.csv` - Adjusted estimates for the most common full names.
* `adjusted-name-combinations-matrix.csv` - The same data as `adjusted-name-combinations-list.csv`, but in matrix form. These are the estimates presented in the second (and final) table of the article.
* `independent-name-combinations-by-pop.csv` - Matrix of estimates for the top 100 most common first names by the top 100 most common surnames. These were calculated using independent odds (first name and surname treated as independent) and are displayed in the first table of the article.
* `new-top-firstNames.csv` - Final estimated ranking of top first names.
* `new-top-surnames.csv` - Final estimated ranking of top surnames.
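
As a rough illustration of how an "independent odds" estimate and an adjusted estimate might relate, the sketch below multiplies a population by a first-name share and a surname share, then applies a percentage adjustment. It is written in Python and is not taken from `most-common-name.R`. The function names, the population figure, and the two shares are hypothetical, and reading the values in `adjustments.csv` as percentage deviations from independence is an assumption.

```python
def independent_estimate(population, first_name_share, surname_share):
    """Expected number of people with a given full name if first name and
    surname were chosen independently of each other."""
    return population * first_name_share * surname_share


def apply_adjustment(estimate, adjustment_pct):
    """Scale an estimate by a percentage deviation (assumed interpretation:
    +21.1 means 21.1% more common than independence predicts)."""
    return estimate * (1 + adjustment_pct / 100)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
population = 300_000_000
first_name_share = 0.01
surname_share = 0.01

independent = independent_estimate(population, first_name_share, surname_share)
# 21.1 is the James/Smith entry in adjustments.csv.
adjusted = apply_adjustment(independent, 21.1)
print(round(independent), round(adjusted))
```

Under this reading, a negative adjustment (such as the James/Garcia entry) would shrink the independent estimate rather than grow it.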

401 changes: 401 additions & 0 deletions most-common-name/adjusted-name-combinations-list.csv

Large diffs are not rendered by default.
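
The list file is not shown on this page, but the README describes the matrix file as the same data in matrix form. The Python sketch below flattens a small excerpt of `adjusted-name-combinations-matrix.csv` into one record per name pair and skips `NA` cells. The function name and the `(first_name, surname, estimate)` layout are assumptions; the real column layout of the list file may differ.

```python
import csv
import io

# Excerpt of adjusted-name-combinations-matrix.csv: first two surname columns only.
MATRIX_EXCERPT = '''"","FirstName","SMITH","JOHNSON"
"70","Michael",30915.3639758192,22110.6778477485
"39","James",31289.7355674834,22465.8557872627
"21","Christopher",NA,NA
'''


def matrix_to_pairs(text):
    """Return (first_name, surname, estimate) tuples, skipping NA cells."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    surnames = header[2:]  # columns 0 and 1 are the row index and FirstName
    pairs = []
    for row in reader:
        first_name = row[1]
        for surname, cell in zip(surnames, row[2:]):
            if cell != "NA":
                pairs.append((first_name, surname, float(cell)))
    return pairs


pairs = matrix_to_pairs(MATRIX_EXCERPT)
top = max(pairs, key=lambda p: p[2])  # largest estimate in the excerpt
print(top[0], top[1], round(top[2]))
```

Rows that are entirely `NA` in the matrix (Christopher, Matthew, Anthony) contribute no records under this approach.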

21 changes: 21 additions & 0 deletions most-common-name/adjusted-name-combinations-matrix.csv
@@ -0,0 +1,21 @@
"","FirstName","SMITH","JOHNSON","WILLIAMS","BROWN","JONES","GARCIA","RODRIGUEZ","MILLER","MARTINEZ","DAVIS","HERNANDEZ","LOPEZ","GONZALEZ","WILSON","ANDERSON","THOMAS","TAYLOR","LEE","MOORE","JACKSON"
"70","Michael",30915.3639758192,22110.6778477485,17925.9337513276,18014.1628889499,16704.0250659371,6737.02542789882,5142.43440518657,16126.4304410701,6123.39068830423,13689.9119923918,NA,NA,NA,9955.27353903254,9571.22569542216,9268.46993725383,8996.89951980019,8006.16407485502,10012.4406069517,7105.16680690511
"39","James",31289.7355674834,22465.8557872627,20778.8185093931,19133.358414323,16449.2650725092,2047.58475251771,1553.95626301222,14497.7725299916,1908.09637654612,15653.8861130525,NA,NA,NA,12066.0225851187,10229.0803449309,9920.97968207131,10844.9842637353,8789.3773351676,10921.6922635898,9029.3206088283
"44","John",18715.2588143273,12576.6238204357,14458.1874809372,9491.51836762503,10511.9336476099,4175.73777549329,3561.59668036704,12369.1993434729,4084.62759372798,10736.8236342261,NA,NA,NA,8311.00720172512,8349.66905631504,7910.78066433961,7503.30428117198,6801.60239746484,7491.87125931207,5159.56639804023
"83","Robert",26093.6696272491,20167.693770498,15786.2362503737,16188.5384092428,15801.7386835285,5194.8708291319,4468.69504894458,14510.7052720511,4943.88815746683,11739.9634768221,NA,NA,NA,9706.45928290549,9733.44302878135,7740.14547270793,10207.7328241518,8158.65804364172,8316.28521120682,6876.34423040669
"24","David",25056.0603177346,18568.1244433312,13967.1754802908,14729.3797304364,13881.077985652,7983.94191750583,7372.32080934897,13490.2958542648,7416.44445241449,5705.11629768361,NA,NA,NA,8439.30741462823,9248.67056146581,7020.15040389554,6804.46053660144,7548.62493387168,7272.74012027606,5580.63854904201
"99","William",22067.113773237,15659.7907633039,5157.34425436552,13668.1382042484,12767.7291797417,1701.68742228137,2330.65068014757,10813.3678403267,1718.81389152811,10628.4854350207,NA,NA,NA,7621.11741555248,6465.28860673598,6975.67160241897,7469.5414411967,5083.79709263975,7378.83718126914,6050.59444162959
"67","Mary",19835.2132804905,15324.3228373496,13031.4244819925,11549.7340055241,11342.9804422785,5060.05781412621,3797.01038289754,9800.03527587973,4656.1636818867,9056.94415461672,NA,NA,NA,6814.76214636444,6526.01530810198,6326.12161229861,6127.95954667401,4383.15912039854,6277.22898955198,5483.31893317764
"21","Christopher",NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
"48","Joseph",9905.39108334182,7174.45941342625,6709.13769253622,6148.28000993819,4878.92076819194,3766.96761103243,3082.37558260986,5813.38843132476,3746.48564529878,4575.12938539398,NA,NA,NA,3350.98813564738,2847.98574632943,3893.65097177473,3180.54951615275,3052.94076492498,3219.15788011466,2953.28450449945
"82","Richard",14356.8524451693,10510.3083782287,7475.9888063436,8294.42316454356,7539.52463257213,3900.70039783559,3374.79117910323,7825.00107576039,3534.48592724392,6488.08909520013,NA,NA,NA,4556.70658126487,5304.49150105711,4063.60521458112,4336.19159109949,3828.15134506757,4287.76120270099,3091.65688155537
"23","Daniel",12316.7780944874,8472.57184174991,6172.11684967119,6483.08042701655,5688.9414881452,8916.68998709151,7689.90476128376,7265.84061864957,7937.97424597201,4611.42045608288,NA,NA,NA,3707.55260827581,3355.78753231111,2986.01239806311,4814.30969620907,3887.88522152229,3514.03450777333,2565.77136880745
"96","Thomas",12122.6742791561,8392.91970123823,7339.02860002431,7482.09431260294,7021.46382136281,1403.62790193766,938.657301545009,6554.07377609739,1392.98354704502,5343.12639739124,NA,NA,NA,4525.23875763118,4092.92568352602,723.741554324375,3965.59384318073,3437.67930190306,4863.70340078104,3231.24577334247
"68","Matthew",NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
"42","Jennifer",12786.8094162309,9587.85619148557,6760.17833327087,6890.91455180973,7942.80482425029,3734.85198512614,3493.64115496868,6383.10044846552,3407.51448804857,5433.59450854971,NA,NA,NA,4168.96851485375,4103.42502933492,3317.17710838062,3845.88070289281,4069.44260540816,3611.28988724838,2994.82488550718
"18","Charles",14977.7364863173,10714.6834148205,9611.60854198764,8940.53958855659,8256.22417737215,896.101570642626,594.043751400778,6879.92740141814,769.411148078419,7089.74933156643,NA,NA,NA,5225.94744790512,4343.1269892148,4561.55470560742,4667.98267779497,3432.01583282427,4766.42223652668,4072.30513435898
"9","Anthony",NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
"77","Patricia",11205.4787595933,8130.50751007365,7051.54819559552,6509.71419346922,6220.69886800103,4083.30994535665,3503.31655925871,5396.07188755839,3647.46578301555,4880.69501467587,NA,NA,NA,3384.02320235856,3382.0365551892,3412.23695956974,3490.2962619542,2236.84248335889,3366.01952816709,3036.98693226942
"62","Linda",12092.3871413313,9247.16067328123,7285.96356716985,6681.56538038965,6714.39475381551,2447.51357394225,1969.25343313059,5927.6774072773,2175.14665328952,5382.55632903655,NA,NA,NA,3910.36799242412,3809.25155959413,3425.72249945994,3725.86633274598,3062.37921608242,3624.90418083418,3211.99295442603
"66","Mark",9368.70696605657,8189.30425322347,5558.58805597028,4570.49562730225,4382.62989491142,1508.69831211492,1232.88145178821,5964.8689706833,1527.38081379791,4405.49784213841,NA,NA,NA,3639.01752851825,4192.19400520269,2838.14785350277,3051.66142742583,1811.77602332064,2045.74460158949,2010.93578090779
"31","Elizabeth",9077.11944668312,6818.80149092169,5120.19002126446,5091.43430739161,4930.82711575234,5273.10417119864,5218.83639602196,4716.66369632935,5051.94877383388,3974.14089649862,NA,NA,NA,3191.86072712951,3060.47286828708,2655.43574284727,2772.88905866037,2159.42185101166,2859.91106403937,2129.71214419378
1 change: 1 addition & 0 deletions most-common-name/adjustments.csv
@@ -0,0 +1 @@
,Miller,Anderson,Martin,Smith,Thompson,Wilson,Moore,White,Taylor,Davis,Johnson,Brown,Jones,Thomas,Williams,Jackson,Lee,Garcia,Martinez,Rodriguez
John,8.4,6.9,13.4,-23.5,3.6,2.5,3.6,1.7,0.4,-3.8,-35.2,-34.3,-26.5,5.1,-11.3,-27.7,-8.3,-66.4,-63.6,-69.5
Michael,18.1,2.4,13,5.6,6.7,2.6,15.7,-1.7,0.6,2.5,-4.8,4.2,-2.4,2.9,-8.1,-16.8,-9.8,-54.7,-54.4,-63.2
James,20.3,24,35.9,21.1,33.6,40.9,43,41,37.4,32.8,9.6,25.4,8.9,24.8,20.7,19.8,12.2,-84.4,-83.9,-87.4
Robert,29.6,27,16.8,8.7,18,22,17.2,18.1,39.2,7.2,5.9,14.2,12.6,4.8,-1.3,-1.8,12.1,-57.4,-55.1,-61
David,27.9,28.1,15.8,10.8,11,12.6,8.8,16.2,-1.5,-44.7,3.5,10.3,5,0.9,-7.3,-15.4,10.1,-30.5,-28.5,-31.7
Mary,21.5,18.2,26.2,14.7,18.5,18.9,22.8,22.5,16,14.8,11.7,13.1,12.2,18.9,13.1,8.7,-16.4,-42.4,-41.3,-54
William,22.5,7,33.3,16.6,29.3,21.5,31.9,38.2,29.2,23.1,4.3,22.3,15.4,19.8,-59.1,9.6,-11.4,-82.3,-80.2,-74.2
Richard,24.1,22.9,12.6,6.2,8.4,1.7,7.3,12.2,5,5.2,-2,3.9,-4.6,-2.3,-17,-21.6,-6.6,-43.2,-43,-47.7
Thomas,12.9,3,22.1,-2.6,-3.5,9.7,32.2,13.2,4.3,-5.9,-15,1.8,-3.5,-81.1,-11.5,-11,-8.9,-77.8,-75.6,-84.2
Jennifer,24.9,17.3,22.5,16.7,15.2,14.8,11.5,14.8,14.9,8.7,10.3,6.5,24,-1.6,-7.4,-6.3,22.5,-32.9,-32.2,-33.2
Patricia,20.9,10.7,25.3,17.1,16.7,6.7,19,17.2,19.4,11.8,7.1,15.2,11.2,15.9,10.6,8.8,-22.9,-16,-16.9,-23.3
Joseph,-8.9,-34.8,0.1,-27.6,-24.8,-26.1,-20.4,-10.6,-23.9,-26.7,-33.9,-23.9,-39,-7.5,-26.4,-26,-26.4,-45.8,-40.3,-52.8
Linda,34,25.8,29.6,27.5,24.8,24.4,29.3,26.1,28.6,24.4,22.9,19.3,21.1,17.4,15.3,16.1,6.5,-49.2,-50,-56.5
Maria,-77.6,-77.3,-52.1,-78.8,-76.4,-77.4,-79.3,-77.7,-80.3,-79.7,-78.3,-78.5,-79.4,-75.6,-79.4,-80.9,-77.5,663.9,614.1,639.8
Charles,36.3,25.7,29.3,38.4,38.1,45.7,49,43.4,41.2,43.6,24.8,39.9,30.5,37,33.3,29,4.6,-83.7,-84.5,-88.5
Barbara,35.5,25.3,25.1,24,20.6,25.8,24.6,25.2,24.7,24.2,20.3,26.7,16.2,16.6,14.8,18.5,-18.7,-66.9,-67.5,-67.9
Mark,42.1,45.9,-38.5,4.1,32.4,22,-23.1,-16.8,11,7.3,14.7,-14,-16.7,2.5,-7.3,-23.4,-33.6,-67,-63,-71.3
Daniel,19.9,-19.1,10.3,-5.2,-6.4,-13.9,-8.5,-5,21.3,-22.2,-17.8,-15.5,-25.1,-25.3,-28.7,-32.3,-1.3,35.1,33.2,24
Susan,32.1,28.4,15.8,3.9,5.4,3.5,3.5,-0.9,3,-5.8,-6.5,-4.6,-17.1,-1.5,-24.2,-31.9,3.7,-68.2,-68,-71.7
Elizabeth,13.3,7.4,13.5,1.7,9.1,7.9,8.4,4,1.7,-2.4,-3.7,-3.4,-5.5,-3.3,-13.9,-18.2,-20.2,16.3,23.4,22.5
1 change: 1 addition & 0 deletions most-common-name/aging-curve.csv
@@ -0,0 +1 @@
Decade,Age,Male,Female,Male,Female
1900,113,0,0,0,0
1910,103,45,318,0.00045,0.00318
1920,93,4154,11403,0.04154,0.11403
1930,83,28836,44336,0.28836,0.44336
1940,73,58728,70955,0.58728,0.70955
1950,63,78170,86389,0.7817,0.86389
1960,53,87064,92468,0.87064,0.92468
1970,43,92710,95619,0.9271,0.95619
1980,33,96010,97619,0.9601,0.97619
1990,23,97802,98570,0.97802,0.9857
2000,13,99003,99178,0.99003,0.99178
2010,3,99348,99449,0.99348,0.99449
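
The README describes this file as the chance that someone born in a given decade was still alive in 2013. The Python sketch below shows one way such a survival probability could be applied to a birth count. It assumes the third and fourth columns are survivors per 100,000 and the fifth and sixth are the same figures as probabilities, which matches the values (for example 45 and 0.00045). The function name and the birth count are hypothetical, and nothing here is taken from `most-common-name.R`.

```python
import csv
import io

# Three rows of aging-curve.csv, one record per line.
AGING_EXCERPT = """Decade,Age,Male,Female,Male,Female
1940,73,58728,70955,0.58728,0.70955
1980,33,96010,97619,0.9601,0.97619
2010,3,99348,99449,0.99348,0.99449
"""


def load_survival(text):
    """Map decade -> (male_probability, female_probability).

    The header repeats "Male" and "Female", so columns are read by position
    instead of by name.
    """
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return {int(row[0]): (float(row[4]), float(row[5])) for row in reader}


survival = load_survival(AGING_EXCERPT)
hypothetical_births = 1000  # made-up count of boys given some name in the 1940s
alive_2013 = hypothetical_births * survival[1940][0]
print(round(alive_2013))
```

Reading columns by position avoids the name clash that a dictionary-based reader would hit with the duplicated `Male` and `Female` headers.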
