What we learned from 5 million books - Erez Lieberman Aiden and Jean-Baptiste Michel

Everyone knows that a picture is worth a thousand words. But we at Harvard were wondering if this was really true. So we assembled a team of experts, spanning Harvard, MIT, the American Heritage Dictionary, the Encyclopedia Britannica and even our proud sponsors, the Google. And we cogitated about this for about four years, and we came to a startling conclusion. Ladies and gentlemen, a picture is not worth a thousand words. In fact, we found some pictures that are worth 500 billion words.

So how did we get to this conclusion? Erez and I were thinking about ways to get a big picture of human culture and human history: change over time. So many books have actually been written over the years, so we were thinking, well, the best way to learn from them is to read all of these millions of books. Now of course, if there's a scale for how awesome that is, that has to rank extremely, extremely high. Now the problem is there's an x-axis for that, which is the practical axis. This is very, very low.

Now, people tend to use an alternative approach, which is to take a few sources and read them very carefully. This is extremely practical, but not so awesome. What you really want to do is to get to the awesome yet practical part of this space. It turns out there's a company across the river called Google that started a digitization project a few years back that might just enable this approach. They have digitized millions of books. What that means is one could use computational methods to read all of the books at the click of a button. That's very practical and extremely awesome.

Let me tell you a little bit about where books come from. Since time immemorial, there have been authors. These authors have been striving to write books, and this became considerably easier with the development of the printing press some centuries ago. Since then, the authors have won on 129 million distinct occasions, publishing books. Now, if those books are not lost to history, then they are somewhere in a library, and many of those books have been getting retrieved from the library and digitized by Google, which has scanned 15 million books to date.

Now, when Google digitizes a book, they put it into a really nice format. Now we've got the data, plus we have metadata. We have information about things like where it was published, who the author was, when it was published. And what we do is go through all of those records and exclude everything that's not the highest-quality data. What we're left with is a collection of 5 million books, 500 billion words, a string of characters a thousand times longer than the human genome, a text which, when written out, would stretch from here to the moon and back 10 times over, a veritable shard of our cultural genome.

Of course, what we did when faced with such outrageous hyperbole was what any self-respecting researchers would have done. We took a page out of XKCD, and we said, "Stand back. We're going to try science."

Now of course, we were thinking, well, let's just first put the data out there for people to do science to it. Now we're thinking, what data can we release? Well, of course, you want to take the books and release the full text of these five million books. Now Google, and Jon Orwant in particular, told us a little equation that we should learn. We have 5 million books. That's 5 million authors, and 5 million plaintiffs is a massive lawsuit. So, although that would be really, really awesome, again, that's extremely, extremely impractical.

Now again, we kind of caved in, and we did the very practical approach, a bit less awesome. We said, well, instead of releasing the full text, we're going to release statistics about the books. So we're going to take, for instance, "a gleam of happiness." It's four words; we call it a four-gram. We're going to tell you how many times a particular four-gram appeared in books published in 1801, 1802, 1803, all the way up to 2008. That gives us a time series of how frequently this particular sentence was used over time. We do that for all the words and phrases that appear in those books, and that gives us a big table of two billion lines that tell us about the way culture has been changing.

So those two billion lines, we call them two billion n-grams. What do they tell us? Well, the individual n-grams measure cultural trends. Let me give you an example. Let's suppose that I am thriving, then tomorrow I want to tell you about how well I did. And so I might say, "Yesterday, I throve." Alternatively, I could say, "Yesterday, I thrived." Well, which one should I use? Hm. How to know?

Well, as of about six months ago, the state of the art in this field is that you would, for instance, go up to the following psychologist with fabulous hair, and you'd say, "Steve, you're an expert on the irregular verbs. What should I do?" And he'd tell you, "Well, most people say thrived, but some people say throve." Now, you also knew, more or less, that if you were to go back in time 200 years and ask the following statesman with equally fabulous hair, "Tom, what should I say?" he'd say, "Well, in my day, most people throve, but some thrived."

So now what I'm just going to show you is raw data: two rows from this table of two billion entries. What you're seeing is the year-by-year frequency of "thrived" and "throve" over time. Now, this is just two out of two billion rows. So the entire data set is a billion times more awesome than this slide.
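
Editor's illustration: the table described in the talk (one row per word or phrase, one count per publication year) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the speakers' actual pipeline: the `BOOKS` list is a made-up toy corpus, and the names `ngrams`, `count_by_year` and `time_series` are hypothetical. It reports raw counts; how the real dataset turns counts into frequencies is not described in the talk.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: (publication_year, text) pairs standing in
# for digitized books with their metadata. Not real data.
BOOKS = [
    (1801, "he throve and she throve"),
    (1801, "a gleam of happiness"),
    (1802, "she thrived and he throve"),
    (1803, "they thrived and we thrived"),
]


def ngrams(tokens, n):
    """Yield every run of n consecutive tokens as a space-joined string."""
    for i in range(len(tokens) - n + 1):
        yield " ".join(tokens[i:i + n])


def count_by_year(books, max_n=4):
    """Build a table mapping each n-gram (n = 1..max_n) to {year: count}."""
    table = defaultdict(Counter)
    for year, text in books:
        tokens = text.lower().split()
        for n in range(1, max_n + 1):
            for gram in ngrams(tokens, n):
                table[gram][year] += 1
    return table


def time_series(table, gram, years):
    """Return one row of the table: the count of `gram` in each year."""
    row = table.get(gram, Counter())
    return [row[year] for year in years]


if __name__ == "__main__":
    table = count_by_year(BOOKS)
    years = [1801, 1802, 1803]
    for gram in ("throve", "thrived", "a gleam of happiness"):
        print(gram, time_series(table, gram, years))
```

Each call to `time_series` returns one row of the table, so comparing "thrived" with "throve" amounts to reading two rows side by side, as on the slide.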


now<00:06:05.880> there<00:06:06.000> are<00:06:06.120> many<00:06:06.360> other<00:06:06.560> pictures<00:06:06.880> that
now there are many other pictures that now there are many other pictures that are<00:06:07.080> worth<00:06:07.319> 500<00:06:07.680> billion<00:06:07.919> words<00:06:08.160> for<00:06:08.280> instance are worth 500 billion words for instance are worth 500 billion words for instance this<00:06:08.720> one<00:06:08.919> if<00:06:09.000> you<00:06:09.120> just<00:06:09.280> type<00:06:09.479> in<00:06:09.599> influenza this one if you just type in influenza this one if you just type in influenza you<00:06:10.440> will<00:06:10.599> see<00:06:10.880> Peaks<00:06:11.520> at<00:06:11.639> the<00:06:11.759> time<00:06:11.960> where<00:06:12.080> you you will see Peaks at the time where you you will see Peaks at the time where you knew<00:06:12.759> big<00:06:13.120> flu<00:06:13.440> epidemics<00:06:13.919> were<00:06:14.120> actually knew big flu epidemics were actually knew big flu epidemics were actually killing<00:06:14.560> millions<00:06:14.840> of<00:06:14.960> people<00:06:15.199> around<00:06:15.400> the killing millions of people around the killing millions of people around the globe<00:06:17.240> if<00:06:17.440> you<00:06:17.599> were<00:06:17.840> not<00:06:18.080> yet<00:06:18.440> convinced<00:06:19.440> sea globe if you were not yet convinced sea globe if you were not yet convinced sea levels<00:06:20.120> are<00:06:20.400> rising<00:06:21.360> so<00:06:21.599> is<00:06:21.840> atmospheric<00:06:22.599> CO2 levels are rising so is atmospheric CO2 levels are rising so is atmospheric CO2 and<00:06:23.840> Global<00:06:24.360> temperature<00:06:25.360> you<00:06:25.520> might<00:06:25.720> also and Global temperature you might also and Global temperature you might also want<00:06:26.039> to<00:06:26.199> have<00:06:26.319> a<00:06:26.440> look<00:06:26.639> at<00:06:26.880> this<00:06:27.039> particular want to have a look at this particular want to have a look at this particular engram<00:06:27.759> and<00:06:27.960> ask<00:06:28.280> and<00:06:28.400> 
tell Nietzsche that God is not dead, although you might agree that he might need a better publicist.

Yes, you can get some pretty abstract concepts with this sort of thing. For instance, let me tell you the history of the year 1950. Pretty much for the vast majority of history, no one gave a damn about 1950. In 1700, and 1800, and 1900, no one cared. Through the 30s and 40s, no one cared. Suddenly, in the mid-40s, there started to be a buzz. People realized that 1950 was going to happen, and it could be big. But nothing got people interested in 1950 like the year 1950. People were walking around obsessed.
They couldn't stop talking about all the things they did in 1950, all the things they were planning to do in 1950, all the dreams of what they wanted to accomplish in 1950. In fact, 1950 was so fascinating that for years thereafter, people just kept talking about all the amazing things that happened, in '51, '52, '53. Finally, in 1954, someone woke up and realized that 1950 had gotten somewhat passé. And just like that, the bubble burst.

Now, the story of 1950 is the story of every year that we have on record, with a little twist, because now we've got these nice charts. And because we have these nice charts, we can measure things. We can say, "Well, how fast does the bubble burst?" And it turns out that we can measure that very precisely. Equations were derived, graphs
were produced, and the net result is that we find that the bubble bursts faster and faster with each passing year. We are losing interest in the past more rapidly.

Now, a little piece of career advice. For those of you who seek to be famous, you can learn from the 25 most famous political figures, authors, actors and so on. So if you want to become famous early on, you should be an actor, because then fame starts rising by the end of your 20s. You're still young; it's really great. Now, if you can wait a little bit, you should be an author, because then you rise to very great heights, like Mark Twain, for instance, who is extremely famous. But if you want to reach the very top, you should delay gratification and, of course, become a
politician, right? Here you will become famous by the end of your 50s, and become very, very famous afterwards. Scientists also tend to get famous when they're much older. For instance, biologists and physicists can be almost as famous as actors. One mistake you should not make is to become a mathematician. If you do that, you might think, "Oh, great, I'm going to do my best work when I'm in my 20s." But guess what? Nobody will really care.

There are more sobering notes among
the n-grams. For instance, here's the trajectory of Marc Chagall, an artist born in 1887. This looks like the normal trajectory of a famous person: he gets more and more and more famous, except if you look in German. If you look in German, you see something completely bizarre, something you pretty much never see, which is that he becomes extremely famous and then all of a sudden plummets, going through a nadir between 1933 and 1945 before rebounding afterwards. And of course, what we're seeing is the fact that Marc Chagall was a Jewish artist in Nazi Germany.

Now, these signals are actually so strong that we don't need to know that someone was censored. We can actually figure it out using really basic signal processing. Here's a simple way to do it.
Well, a reasonable expectation is that somebody's fame in a given period of time should be roughly the average of their fame before and their fame after. So that's sort of what we expect. We compare that to the fame that we observe, and we just divide one by the other to produce something we call a suppression index. If the suppression index is very, very, very small, then you very well might be being suppressed. If it's very large, maybe you're benefiting from propaganda.

Now, you can actually look at the distribution of suppression indices over a whole population. So, for instance, here, the suppression indices for 5,000 people picked in the English books, where there's no known suppression, would be like this: basically tightly centered around one. What you expect is basically what you
observe. This is the distribution you see in Nazi Germany. It's very different: it's shifted to the left. People are talked about half as much as they should have been. But much more importantly, the distribution is much wider. There are many people who end up on the far left of this distribution, who are talked about ten times less than they should have been, and then also many people on the far right who seem to benefit from propaganda. This picture here is the hallmark of censorship in the book record.

So culturomics is what we call this method. It's kind of like genomics, except genomics is a lens on biology through the window of the sequence of bases in the human genome. Culturomics is similar: it's the application of massive-scale data collection and analysis to the study of
human culture. Here, instead of through the lens of a genome, it's through the lens of digitized pieces of the historical record. The great thing about culturomics is that everyone can do it. Why can everyone do it? Everyone can do it because three guys, Jon Orwant, Matt Gray and Will Brockman over at Google, saw the prototype of the n-gram viewer, and they said, "This is so fun. We have to make this available for people." So in two weeks flat, the two weeks before our paper came out, they coded up a version of the n-gram viewer for the general public. And so you too can type in any word or phrase that you're interested in and see its n-gram immediately, and also browse examples of all the various books in which your n-gram appears.

Now, this was used over a million times in the first day, and this is really the best of all the queries. So people want to be their best, put their best foot forward. But it turns out that in the 18th century, people didn't really care about that at all. They didn't want to be their best; they wanted to be their "beft." So what happened is, of course, this is just a mistake, right? It's not that
they strove for mediocrity; it's just that the S used to be written differently, kind of like an F. Now, of course, Google didn't pick this up at the time, so we reported this in the Science article that we wrote. But it turns out that this should just stand as a reminder that although this is a lot of fun, when you interpret these graphs you have to be very careful, and
graphs you have to be very careful and graphs you have to be very careful and you<00:12:39.800> have<00:12:39.920> to<00:12:40.040> adopt<00:12:40.440> the<00:12:40.680> best<00:12:40.920> standards<00:12:41.240> in you have to adopt the best standards in you have to adopt the best standards in The The The Sciences<00:12:43.440> people<00:12:43.639> have<00:12:43.760> been<00:12:43.839> using<00:12:44.079> this<00:12:44.160> for Sciences people have been using this for Sciences people have been using this for all<00:12:44.440> kinds<00:12:44.600> of<00:12:44.720> fun

purposes<00:12:53.240> actually<00:12:53.519> we're<00:12:53.639> not<00:12:53.720> going<00:12:54.040> have
purposes actually we're not going have purposes actually we're not going have to<00:12:54.240> to<00:12:54.399> talk<00:12:54.560> we'll<00:12:54.720> just<00:12:54.839> show<00:12:54.959> you<00:12:55.079> all<00:12:55.240> the to to talk we'll just show you all the to to talk we'll just show you all the slides<00:12:55.920> and<00:12:56.680> remain<00:12:57.199> silent<00:12:58.199> this<00:12:58.320> person<00:12:58.519> was slides and remain silent this person was slides and remain silent this person was interested<00:12:58.839> in<00:12:58.880> the<00:12:59.360> of<00:12:59.480> frustration<00:13:00.440> uh interested in the of frustration uh interested in the of frustration uh there's<00:13:00.959> various<00:13:01.959> various<00:13:02.240> types<00:13:02.399> of there's various various types of there's various various types of frustration<00:13:03.160> if<00:13:03.240> you<00:13:03.320> stub<00:13:03.639> your<00:13:03.760> toe<00:13:04.079> that's frustration if you stub your toe that's frustration if you stub your toe that's a<00:13:04.320> 1<00:13:04.560> a a 1 a a 1 a ARG<00:13:06.519> if<00:13:06.680> the<00:13:06.800> planet<00:13:07.079> Earth<00:13:07.320> is<00:13:07.440> annihilated ARG if the planet Earth is annihilated ARG if the planet Earth is annihilated uh<00:13:08.399> by<00:13:08.560> the<00:13:08.680> Vogons<00:13:09.240> to<00:13:09.360> make<00:13:09.480> room<00:13:09.639> for<00:13:09.760> an uh by the Vogons to make room for an uh by the Vogons to make room for an Interstellar<00:13:10.560> bypass<00:13:11.240> that's<00:13:11.399> an<00:13:11.519> 8A Interstellar bypass that's an 8A Interstellar bypass that's an 8A ARG<00:13:13.600> this<00:13:13.760> person<00:13:14.000> studied<00:13:14.519> all<00:13:14.680> the<00:13:14.800> args ARG this person studied all the args ARG this person studied all the args from<00:13:15.399> 1<00:13:15.760> through<00:13:16.000> 8<00:13:16.399> A's<00:13:17.040> and<00:13:17.360> uh<00:13:17.480> 
it turns out that the less frequent "arghs" are, of course, the ones that correspond to things that are more frustrating, except, oddly, in the early '80s. We think that might have something to do with Reagan. All right, there are many usages of this data, but the bottom line is that the historical record is being digitized. Google has started to digitize 15 million books.
That's 12% of all the books that have ever been published. It's pretty big; it's a sizable chunk of human culture. There's much more to human culture: there's manuscripts, there's newspapers, there's things that are not text, like art and paintings. This will happen to be on our computers, on computers across the world, and when that happens, that will transform the way we have to understand
our past, our present and human culture. Thank you very much. [Applause]

