Sources and Methods #44: Deep Learning with fast.ai's Jeremy Howard
Jeremy Howard 101:
Jeremy on Twitter: JeremyPHoward
Free online programme / MOOC (“Practical Deep Learning for Coders”) at: fast.ai
“The wonderful and terrifying implications of computers that can learn” (YouTube)
Show Notes:
5:55 - My entire education is one degree in philosophy.
7:30 - Joined McKinsey at 18 with extremely basic knowledge.
12:19 - At Fast.ai our target audience really is people who have interesting and useful problems, and have a feeling that using AI might be a useful way to solve them, but who maybe don’t have a background in machine learning. It’s the people I came across in my career who were working in extremely diverse industries and roles and geographies, who are smart and passionate and working on interesting and important problems but don’t have any particular background in computer science or math. There’s a snobbishness in machine learning, in that most people in it have extremely homogeneous backgrounds: young, white, male, having studied computer science at a handful of universities in America or Europe.
David Perkins at Harvard, and his learning theory of the ‘Whole Game.’
18:10 - For some reason, STEM fields on the whole have gotten away with shoddy, slack teaching methods, where we expect the students to do the work of sticking with it for 10 years and putting it all together.
20:02 - We’ve discovered that the most practical component in AI is transfer learning: taking a model that someone else has created and fine-tuning it for your task. It turns out that this is the most important thing by far for actually getting AI to work in the real world. Apply transfer learning effectively.
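To make the idea concrete, here is a minimal pure-Python sketch of the simplest form of transfer learning: a "pretrained" body is kept frozen and only a small new head is trained on the new task. Everything in it is an assumption made for illustration (the hand-set PRETRAINED_BODY weights, the function names, the tiny made-up dataset); it is not fast.ai's API or course code. In practice you would load a real pretrained model from a deep learning library, and fine-tuning usually goes on to update the pretrained weights as well.

```python
import math

# Hypothetical "pretrained" body. In real transfer learning these weights
# would come from a model someone else trained on a large dataset.
PRETRAINED_BODY = [
    [1.0, -1.0],   # responds when x0 > x1
    [-1.0, 1.0],   # responds when x1 > x0
    [1.0, 1.0],    # responds to overall magnitude
]

def frozen_features(x):
    """Run the input through the frozen body (linear map followed by ReLU)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)))
            for row in PRETRAINED_BODY]

def predict(head, bias, x):
    """Probability of class 1 from the new head on top of the frozen features."""
    z = sum(h * f for h, f in zip(head, frozen_features(x))) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train_new_head(data, epochs=200, lr=0.1):
    """Train only the new head with logistic-regression SGD; the body never changes."""
    head = [0.0] * len(PRETRAINED_BODY)
    bias = 0.0
    for _ in range(epochs):
        for x, y in data:
            error = predict(head, bias, x) - y
            feats = frozen_features(x)
            head = [h - lr * error * f for h, f in zip(head, feats)]
            bias -= lr * error
    return head, bias

# Tiny made-up target task: label 1 when the first coordinate is larger.
task_data = [((2.0, 1.0), 1), ((1.0, 2.0), 0), ((3.0, 0.0), 1), ((0.0, 3.0), 0)]
head, bias = train_new_head(task_data)

# Unseen points: the first should score above 0.5, the second below.
print(predict(head, bias, (5.0, 2.0)), predict(head, bias, (2.0, 5.0)))
```

The point of the design is that the new task needs only four labelled examples, because the frozen body already supplies useful features.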
I think many people teach a list or a menu of things that they know, rather than really getting to student learning.
22:41 - Each year, we try to get to a point where the course covers twice as much as the previous year, with half as much code, with twice the accuracy at twice the speed. So far, we’ve been successful at doing that three years running.
28:48 - I think that will be one of the two most important skills over the next decade or two - the idea of how to work as a domain expert to provide appropriate data to a machine learning system and to interpret the results of those things in a way appropriate to your work. If you don’t know how to do it, you’re going to be totally obsolete.
31:09 - Back in the early days of the commercial internet, being an internet expert was extremely useful and you could have a job as an internet expert and be in a company of internet experts, and sell yourself as an internet expert company. Today, very few people do that, because on the whole the internet is what it is, and there are relatively few people who need such a level of expertise that they can go in and change the way your router operates and such. I think we’re going to see the same thing with AI.
39:08 - I started learning Chinese not because I had any interest in Chinese, but because I was such a bad language learner in high school. I did six months of French, I got 28% and I quit. When I wanted to dig into machine learning, I thought one of the things that might be better to understand was human learning, so I used myself as a subject. A hopeless subject. If I can come up with a way that even I can learn a language, that would be great. And to make sure that was challenging enough, I tried to pick the hardest language I could. So according to CIA guidelines, Arabic and Chinese are the hardest languages for people to pick up. Then I spent three months studying learning theory, and language learning theory, and then software to help me with that process.
It turns out that even I can learn Chinese. After a year of this - by no means a full-time thing, an hour or two a day - I went to China to a top language learning program and based on the results of my exam got placed with all these language PhDs, and I thought, wow. Studying smart is important. It’s all about how you do it.
Spaced repetition is such an easy thing that anyone can do, for free, you can start using it.
[Jeremy’s amazing Anki talk]
If you’re not using Anki, you’re many orders of magnitude less likely to remember a piece of vocab. So you come away like I did, thinking you can’t learn a language. But once you learn vocab, the rest is really not that hard. Don’t try to learn grammar, just spend all your time reading.
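For readers who have not met spaced repetition before, the sketch below shows the core idea with a simple Leitner-style scheduler: a card you remember is shown again after a longer gap, and a card you forget goes back to the shortest gap. This is an illustration under stated assumptions, not Anki's actual scheduling algorithm; the INTERVALS values, the Card class and the function names are all hypothetical.

```python
from dataclasses import dataclass

# Review gaps in days for each box. These particular values are an assumption.
INTERVALS = [1, 2, 4, 8, 16]

@dataclass
class Card:
    front: str
    back: str
    box: int = 0        # index into INTERVALS
    due_day: int = 0    # day number on which the card should next be shown

def review(card, remembered, today):
    """Move the card up one box if remembered, back to the first box if not."""
    if remembered:
        card.box = min(card.box + 1, len(INTERVALS) - 1)
    else:
        card.box = 0
    card.due_day = today + INTERVALS[card.box]
    return card

def due_cards(deck, today):
    """Cards whose next review date has arrived."""
    return [card for card in deck if card.due_day <= today]

deck = [Card("你好", "hello"), Card("谢谢", "thank you")]
review(deck[0], remembered=True, today=0)    # moves to box 1, due on day 2
review(deck[1], remembered=False, today=0)   # stays in box 0, due on day 1

print([card.front for card in due_cards(deck, today=1)])  # → ['谢谢']
```

The forgotten card comes back the next day while the remembered one waits, which is how review time ends up concentrated on the vocabulary you find hardest.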
45:04 - If you’re not spending a significant portion of your early learning, learning how to learn, then you’re going to be at a disadvantage to those that did for that entire learning journey. Spending 12 years at school learning things, but nobody ever taught you how to learn, is the dumbest thing I’ve ever heard.
Coursera’s most popular course is Learning How To Learn.
Exercise is the other most important thing.
49:03 - My third superpower is taking notes. Exceptional people take a lot of notes. Less exceptional people assume they’re going to remember.
50:19 - Taking notes in class is kind of a waste of time. I don’t really see the point of going to class most of the time honestly, it’s probably being videotaped.
52:54 - Learn Python if you’re interested in data science, deep learning.
54:22 - I think there are two critical skills going forward, pick one. One is knowing how to use machine learning. And the other is knowing how to interact with and care for human beings. Because the latter one can’t be replaced by AI. The former one will gradually replace everything.
Sources and Methods #43: Teaching Programming with Matthias Felleisen
Show Notes:
Bootstrap World - Teaching Outreach Project
Racket-lang.org - Our Research Programming Language
7:00 - I am a transplant from Germany. I came as a Fulbright student when I was 21. Fulbright decided that my major in Germany corresponded to an MBA combined with an engineering degree, which they thought was MIS, or Management Information Systems. So they put me in Tucson, Arizona.
I fell in love with the place. I moved into a house with a bunch of people. One of them was an astrophysics PhD student. And he told me what a PhD was, because I had no clue. And he said ‘A PhD is when they pay you to think.’ I couldn’t believe it. I almost tripped when he said it - they pay you to think? Sign me up.
I switched majors to Computer Science... Went to a PhD program in Indiana. After 3.5 years, it was time to go, and the choices were between Berkeley and Rice, my top two offers. It took me only a few short seconds to decide that Rice was it. Because when I interviewed there, 8 out of 12 people had published in the conference that I consider my home.
At Berkeley, out of 45 people, only 1 person was even in that realm. So I decided to go to Rice.
9:45 - (On moving to Northeastern) Northeastern definitely had a plan to turn around and become a research university. It’s always fun to be at a place that’s moving up - just like when I decided to go to Rice: Rice was ranked in the top 20, but Berkeley was probably #1 or 2 in computer science. And Northeastern had a plan to move up. And they have risen in Computer Science, it’s on the map. I was able to create a programming language group that is definitely one of the best in the country.
12:20 - Functional programming for the lay person is basically what you learn in Algebra, in middle school or high school. Let me explain it.
Algebra, when you learn it, is the weird idea of maybe getting a word problem and developing an expression that describes the problem and then plugging variables into the expression and calculating out the results. And then nobody looks at it, and throws it away. The teacher throws it away. That’s algebra, and it sounds weird that functional programming is the same as algebra, but it is.
What you write down in a functional program are these functions, or variable expressions as some algebra textbooks call them. And the big difference is that in addition to numbers, you have other forms of data. In algebra, you can think of expressions as just being about manipulating numbers. Now imagine that in addition to numbers, you also have texts. For example, take the symbol +, which means 1 + 1 is 2. When you say hello + a space + world, that gets you the text ‘hello world.’
Just like you manipulate numbers in algebraic expressions, you can manipulate images, texts, and other kinds of interesting forms of data that you have about people, about the world, about anything you like. Functional programming is a very enriched form of algebra.
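The comparison with algebra can be sketched in a few lines. The example uses Python rather than Racket purely for illustration, and the function names are made up; each function is just an expression with variables that you plug values into, whether those values are numbers or text.

```python
def rectangle_area(width, height):
    """The algebraic expression w * h, with numbers plugged in."""
    return width * height

def greeting(first, second):
    """The same kind of expression, but over text instead of numbers."""
    return first + " " + second

def shout(text):
    """Functions compose, just as algebraic expressions nest."""
    return text.upper() + "!"

print(rectangle_area(3, 4))                  # → 12
print(greeting("hello", "world"))            # → hello world
print(shout(greeting("hello", "world")))     # → HELLO WORLD!
```

None of these functions changes anything outside itself: the same inputs always give the same result, exactly as in an algebra exercise.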
19:00 - I was probably the first academic in computer science to do a broad based outreach program.
And I saw tremendous problems that children had aligning the knowledge that they had from algebra with what they saw in programming. Because imperative programming - or what I sometimes call dysfunctional programming - has a cognitive dissonance to algebra, which is the closest thing kids know.
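One possible reading of that dissonance, offered as an illustration rather than an example from the episode: in imperative code a line such as total = total + price is routine, yet read as an algebraic equation it only holds when price is 0. The sketch below, with hypothetical function names, computes the same sum in both styles.

```python
def total_imperative(prices):
    """Imperative style: a variable is repeatedly overwritten."""
    total = 0
    for price in prices:
        # As an instruction this means "replace total with total + price".
        # As an algebraic equation it would only be true when price is 0.
        total = total + price
    return total

def total_functional(prices):
    """Functional style: the sum is defined by an expression, with no reassignment."""
    if not prices:
        return 0
    return prices[0] + total_functional(prices[1:])

prices = [3, 4, 5]
print(total_imperative(prices), total_functional(prices))  # → 12 12
```

The functional version can be read the way a student reads a definition in algebra: the total of a list is its first item plus the total of the rest.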
20:28 (On the lessons from watching children trying to code) I came to a conclusion that is to this day not clear to the vast majority of people who teach computer science. The conclusion is that no programming language that is in use by real programmers is suitable for beginners.
29:29 - At this point, we are the largest organized Computer Science outreach program in schools in the United States. But it took this insight that we can’t radically change education processes.
Let’s incrementally change student behavior, not a wholesale change here.
39:44 - I believe that Computer Science is actually the discipline of developing and applying problem solving processes.
45:40 - (On educational outcomes, measuring them, etc) Let me recommend a book: Let’s Kill Dick and Jane.
54:07 - The United States spends more money on education than any other OECD country. So where does this money go? What is this money used for? I don’t believe it arrives in the classroom. If we spend more money than everyone else already, I don’t believe pumping more money in will solve the problem either. It goes to administrative things, overhead. What we’re seeing in higher education now over the last 20 years is that higher education is catching up with K-12. My own college has grown from 4 or 5 administrative assistants that fit into a small dean’s suite on the floor downstairs, to occupying the entire second floor, which is probably 30-40 people. In just 3 years, since the new dean arrived. If you have a bigger staff, then you look more important. If more people report to you, you’re more important. I have routinely refused promotions to administrative positions. Bureaucracies grow much, much faster than the quality in the classroom.
This existed in K-12 education way before it existed in higher education.
To follow my work:
Bootstrap World - Teaching Outreach Project
Racket-lang.org - Our Research Programming Language