AI Snake Oil, a book by Arvind Narayanan & Sayash Kapoor, September 2024.

An 800-word (5,000-character) summary instead of a 300-page book. My other summaries are here

Basically, it is about how big tech oversells and scares the world to make more money on AI and tries to monopolize it. Arvind Narayanan & Sayash Kapoor are scientists from Princeton who also worked in industry at Netflix and Meta. Snake oil is a panacea, a mythical cure for everything:

I. Introduction

– The problem is that: 

1) The word AI has become a vague name for anything 

2) Exploiting this vagueness and people’s misunderstanding, vendors oversell

– and scare people. Before ChatGPT there was GitHub Copilot, and hardly anyone cared

– AI still has no definition:

1) Something that would require study or creativity from a human

2) Not code/rules, but something with emergence 

3) Systems with some level of autonomy

– It is necessary to distinguish between two types of AI: generative (good, roughly living up to the hype) and predictive (often does not really work)

II. What’s wrong with predictive AI?

– We are surrounded everywhere by systems that evaluate us with algorithms, and they fail. On Netflix that is harmless, but when assessing candidates, students, patients, employees, defendants, prisoners, etc., these are “decisions” that critically affect lives

– Predictive-analytics firms do NOT test their software the way pharma does (randomized controlled trials), and the results are bad

– Even worse, they do not study the consequences of selling their software: e.g., in Australia, 400K recipients of social benefits were falsely accused of enriching themselves

III. Is it possible to predict the future?

– If you adjust for data leakage in the datasets, the top ML forecasts are no better than classic, decades-old regressions. AI sellers oversell enormously, and the media helps the hype, but in fact it works badly. The FTC even wrote a note a year ago about this deception
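A minimal sketch (my illustration, not from the book) of the data-leakage effect the authors describe: if a feature computed after the outcome sneaks into the dataset, a model’s benchmark accuracy looks spectacular; evaluated on honest features, a plain logistic regression is all you get. All data and names here are synthetic assumptions.

```python
# Hypothetical sketch: how data leakage inflates apparent ML accuracy.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 5))
# Noisy target: only weakly predictable from the honest features.
y = (X[:, 0] + rng.normal(scale=2.0, size=n) > 0).astype(int)

# Leaky feature: a near-copy of the label, e.g. a column recorded
# *after* the outcome happened (a common real-world leakage pattern).
leak = y + rng.normal(scale=0.1, size=n)
X_leaky = np.column_stack([X, leak])

# Same split indices for both variants (same random_state).
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
Xl_tr, Xl_te, _, _ = train_test_split(X_leaky, y, random_state=0)

honest = LogisticRegression().fit(X_tr, y_tr).score(X_te, y_te)
leaky = LogisticRegression().fit(Xl_tr, y_tr).score(Xl_te, y_te)
print(f"honest accuracy:  {honest:.2f}")  # modest, limited by noise
print(f"with leakage:     {leaky:.2f}")   # looks near-perfect
```

The leaky model would collapse in deployment, because the leaked column is not available before the outcome occurs; this is the kind of inflated benchmark the book says evaporates under careful evaluation.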

– Moreover, the growth of data (intelligence agencies and big tech would like to track our every word) does not improve forecasts about people. The same goes for weather, pandemics, careers, and hits: tweets, Harry Potter, Star Wars, Orwell, YouTube (an account’s top-1 video performs 40x better than its median), etc.

– Part of the future will remain in the fog regardless of the amount of data used

IV. The long road to GenAI

– Before GenAI, it all started with the perceptron, discovered 70 years ago. Hinton generalized deep learning in 1986 (and later received the Turing Award, computing’s equivalent of the Nobel Prize), ImageNet came in 2012, GPT-2 in 2019…

– GenAI brought deepfakes, fraud, glitches, dirty and biased data

– In code generation it works, but in images, etc., everything rests on unpaid royalties to the authors whose work ended up in the datasets

V. Is AI an existential risk?

– AI panic is a tower of misconceptions and vague generalizations. In essence, everything rests on the assumption that AI will allegedly cross a certain level of autonomy or superhuman intelligence. But this contradicts the entire history of tech, which has always developed gradually

– BTW, all AGI predictions of the last 70 years have failed

– Let’s solve specific, narrow problems instead, as in the information-security field

VI. Why AI will not solve the problems of social networks?

– 1. AI does not moderate well and makes false-positive mistakes. 2. The [digital] public space has been handed to unaccountable private companies, and freedom of speech has been eroded. This has been the case since 2018, when the US Congress attacked Facebook and the famous 1996 internet law (Section 230) was effectively disavowed. Moderation became the main product of social networks
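A toy back-of-the-envelope calculation (my illustration with made-up numbers, not the book’s) of why even a seemingly accurate classifier floods the system with false positives when rule-breaking posts are rare:

```python
# Hypothetical numbers: base-rate effect in automated moderation.
posts = 1_000_000           # posts per day
violation_rate = 0.001      # 0.1% of posts actually violate the rules
sensitivity = 0.95          # classifier flags 95% of true violations
false_positive_rate = 0.01  # it wrongly flags 1% of benign posts

true_pos = posts * violation_rate * sensitivity                  # 950
false_pos = posts * (1 - violation_rate) * false_positive_rate   # 9,990
precision = true_pos / (true_pos + false_pos)

print(f"flags per day: {true_pos + false_pos:,.0f}")
print(f"precision:     {precision:.1%}")  # ~8.7%: most flags are wrong
```

Under these assumed rates, roughly nine out of ten flagged posts are innocent, which is exactly the false-positive problem that hits ordinary users.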

– Twitter is strictly moderated even after Musk’s purchase (violence, child abuse, hate speech, etc.). Users can complain, and algorithmic decisions can be reviewed (Facebook has human moderators in only a few countries; from this point of view, Telegram is not unique)

– Knowing the cultural contexts of the whole world is also quite difficult for moderators

– Moreover, appeals on social networks/Uber/DoorDash, etc. take a long time, and people lose their jobs and incomes in the meantime

– Social networks suffer from targeted attacks by state bot networks (e.g., replacing words with something humans will understand but algorithms can’t)

VII. Reasons for myths about AI

– The general public is not aware that AI has seen both hype cycles and “winters”

– Journalists do not have the time and often cannot (they are not engineers) verify big tech’s claims

– Rich corporations, with no incentives for transparency, crushed academia for their A) marketing and B) political purposes when talking about AI, including by citing unscientific “research”. Even Henry Kissinger was drawn into the media narratives with his last book

VIII. What’s next?

– We need rules on how companies may advertise AI products, especially in predictive AI, where most software doesn’t work

– The main downsides of AI lie in how it is used. And, by the way, the main demand for an AI panacea comes from broken, ineffective organizations looking for quick fixes to everything while wasting their valuable resources

– Aren’t you surprised that big tech companies are actively lobbying for regulation of themselves?