We haven't learned anything, over the last 20 years or so. At least, not at the lower levels.
I started programming in 1978, on a Commodore PET. It was an excellent first machine, and it taught me a lot about writing efficient code and about designing before I write.
My next machine was the infamous BBC Model B. This machine had to have been God's Gift to Geeks! Parallel banks of memory, analogue-to-digital converters, the best sound system of its day (4 channels, with sound envelope controls), parallel & serial ports, TWO video ports, and a 2nd-processor port.
There's virtually nothing on a BBC B that does not exist today, on standard AMD SMP systems, provided they have a sound card. But, then, there's virtually nothing on a standard SMP system that did not exist on the BBC, either!!!
Ok, now we'll move on to parallel processing. The first "real" parallel processing machine was, of course, Colossus. Yes, that machine. One of the reasons it could break codes quickly was because it could do many things at the same time. It wasn't dependent on one function finishing before it could do something else.
The first -programmable- parallel architecture was the Cray X-MP. (The MP stands for multi-processor.) It was an ingenious design, but a little too expensive to be truly practical.
The first -practical- parallel architecture was the Inmos Transputer. This could scale indefinitely. There was no limit of 2, 4, or 8 processors, and you didn't need any special chipsets to make it work. Arrays of 1000+ were commonplace in large European universities, where a Transputer-based machine could outperform a Cray at 1% of the list price and a tiny fraction of the running cost.
The Transputer was a mid-'80s architecture, designed as a military-grade system, with as few external components and as few requirements as possible. If it had been taken seriously by Thorn EMI, and backed by the Thatcher regime, we would not be using Intel processors today. That much is certain. Sadly, Inmos was sold to SGS-Thomson, and hasn't really been heard of since.
Let's get onto stuff that is perceived as modern. Take "neural networks", for example. A neural network is just a collection of programmable gates. You might as well use an FPGA, and spare yourself the complex overtones. You have certain inputs which produce an output, and other inputs which don't. That is nothing more than an n-ary gate. All that has been added is an improvement in the programmability.
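To make that concrete, here's a toy sketch (my own example, with made-up weights and thresholds, not taken from any real network library) of a single threshold unit in C. Change the weights or the threshold and the very same unit becomes AND, OR, or any other n-ary gate you care to program, which is all a programmable gate really does.

#include <stdio.h>

/* A single "neuron": weighted sum of the inputs compared against a
 * threshold.  Reprogramming the weights and threshold turns the same
 * unit into a different n-ary gate. */
static int threshold_gate(const int *inputs, const double *weights,
                          int n, double threshold)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += inputs[i] * weights[i];
    return sum >= threshold;   /* fires (1) or doesn't (0) */
}

int main(void)
{
    const double w[2] = { 1.0, 1.0 };
    const int pairs[4][2] = { {0,0}, {0,1}, {1,0}, {1,1} };

    /* Threshold 2.0 behaves as AND; threshold 1.0 behaves as OR. */
    for (int i = 0; i < 4; i++)
        printf("AND(%d,%d) = %d   OR(%d,%d) = %d\n",
               pairs[i][0], pairs[i][1],
               threshold_gate(pairs[i], w, 2, 2.0),
               pairs[i][0], pairs[i][1],
               threshold_gate(pairs[i], w, 2, 1.0));
    return 0;
}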
Bloat vs. Readability: I've rarely seen code that is genuinely readable. It's often poorly commented and variables are given obscure names. No, I'm not talking about BASIC, although I wish I was. I'm referring to Motif, X11R6, Gnome, KDE and even the Linux kernel itself!
If the benefits of readability were actually being delivered, the extra space would be worth it. At present, though, readable code, formal specifications, and structured designs simply DON'T EXIST!
Let's take some simple examples. When should you use a global variable? Answer: NEVER! Global variables, especially in multi-threaded code, are totally unpredictable and prone to dangerous side-effects. Pass in what you want in, and pass out what you want out. That way, if two threads want to work on the same kind of data at the same time, they each have their own instance to play with.
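Here's a quick sketch of what I mean (using POSIX threads, with names I've made up purely for illustration): each thread gets its own data passed in, so there is nothing shared for them to fight over.

#include <pthread.h>
#include <stdio.h>

/* Each thread receives its own counter, so nothing is shared and
 * nothing can race.  A single global counter bumped by both threads
 * would give unpredictable results. */
struct work {
    long counter;
    long iterations;
};

static void *worker(void *arg)
{
    struct work *w = arg;          /* this thread's private instance */
    for (long i = 0; i < w->iterations; i++)
        w->counter++;
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    struct work a = { 0, 1000000 };
    struct work b = { 0, 1000000 };

    pthread_create(&t1, NULL, worker, &a);
    pthread_create(&t2, NULL, worker, &b);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    /* Deterministic: each thread owned its own data. */
    printf("a = %ld, b = %ld\n", a.counter, b.counter);
    return 0;
}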
How many exit points should a function have? ONE! One way in, one way out. This may seem time-consuming, but it really does make the code much more predictable, and therefore better.
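In C, that looks something like this (a trivial example of my own, not lifted from any of the projects mentioned above): every path funnels down to a single return, so there is exactly one place to clean up, log, or set a breakpoint.

#include <stdio.h>

/* One way in, one way out: every branch falls through to the single
 * return at the bottom, so the exit behaviour is easy to predict. */
static int parse_percentage(const char *text, int *out)
{
    int value = 0;
    int status = 0;                 /* 0 = ok, -1 = error */

    if (text == NULL || *text == '\0') {
        status = -1;
    } else if (sscanf(text, "%d", &value) != 1) {
        status = -1;
    } else if (value < 0 || value > 100) {
        status = -1;
    } else {
        *out = value;
    }

    return status;                  /* the one and only exit point */
}

int main(void)
{
    int pct;
    if (parse_percentage("42", &pct) == 0)
        printf("parsed %d%%\n", pct);
    return 0;
}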
When should type not matter? NEVER! If you want to enforce a particular behaviour, then cast the type. Otherwise, the behaviour is dependent on the compiler, and the phase of the moon. You can't be sure exactly what will happen. You should NEVER get a single warning about type mismatches.
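For instance (a deliberately simple illustration of my own), integer division is a classic place where leaving the conversion to the implicit rules gives you an answer you never asked for; an explicit cast pins the behaviour down so it no longer depends on what the compiler silently decides.

#include <stdio.h>

int main(void)
{
    int total = 7;
    int count = 2;

    /* Leave it to the implicit rules and you get integer division
     * first, then a silent widening to double: 3.0, not 3.5. */
    double surprise = total / count;

    /* Cast explicitly and the behaviour you asked for is the
     * behaviour you get, regardless of compiler or phase of the moon. */
    double intended = (double)total / (double)count;

    printf("surprise = %.1f, intended = %.1f\n", surprise, intended);
    return 0;
}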
Out-of-memory errors? NEVER! If you're checking the return values of system functions, you can handle an allocation failure in a controlled manner, so that what you want done gets done. If you allow uncontrolled behaviour to occur, then expect the program to explode from time to time.
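Something along these lines (a minimal sketch; the function name is my own): check what malloc() hands back, report the failure, and let the caller decide how to recover, instead of letting a NULL pointer blow the program up three calls later.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Check the return from malloc() and handle failure in a controlled
 * way; the caller gets NULL and decides what to do about it. */
static char *duplicate(const char *src)
{
    char *copy = malloc(strlen(src) + 1);

    if (copy == NULL)
        fprintf(stderr, "out of memory duplicating \"%s\"\n", src);
    else
        strcpy(copy, src);

    return copy;                    /* single exit: NULL on failure */
}

int main(void)
{
    char *s = duplicate("hello");
    if (s == NULL)
        return EXIT_FAILURE;        /* controlled shutdown, not a crash */
    printf("%s\n", s);
    free(s);
    return 0;
}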
Algorithms that are new: Hmmmm. Tough. Someone mentioned MP3, but that's just a form of lossy compression, and throwing data away in order to compress is something that's been around for a long time. (The old analogue phone system used lossy compression, in a sense: it supported only a very narrow band, and simply eliminated all sounds outside that band.)