Why is this interesting? - The Machine Learning Edition
On recurrent neural networks, McLuhan, and experimentation
Noah Brier | Jul 16, 2019
This is a piece I wrote about two years ago but only sent out to a few people. I thought I’d share it here as it still feels relevant. - Noah (NRB)
The medium is the revolution in the mechanical and problems. The composition of the press are experience the most transformation of the decisions like the translating a great society, the press can be done for story of the poets and feeling of printing of the printed word. The complete point of the integration, the sense with the community that has to be a discontinuity and the printed word and inner lives. The fact that the telegraph and information to a printing and newspaper, the transforming popular structure in the computer. The movie was a kind of the hot change and in a means of explosive and print or much of the reversals of information.
Recurrent neural networks (RNNs) are a fairly common machine learning technique that does its best to mimic the way neurons connect. As best I understand (and I'm still learning two years later), an RNN works by building a big network of weighted values that help it make inferences, which can then be translated into a whole bunch of different things. What's amazing is that it starts out knowing nothing and learns to do whatever you ask of it (in this case, write like McLuhan) by strengthening and weakening connections. In other words, the computer has no idea it's writing like McLuhan, or even that it's writing words. All it knows is that, according to whatever text you've given it, this letter tends to come after that letter and punctuation most often follows this combination.
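To make the "network of weighted values" a little more concrete, here is a minimal sketch of a single forward step of a character-level RNN in plain Python. It is not the code behind the McLuhan sample: the class name `TinyCharRNN`, the hidden size, and the random, untrained weights are all assumptions for illustration, and bias terms and training are left out entirely. It only shows how one letter plus a running memory turns into a guess about the next letter.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyCharRNN:
    """Hypothetical, untrained character-level RNN (illustration only)."""

    def __init__(self, vocab, hidden_size=8, seed=0):
        rng = random.Random(seed)
        self.vocab = sorted(set(vocab))
        self.index = {ch: i for i, ch in enumerate(self.vocab)}
        n = len(self.vocab)
        # These three weight tables are the "connections" that training
        # would strengthen or weaken; here they are just random numbers.
        self.w_xh = make_matrix(hidden_size, n, rng)            # input -> hidden
        self.w_hh = make_matrix(hidden_size, hidden_size, rng)  # hidden -> hidden
        self.w_hy = make_matrix(n, hidden_size, rng)            # hidden -> output
        self.hidden = [0.0] * hidden_size                       # running memory

    def step(self, ch):
        """Feed one character; return a probability for each possible next one."""
        one_hot = [0.0] * len(self.vocab)
        one_hot[self.index[ch]] = 1.0
        from_input = matvec(self.w_xh, one_hot)
        from_memory = matvec(self.w_hh, self.hidden)
        self.hidden = [math.tanh(a + b) for a, b in zip(from_input, from_memory)]
        return softmax(matvec(self.w_hy, self.hidden))


rnn = TinyCharRNN("the medium is")
for ch in "the":
    probs = rnn.step(ch)  # distribution over the next character

for ch, p in zip(rnn.vocab, probs):
    print(repr(ch), round(p, 3))
```

Because the weights are random, the printed probabilities are close to uniform. Training would be the process of nudging those three weight tables until the guesses line up with the text the network was given.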
In the example at the top, I got the sample started by asking it to build off "The medium is" and the RNN did the rest. If you really want to dig in with this technique, Andrej Karpathy’s "The Unreasonable Effectiveness of Recurrent Neural Networks" seems to be a pretty good place to start. It offers a few fascinating examples including this paragraph trained on 100MB of Wikipedia:
Naturalism and decision for the majority of Arab countries' capitalide was grounded by the Irish language by [[John Clair]], [[An Imperial Japanese Revolt]], associated with Guangzham's sovereignty. His generals were the powerful ruler of the Portugal in the [[Protestant Immineners]], which could be said to be directly in Catonese Communication, which followed a ceremony and set inspired prison, training. The emperor travelled back to [[Antioch, Perth, October 25|21]] to note, the Kingdom of Costa Rica, unsuccessful fashioned the [[Thrales]], [[Cynth's Dajoard]], known in western [[Scotland]], near Italy to the conquest of India with the conflict. …
As he points out in the article, what's particularly cool about this is how it learns to open and close brackets in a wiki-markup style (the [[...]] links in the sample). Again, it doesn't know why it's doing it; it just knows that's a part of the pattern.
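The "part of the pattern" idea, and the earlier step of starting a sample from "The medium is", can be sketched with something much simpler than an RNN: a table of which character tends to follow each short run of characters. To be clear, this is a stand-in (a character-level Markov chain), not a neural network, and the function names, the three-character context, and the tiny placeholder corpus are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict


def build_model(text, order=3):
    """Count which character follows each `order`-character context."""
    model = defaultdict(Counter)
    for i in range(len(text) - order):
        context = text[i:i + order]
        model[context][text[i + order]] += 1
    return model


def generate(model, seed_text, length=80, order=3, rng=None):
    """Extend seed_text one character at a time using the counts."""
    rng = rng or random.Random(0)
    out = seed_text
    for _ in range(length):
        followers = model.get(out[-order:])
        if not followers:
            break  # this context never appeared in the training text
        chars = list(followers)
        weights = [followers[c] for c in chars]
        out += rng.choices(chars, weights=weights)[0]
    return out


# Placeholder corpus for illustration; a real run would use a long text file.
corpus = "the medium is the message. the medium shapes the message. "
model = build_model(corpus)
sample = generate(model, "the medium is", length=40)
print(sample)
```

Like the RNN, this has no notion of words or meaning, only of what usually comes next. The difference is that an RNN carries a learned memory forward instead of looking at a fixed three-character window, which is what lets it track longer-range patterns such as a bracket that still needs closing.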
I also tried to train the network on my blog posts and didn't have quite the same success as with the McLuhan text. With that said, it did spit out some fun misspellings (contextual quote in parentheses): prical ("transparently that makes a started prical products of people"), transumers ("internet in a transumers than started that the control post"), managelism ("city of the big managelism that all the post of the story"), and one from McLuhan, numerage ("that all of the written form of the press in the consumer means of power and numerage"). Seems like a perfect way to generate buzzwords.
My big takeaway is that the bar for getting these things to okay is low, and the bar for getting them to great is high. While I don’t want to read too much into a few experiments, this does seem to match some of what’s going on in AI/ML more broadly, where early wins with simple problems still haven’t translated into the full-fledged intelligence that seemed inevitable. (NRB)
Google Image Search of the Day:
From this excellent Twitter thread by physicist Sabine Hossenfelder: “Curious find: A Google image search for ‘futuristic’ returns almost exclusively images with blue/black color themes. How is that? Why isn't the future orange? Very puzzled about this.” She follows that with the colors of tech (black/blue), history (sepia), and truth (black and white) amongst others. (NRB)
If you’re hungry for a much deeper dive into machine learning, The Master Algorithm is a good book on the various techniques. (NRB)
The Atlantic on the design guide that shaped how cities design bike lanes. “The result was NACTO’s Urban Bikeway Design Guide, the first national design standard for protected bike lanes. Like other standards, it answers the questions of space, time, and information that are at the heart of street design. How wide should a protected bike lane be? At least five feet, but ideally seven. How does one mix bike lanes and bus stops? Send the lane behind the bus stop, with enough space for bus riders to comfortably board and get off the bus. What about when bike lanes and turn lanes meet? Give bikes their own exclusive signals, or create ‘mixing zones,’ shared spaces where people in cars and on bikes take turns entering the space.” (NRB)
This Times story about how Agatha Christie disappeared for 11 days is pretty wild. (NRB)
Thanks for reading,
Noah (NRB) & Colin (CJN)