Alex O'Connor is a researcher and machine learning manager.
I had the chance to pick his brain on the latest trends in generative AI: transformers, language and image models, fine-tuning, prompt engineering, tokenization, the latent space, adversarial attacks, and more.
Thanks to everyone who chatted with us during the YouTube premiere.
★ I'm excited to celebrate Live 100 with Special Guests tomorrow, April 27, at 1:30 PM ET with a conversation on creative machine intelligence with Adam Menges, Joel Simon, José Luis García del Castillo, and Kyle Steinfeld.
I'd love for you to join us live at nono.ma/live/100.
Recorded at The Palazzo, Las Vegas, in December 2022.
00:00 · Introduction
00:40 · Machine learning
02:36 · Spam and scams
15:57 · Adversarial attacks
20:50 · Deep learning revolution
23:06 · Transformers
31:23 · Language models
37:09 · Zero-shot learning
42:16 · Prompt engineering
43:45 · Training costs and hardware
47:56 · Open contributions
51:26 · BERT and Stable Diffusion
54:42 · Tokenization
59:36 · Latent space
01:05:33 · Ethics
01:10:39 · Fine-tuning and pretrained models
01:18:43 · Textual inversion
01:22:46 · Dimensionality reduction
01:25:21 · Mission
01:27:34 · Advice for beginners
01:30:15 · Books and papers
01:34:17 · The lab notebook
01:44:57 · Thanks
After a two-month pause, we're preparing to release new Getting Simple podcast episodes.
Editing and publishing add friction and delays to my process, so I'm exploring code and ML workflows to post-process episode audio and generate transcripts, summaries, and notes.
I'm not there yet, but OpenAI's Whisper (free) and Descript (paid) already provide accurate transcriptions. Existing projects and companies use GPT-like language models to extract episode keywords, topics, chapters, and summaries.
We'll soon have automatic episode notes.
It's exciting. I think we're getting very, very close.
I've also played with Spotify's pedalboard Python package to post-process audio without relying on a Digital Audio Workstation (DAW).
That's cool because I can create reusable scripts for specific recording conditions and forget about manual audio editing: compressing, limiting, applying noise gates, or normalizing, all things you'd otherwise do in Adobe Audition.
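To give a rough idea of what a reusable script for one recording condition could look like, here is a minimal sketch in plain Python. It does not use pedalboard's API. It assumes audio is a list of float samples between -1 and 1, and the function names (`noise_gate`, `normalize_peak`, `apply_chain`), the preset name, and the thresholds are all hypothetical. The gate is a crude per-sample one, with no attack or release smoothing, so treat it as an illustration of the idea rather than production audio code.

```python
def db_to_linear(db):
    """Convert a level in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)


def noise_gate(samples, threshold_db=-40.0):
    """Silence every sample whose absolute level is below the threshold."""
    threshold = db_to_linear(threshold_db)
    return [s if abs(s) >= threshold else 0.0 for s in samples]


def normalize_peak(samples, target_db=-1.0):
    """Scale the signal so that its loudest sample sits at target_db."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # nothing to scale in pure silence
    gain = db_to_linear(target_db) / peak
    return [s * gain for s in samples]


def apply_chain(samples, steps):
    """Run the samples through each processing step, in order."""
    for step in steps:
        samples = step(samples)
    return samples


# Hypothetical preset for one recording condition: gate first, then normalize.
QUIET_ROOM_CHAIN = [noise_gate, normalize_peak]

recording = [0.001, 0.5, -0.25, 0.0]
processed = apply_chain(recording, QUIET_ROOM_CHAIN)
```

Each preset is just a list of functions, so a different room or microphone could get its own chain without touching the processing code.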
Let me know if you'd like to see these automations in the live stream and video tutorials or shared here on Twitter at @nonoesp.
Zach Kron was the second guest of the podcast back in December 2017. Since then, I've enjoyed many conversations with him that would have been a great fit for the show.
I recently visited Zach's studio in Somerville, Massachusetts, where we talked about making (and selling) pen plotter art, evolving with your projects, creativity, capturing ideas, and remote work.
I loved recording this conversation in person and on camera.
I hope you enjoy it!
00:00 · Intro
01:00 · Evolution of this podcast
09:46 · Freediving
12:16 · Capturing ideas
13:25 · People are different in person
15:54 · Evolving with your projects
20:10 · Connecting with your audience
21:20 · Live vs. offline
26:19 · The creative medium
30:00 · Selling art
38:46 · Pen plotter art
46:34 · Making art with Dynamo
50:31 · Art
01:02:53 · Funktionslust
01:05:09 · Remote work
01:11:54 · Outro
I recently had a conversation with Steve — who wants to build a YouTube channel about the joy of making and listening to music, emphasizing health and well-being — where I shared tips on producing a podcast, building an audience, booking guests, content formats, motivation, goals, and other insights from five years of podcasting.
This episode may be helpful if you're thinking of starting a podcast or YouTube channel or if you want to learn about my podcasting workflow.
Steve’s questions (below) acted as a guide for our conversation.
Remember that you can submit your questions at gettingsimple.com/ask.
00:00 · Introduction
00:58 · Start
01:55 · Steve's idea
03:45 · Passion for music
04:37 · Podcasting
05:20 · Motivation
08:04 · Recording and editing
09:07 · Guests
11:40 · Building an audience
14:01 · Long-form conversations
15:34 · Process
17:33 · Goals
21:51 · Evergreen content
24:14 · Monetization
25:38 · Start lean
29:30 · Outline
31:59 · First episodes
33:17 · Outro
Today I bring you a new conversation with Leire Asensio Villoria and David Mah on decoding and upgrading design systems, reverse engineering the creative process, knowledge dissemination, the long tail of niches, Erwin Hauer and associative models, book writing and publishing, and much more.
Leire and David collaborate as asensio_mah, an international design practice with projects in Europe and Australia, including Casa Q, a new-build residential commission in Spain that integrates digital fabrication techniques with traditional construction practices as well as landscape engineering and design.
00:00 · Introduction
00:36 · Erwin Hauer
02:22 · Associative models
04:18 · Erwin Hauer's model making
07:03 · Limitations of digital tools
09:39 · Systems Upgrade book
11:10 · Reverse engineering
26:09 · Decoding Erwin Hauer
30:21 · Authorship and knowledge dissemination
36:48 · Visual programming
41:39 · Selling less of more
46:54 · Individualizing everything
49:23 · Context
53:18 · Book writing and publishing
01:02:49 · Creative process
01:11:13 · AI content generation
01:17:42 · Thanks
01:18:43 · Outro
Zoom recently limited free 1:1 meetings to 40 minutes. Previously, this limit applied only to meetings of three or more people, and it now makes it impossible for me to record long-form interviews for the podcast with my free Zoom account. Zoom's paid plans start at around $140/year, and for a bit more, I'm going for a tool that fits my workflow much better: Riverside.
A few years back, I looked at Soundtrap (later acquired by Spotify), which also serves to record remote podcasts. I don't think they supported video and local recordings, which are at the core of Riverside's offerings.
For $288/year, Riverside lets you record in up to 4K (see the hardware note below), accept live call-ins from the audience, download separate, perfectly synchronized tracks, and, my killer feature, export the entire timeline into Descript as compositions and sequences.
A few minutes before writing these lines, I tested the free version with two people. I was able to record each person's webcam feed at 720p with synchronized audio and import the timeline into Descript. The tracks get automatically transcribed and named, and a Sequence is created for me with both tracks. I can quickly apply Studio Sound if I want, or I can download the original tracks, edit them in other software, and replace them in Descript's sequence.
I'm glad I learned about Riverside and didn't get a Zoom subscription. I don't plan on having 100-people events or meetings over Zoom, and I'm usually fine with meeting for less than forty minutes at a time or asking my invitees to re-join my meeting.
Note: To record at 2160p (4K), you need the right hardware. For instance, I use the Sony Alpha a6500 with an Elgato HD60 S+ video capture card.
Today I bring you a new conversation with Frank Harmon, an old friend who taught me architectural design at North Carolina State University, Raleigh, back in 2012, and inspired me to look at the world differently.
Frank is a renowned award-winning architect, professor, writer, and an avid sketcher who always has a sketchbook with him.
He writes to find out what he’s thinking and draws to understand what he’s looking at to ensure he doesn’t forget it.
In this episode, we talk about writing, drawing, design, life, and how digital technologies make the world completely placeless. "It’s too late to stop [the internet], but what we can do as architects and artists and writers is give people a sense of place where they are."
Frank believes we can make places that have something physical and concrete grounding us in an otherwise unlimited digital world.
00:00 · Introduction
01:14 · Writing
05:00 · Becoming an architect
06:21 · Frank's book
07:19 · Living in London
09:03 · Studying abroad in the US
13:37 · Childhood place
20:38 · Born with screens
23:39 · Design
27:42 · Place
33:41 · Good architecture
37:10 · Bad architecture
38:48 · Frank Gehry's middle finger
39:31 · Native Places: Drawing as a Way to See
43:47 · The best way to write
44:23 · The purpose of sketching
45:45 · Thanks
46:09 · Outro
Voice notes are submitted via SpeakPipe and can be recorded with most modern browsers.
Videos are uploaded via Dropbox File Requests and can be recorded with a smartphone or any camera as long as you can upload the video file to Dropbox using the web form.
I invite you to ask a question about any of the topics we discuss in the podcast and the live stream.
Today I bring you an episode on my first impressions and experiments with OpenAI’s text-to-image generation AI system DALL-E 2, three mini-essays on the creative process and being done, and blogging tools you can use to publish online.
00:00 · Introduction
00:39 · An incoming conversation with Frank Harmon
01:34 · Episode contents
01:55 · DALL-E 2 and AI systems
06:45 · What can DALL-E do?
09:54 · GPT-3: Language models
12:05 · Mini-Essays on the creative process
12:15 · Mini-Essay: The meaning of done
13:38 · Mini-Essay: If it's no fun, you shouldn't do it
15:33 · Mini-Essay: Another one of those
16:51 · Writing series
17:51 · What does Nono use for blogging?
22:45 · Outro
Today I bring you a short episode from the sketches series in which I share my experience traveling to the US and meeting people in person for the first time after the COVID-19 pandemic.
Here are the recently added chapters to my conversation with Scott Young from 2019 on Ultralearning: how to master skills and acquire knowledge quickly.
Here are a few of my favorite quotes from Getting Simple's latest episode with Andrew Witt.
Today I bring you a solo episode in which I revisit my current habits and passion projects. I share my thoughts on the podcast, the blog and sketches, the YouTube channel and the live stream, my new recording studio, monetization, crypto, and the importance of learning and play.
Looking forward to hearing what you think!
00:00 · Introduction
01:34 · Daily habits
05:08 · Active projects
07:27 · Blog
09:11 · Sketches & stories
15:06 · Studio
17:06 · Podcast
21:36 · YouTube channel
23:24 · Knowledge anxiety
24:43 · Anything, not everything
25:45 · Monetization
28:41 · Learning and play
32:22 · Crypto and digital art
36:21 · I need your help
39:50 · Outro
In yesterday's live stream—Live 69—I recorded a solo episode revisiting my current habits and passion projects, in which I talked about my new recording studio, the podcast, the blog and sketches, the YouTube channel and the live stream, and the importance of learning and play. I also showed Farrago, a piece of software I've started using for live sound effects.
Use the timestamps below to jump to specific parts of the stream.
Thanks for watching.
See you next week!
Today I bring you a new conversation with Andrew Witt, who teaches at Harvard University and recently published Formulations, a book that explores how computational tools that encapsulate mathematical methods are short-circuiting the path to expertise, blurring the distinction between dabbler and virtuoso, and democratizing access to the systems and aesthetics of mathematical design.
Please enjoy Andrew’s second podcast appearance in which we discuss how mathematical design transforms how we think, design, and make art, how Andrew managed to get such a big project together, and his take on writing, creativity, work, life, machine intelligence, and digital art.
Today I bring you a new conversation with Adam Menges, a former Apple employee and founder at Lobe, a company acquired by Microsoft that aims to make deep learning accessible.
Please enjoy Adam’s second podcast appearance in which we discuss the role of visual programming languages, social fintech, thoughts on Bitcoin and digital art tokens, and lessons learned building successful software products during and after pandemic times.
In this episode, Nate Peters and I discuss the latest developments of digital art and generative NFTs, the importance of being intentional, the advantage of established creators, and the fast pace of artificial intelligence and crypto.
We’re no experts, so please don’t take our words as financial advice. We just hope our conversation sheds some light on your own path to learning more about the world of digital currencies, machine learning, and technology.
Take a look at this episode's topics in the notes and chapters below.
Today, I bring you a conversation with an anonymous guest on blockchain and cryptocurrencies, smart contracts, the security of digital wallets, the convenience of centralization, promoting positive moral behavior, impostor syndrome, being a constant newbie, and lots more.
I first met Jordan in North Carolina back when I was an exchange student in my fourth year of architecture school. We soon realized we shared many creative interests and curiosities.
Ten years later, we bring you a long-form conversation on creative friction, the fine line between passion projects and work, storytelling in design, overcoming the beginner feeling, and lots more. (Take a look at this episode’s chapters to get a better picture of the topics we covered.)
As we embrace new technologies, we delegate more and more tasks and decisions to the machine. In turn, algorithms permeate our daily lives—say, influencing what we listen to, watch, read, or who we interact with—yet few of us know how they use our information and make decisions for us.
In this episode, I talk to Aziz about how complex machines work, technological polarization, and the growing need to make algorithms understandable.
We hosted two live events on YouTube to record a two-part podcast celebrating one year of live streams. The second part, out now, features a conversation with Jose Luis García del Castillo on teaching and coding live.
We talk about friction and automation, community, practice, content creation, and how the podcast and the live streams have evolved.
Special thanks to the community and to everyone who joined us live. ❤️
We hosted two live events on YouTube to record a two-part podcast celebrating one year of live streams. The first part, out today, features audience questions on the challenges and evolution of the channel after a year of streaming (almost) weekly.
I mentioned way too many things in this episode, so I tried my best at linking to most terms and technologies in the notes below.
It's hard to keep up with the fast-moving world of digital currencies and the new age of digital art.
In this new episode of Bytes, Aziz and I talk about non-fungible tokens (NFTs), blockchain, cryptocurrencies, and digital art.
Today, I bring you an informal chat with Nate Peters, a friend and former guest of the show. We talk about the machine learning-based audio-editing solution this podcast is being produced with, web components, React and UI libraries, the effects of COVID-19 on our work lives, NFTs and cryptocurrencies, and the new informal catch-up conversation podcast format we're testing out.
We were screen-sharing during part of this conversation and no recording is available. But we've compiled a detailed list of episode notes, and the YouTube video includes a full transcript as closed captions.
In this new episode of Bytes, Aziz and I talk about StyleGAN, NVIDIA's state-of-the-art machine learning algorithm that generates convincing images.
If you want to learn more about the Bytes series, our co-host, and what to expect in future episodes, listen to the Introduction.
I brought a minimal recording setup inside my backpack to Tenerife—two Shure SM58 microphones and a Zoom H6 recorder—just in case I found a chance to record material for the podcast.
Before parting ways at the boarding gate, Jose Luis and I captured our first impressions after a week of freediving classes: what we learned, what we loved, and things we thought we knew but didn't.
We talked about the mindfulness of breath-hold diving and being deep underwater, best practices, equipment and techniques, equalizing your middle ear pressure, scuba versus freediving, and how recommendation systems brought us there.
Technologist and artist Cristóbal Valenzuela co-founded Runway with a simple idea in mind: putting machine learning in the hands of creators as an intuitive and simple visual interface.
Enjoy this conversation with Cris on the need for new creative interfaces to control complex algorithms that focus on results (not technology), the freedom of being a startup, and how machine intelligence is changing how we think, design, and make art.