Apple announced Vision Pro yesterday, a mixed-reality headset that "seamlessly blends digital content with your physical space." I wonder about this product's adoption rate.
The Cupertino tech giant has redefined product categories before and educated the world on why it needs them—think of browsing the internet on your iPad or iPhone in bed instead of sitting at your desk.
The device is pricey—north of three thousand dollars. Yet some argue it equals the cost of certain desktop setups if you think about buying a computer, monitor, and other peripherals. Vision Pro is everything you need. "Free your desktop. And your apps will follow." No display. No mouse. No keyboard. "You navigate simply by using your eyes, hands, and voice."
Apple coined the term visionOS to refer to its first spatial operating system, inviting us to an infinite canvas that transforms the way we work.1
I can't wait to try it out.
Apple Vision Pro. Apple. June 5, 2023. ↩
I've been doing a lot of React and TypeScript work lately. It had been a few years since I'd worked with them on a daily basis, and things have improved a lot. Building web apps with these technologies is a breeze, and one of the newest additions that works well is Vite.
Is anyone else working with React these days? I will cover some of my learnings on YouTube and want to get a sense of interest. (Let me know on Discord.)
What's cool is that frameworks such as ONNX and TensorFlow offer wide support for exporting and running models in the browser (web workers, WebGPU, WebAssembly), so you don't even need to build microservices for certain models. (Plus there's now support for running Node.js in the browser as well!)
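For instance, here's a minimal sketch of running an ONNX model in the browser with onnxruntime-web. The model path, input name, and tensor shape are placeholders you'd swap for your own model.
import * as ort from 'onnxruntime-web'

async function run() {
  // Load an ONNX model (model.onnx is a placeholder path).
  const session = await ort.InferenceSession.create('model.onnx')
  // Build an input tensor; the input name and shape depend on your model.
  const input = new ort.Tensor('float32', new Float32Array([1, 2, 3, 4]), [1, 4])
  // Run inference and log the outputs.
  const results = await session.run({ input })
  console.log(results)
}

run()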
I just started a new daily file with my &ndaily Typinator text expansion.
This expansion archives my current daily file, an action I run whenever a daily file goes over seven thousand words. It then creates a new file named 01_Daily_Part_94.md for Daily 94.
A few weeks ago, I paid for a Typinator 9 upgrade. The app is more modern, has light and dark modes, and promises long-term support. I'm glad they did that.
I'm a heavy user of Typinator and, someday, I'll create a list of all the things I use on a daily basis.
One of my most-used expansions—they've added usage stats (!)—is dtt, which would expand, today, to 230531.
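For reference, that's today's date in YYMMDD format. Typinator computes it for me, but here's a rough TypeScript sketch of the equivalent logic, just to illustrate the format.
// Format today's date as YYMMDD, e.g., 230531 for May 31, 2023.
const now = new Date()
const pad = (n: number) => String(n).padStart(2, '0')
const dtt = `${pad(now.getFullYear() % 100)}${pad(now.getMonth() + 1)}${pad(now.getDate())}`
console.log(dtt)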
"A general limitation of the human mind is its imperfect ability to reconstruct past states of knowledge, or beliefs that have changed." Says Daniel Kahneman in Thinking, Fast and Slow. "Once you adopt a new view of the world (or of any part of it), you immediately lose much of your ability to recall what you used to believe before your mind changed."
Writing down your past states of knowledge and beliefs—keeping memories—is helpful because they will change.
Even five minutes of daily sketching can get your skill pretty far. It's a way to wire your brain to move your hand how you want it to move. Practice strokes, capture proportions, and decode perspective. It doesn't need to be a big effort as long as you do it every single day. What's key is to enjoy it so those five minutes flow into a longer drawing session when you have time.
Needless to say, you can apply this logic to activities other than sketching that you enjoy doing. I use it daily for sketching and writing.
Leonardo da Vinci recommended that young artists find people around town to use as models, taking note of the most interesting figures with slight strokes in a small notebook they could always carry with them.1
Today, there's no need to draw by hand anymore. The camera captures lights and shadows for you. And AI is learning to do knowledge work.
'When you automate as much of your life as possible, you can spend your effort on the tasks machines cannot do yet.' This is what James Clear says in Atomic Habits. 'Each [automated] habit frees up time and energy to pour into the next stage of growth.'
The number of tasks machines cannot do yet keeps getting smaller and smaller.
James Clear quotes Alfred North Whitehead—mathematician and philosopher—who wrote, 'Civilization advances by extending the number of operations we can perform without thinking about them.'
The camera misses the point of sketching, which is to understand what you see.2
The machine does the work for us, and we get to skip the struggle. But it is that struggle that makes us learn and master skills.
Imagine a little kid using an AI to draw or color a flower. How is she meant to learn to draw or color when the machine does it for her?
It keeps getting harder to figure out what tasks machines won't be able to do for us. In the future, the skill to be mastered is controlling the machine, what's currently being called prompt engineering.
Technology misses the point of craft.2
No matter how well machines can draw, I will continue to sketch by hand.
Leonardo da Vinci by Walter Isaacson. ↩
Of course, this is an oversimplification. Many uses of technology and artificial intelligence automate processes that don't provide any value to the human performing the manual task. The gist is that we do certain things for the sake of doing them, not for the outcome, and that the journey yields growth for the individual doing the work. ↩ ↩
No matter how small a piece of software is, it requires maintenance.
Except when it doesn't, which is true for certain programs without external dependencies or deprecated features.
The more code bases you rely on or develop, the more tiny efforts you'll have to put here and there to keep them running, especially if you want to keep the operating system up to date.
If I had to pick a word for our current times, that would be impatience.
Impatience for the video to load,
and to be done watching.
Impatience for new episodes,
and to binge-watch the entire season.
Impatience, with whatever is in front of you,
and a great discontent with the present.
We're in constant search for the next thing.
Screens rewired us this way,
and wire young generations from the start.
Impatience to read this post in full,
yet an urge to find more content.
In this speedy world,
the turtle wins.
Seth Godin calls it the dip;
Steven Pressfield the resistance;
Cal Newport shallow work.
Just stick around when everyone else gives up.
In computer science, black-box algorithms are the ones that produce an output for a given input but don't tell you how they obtained that output—you can't see how the machine works.
In contrast, white-box algorithms let you see what's inside. Browse their code, change it, and learn why and how they generate a given output.
Black boxes protect intellectual property, make it harder to hack programs,1 and let companies monetize their services by putting them behind a paywall.
White boxes let the community contribute by collectively maintaining their code and adding new features.
Hackers often search for bugs in open-source code to exploit their vulnerabilities. ↩
Hi Friends!
I'm hosting a live conversation today with special guests to celebrate my 100th YouTube live stream, Thursday, April 27, at 10:30 AM Pacific Time.
I've invited Adam Menges (ex-Lobe.ai), Joel Simon (Artbreeder), Jose Luis Garcia del Castillo (Harvard, ParametricCamp), and Kyle Steinfeld (University of California, Berkeley) to pick their brains on creative machine intelligence and how it's being used in academia and next-generation design tools.
The conversation will take place in Riverside at nono.ma/live/100.
With that link, you'll join as part of the audience and can participate in the chat. There's an option to "call in" and join the call, which we could use for questions or even to bring in everyone who wants to join at the end.
Feel free to forward this invite to friends interested in AI & ML.
Thanks so much for being part of my journey.
Warmly,
Nono
Hi Friends—
Alex O'Connor is a researcher and machine learning manager.
I had the chance to pick his brain on the latest trends in generative AI—transformers, language and image models, fine-tuning, prompt engineering, tokenization, the latent space, adversarial attacks, and more.
Thanks to everyone who chatted with us during the YouTube premiere.
★ I'm excited to celebrate Live 100 with Special Guests tomorrow, April 27, at 1:30 PM ET with a conversation on creative machine intelligence with Adam Menges, Joel Simon, José Luis García del Castillo, and Kyle Steinfeld.
I'd love for you to join us live at nono.ma/live/100.
Warmly,
Nono
Recorded at The Palazzo, Las Vegas, in December 2022.
00:00 · Introduction
00:40 · Machine learning
02:36 · Spam and scams
15:57 · Adversarial attacks
20:50 · Deep learning revolution
23:06 · Transformers
31:23 · Language models
37:09 · Zero-shot learning
42:16 · Prompt engineering
43:45 · Training costs and hardware
47:56 · Open contributions
51:26 · BERT and Stable Diffusion
54:42 · Tokenization
59:36 · Latent space
01:05:33 · Ethics
01:10:39 · Fine-tuning and pretrained models
01:18:43 · Textual inversion
01:22:46 · Dimensionality reduction
01:25:21 · Mission
01:27:34 · Advice for beginners
01:30:15 · Books and papers
01:34:17 · The lab notebook
01:44:57 · Thanks
From time to time, ask yourself why you're doing what you're doing.
Is it still fun?
What's something you could change?
Did it lose its magic?
You may not need to change anything, but often things turn into something different from what they were when we started.
Next week, we'll celebrate 100 YouTube live streams in a conversation with special guests on creative machine intelligence.
We'll use Riverside and stream live on YouTube, and we will release an edited version in the Getting Simple podcast.
We'll soon announce the guests who will join us. Stay tuned.
Here are three ways to define a React component in TypeScript which, in the end, are three ways to define a function in TypeScript—React components are JavaScript functions.
const MyComponent = ({ text }: { text: string }) => <>{text}</>

const MyComponent = ({ text }: { text: string }) => {
  return (
    <>{text}</>
  )
}

function MyComponent({ text }: { text: string }) {
  return <>{text}</>
}
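And here's how you'd render the component, a minimal sketch assuming the props-object signature above.
// Use the component in JSX, passing text as a prop.
const App = () => <MyComponent text="Hello, Nono!" />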
In December, I recorded a two-hour conversation with Alex O'Connor in a hotel room at The Palazzo, Las Vegas, with two cameras and two microphones.
We discussed the recent developments of AI—transformers, prompt engineering, GPT, Stable Diffusion, tokenization, the latent space—but as Alex recently mentioned on LinkedIn, "Little did we know the blizzard of innovation that was coming."
Alex is a senior data science and machine learning manager who builds agile teams and impactful machine learning systems at Autodesk—one of the few people I know who understands what's going on in AI and ML and why it matters.
We'll premiere the episode on YouTube tomorrow, Wednesday, April 12, at 2 PM ET, which means you'll be able to chat with us during the episode.
It's exciting—we've never tried this format.
We hope you can make it!
import * as React from 'react'
import * as Server from 'react-dom/server'
let Greet = () => <h1>Hello, Nono!</h1>
console.log(Server.renderToString(<div><Greet /></div>))
// <div><h1>Hello, Nono!</h1></div>
The tricky part is running this code. You first need to build it, say, with esbuild, then execute it.
# Build with esbuild.
esbuild RenderToString.jsx --bundle --outfile=RenderToString.js
# Run with Node.js.
node RenderToString.js
# <div><h1>Hello, Nono!</h1></div>
No structure means freedom and flexibility, which demand that we make decisions regularly.
If you can do anything, what will you do?
Structure incorporates constraints that reduce the number of decisions required.
A clear workflow—a set of structured steps—limits the scope in which you’ll be creative and lets you forget about the process and focus on doing.
Here's how to deploy your Vite app to your local network so you can access it from other devices connected to the same WiFi. Say, your iPhone or iPad.
npx vite --host {local-ip-address}
If you're on macOS, you can simply run the following.
npx vite --host $(ipconfig getifaddr en0)
A fresh Vite project will likely have a dev key in your package.json's scripts property mapping that Yarn or npm command to Vite, e.g., "dev": "vite", so you can type yarn dev or npm run dev and have Vite run your application in development mode.
yarn dev
# VITE v4.2.1 ready in 165 ms
#
# ➜ Local: http://localhost:5173/
# ➜ Network: use --host to expose
# ➜ press h to show help
That's the same as running npx vite or ./node_modules/.bin/vite.
Before we can deploy to our IP address, we need to know what it is.
You can use ipconfig on Windows and ifconfig on macOS.
Henry Black shared a trick to get your Mac's local IP address with ipconfig.
ipconfig getifaddr en0
# 192.168.1.34
All you need to do is pass your IP address as Vite's --host argument.
npx vite --host $(ipconfig getifaddr en0)
# VITE v4.2.1 ready in 166 ms
#
# ➜ Local: http://192.168.1.34:5173/
# ➜ Network: http://192.168.1.34:5173/
# ➜ press h to show help
Now I can access my Vite app from other devices in the same network, which comes in handy if you want to test your app on other computers, phones, or tablets.
Remember, npx vite is interchangeable with yarn dev, npm run dev, or ./node_modules/.bin/vite.
For more information, read Vite's Server Options.
If you found this useful, let me know at @nonoesp!
Here's how to connect and communicate with WebSocket servers from browser client applications using the WebSocket API and the WebSocket protocol.
// Create a WebSocket client in the browser.
const ws = new WebSocket("ws://localhost:1234");
// Log incoming messages to the console.
ws.onmessage = function (event) {
// This runs when we receive a message.
console.log(event.data);
};
ws.onopen = () => {
// This runs when we connect.
// Submit a message to the server
ws.send(`Hello, WebSocket! Sent from a browser client.`);
};
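To test the client, you need a server listening on that port. Here's a minimal sketch of one in Node.js, assuming the ws npm package; any WebSocket server on ws://localhost:1234 would do.
// Minimal WebSocket server sketch using the ws package (npm install ws).
import { WebSocketServer } from 'ws'

const wss = new WebSocketServer({ port: 1234 })

wss.on('connection', (socket) => {
  // Log messages received from browser clients.
  socket.on('message', (data) => console.log(data.toString()))
  // Greet the client that just connected.
  socket.send('Hello from the server!')
})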
Note that if you restart your Droplet, you may have to manually restart services that run in the background.
# Restart the Droplet now
shutdown -r now
In my case, if Nginx doesn't start automatically after the reboot, I need to run the following commands.
sudo fuser -k 80/tcp && sudo fuser -k 443/tcp
sudo service nginx restart
Here's a one-liner to turn any website into dark mode.
body, img { filter: invert(0.92) }
I apply this to selected sites using Stylebot, a Chrome extension that lets you apply custom CSS to specific websites.
In a nutshell, the CSS inverts the entire website and then inverts images again to render them normally.
You can adjust the invert filter's amount parameter, which in the example is set to 0.92. A value of 0 means no color inversion at all; 1 (or 100%) means full color inversion, where whites turn black and blacks turn white.
I often prefer to stay within 90–95% to reduce the contrast.
I began transcribing podcast episodes four years ago. I used Otter.ai, a transcription service that's grown into a product that transcribes meetings, captures slides, and generates summaries. Today, there are several machine-learning-based transcription offerings, some of which are free if you run them on your device.
We're almost at a point where artificial intelligence can transcribe large audio pieces without making mistakes.
I've used Descript1 (paid) heavily over the past few years and explored other alternatives. Descript transcribes and lets you edit audio and video by editing text. Transcriptions are highly accurate, but even a small error rate results in serious editing if you want transcripts of hours-long conversations to be correct, especially when they use technical keywords and niche words. These tools let you provide sample text and a glossary of words in your audio to improve their accuracy.2
These workflows are still much better (and faster) than transcribing manually. If you don't have time to fix mistakes, you can often get away by adding a disclaimer that "transcripts were automatically generated and may contain errors."
In September 2022, OpenAI trained and open-sourced a neural network called Whisper that, in their own words, "approaches human level robustness and accuracy on English speech recognition." That's a big step. The community can extend Whisper and use it for free.3 (I've been using Whisper and played with it on Live 98.)
These systems can predict word-level timestamps—making it possible to highlight the exact word spoken at a given time—and perform something called speaker diarization, a fancy way to say that the AI knows who's talking when by identifying the active speaker.
Soon enough, transcripts will be commonplace. Transcripts will be free, automatic, and accurate; we'll expect them to be there.
Indeed, Spotify is already transcribing trending podcasts, and YouTube generates captions for every video. I imagine WhatsApp will transcribe voice notes so you can read them when you can't play them.
Transcripts are helpful to listeners to follow content, browse through long pieces, or refer to particular points of a conversation. But they also provide a way for editors to navigate episodes quickly and get an idea of their content, making it easier to edit by removing or moving blocks of audio around to make a conversation more fluid. They also help editors write episode descriptions, notes, and chapters, and machine learning is starting to do these tasks for us automatically—which is exciting.
Soon enough, we'll hit record, delegate all this manual labor to the machine4, and focus on our next piece of content when done.
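As a rough sketch of what delegating part of that labor could look like, assuming you have an OpenAI API key, you could ask a language model to draft notes from a transcript via the chat completions endpoint. The prompt and model name here are just placeholders.
// Hypothetical sketch: ask a language model to draft episode notes from a transcript.
const transcript = 'Transcript text goes here.'

const response = await fetch('https://api.openai.com/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
  },
  body: JSON.stringify({
    model: 'gpt-3.5-turbo',
    messages: [
      { role: 'system', content: 'You write podcast episode notes from transcripts.' },
      { role: 'user', content: `Summarize this transcript and extract keywords:\n${transcript}` },
    ],
  }),
})

const data = await response.json()
console.log(data.choices[0].message.content)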
Descript is a paid service. Their Pro subscription comes with thirty hours of monthly transcription. ↩
Descript has a Glossary of words for this purpose, and Whisper's command-line interface accepts a parameter called --initial_prompt to provide text style and uncommon words. ↩
Whisper is also available as a cloud service that costs a little over half a cent per minute of audio transcribed ($0.006/min). You can browse OpenAI's API Pricing here. ↩
OpenAI's GPT-3.5-turbo and GPT-4—the models behind ChatGPT—can perform these tasks. You may ask them to Summarize a text or Extract keywords and topics from a paragraph. OneAI already offers a service to extract relevant keywords and generate text summaries. ↩
After two months of pause, we're preparing to release new Getting Simple podcast episodes.
Editing and publishing add friction and delays to my process, so I'm exploring code and ML workflows to post-process episodes' audio and generate transcripts, summaries & notes.
I'm not there yet. But OpenAI's Whisper (free) and Descript (paid) already provide accurate transcriptions. Existing projects and companies use #GPT-like language models to extract episode keywords, topics, chapters & summaries.
We'll soon have automatic episode notes.
It's exciting. I think we're getting very, very close.
I've also played with Spotify's pedalboard Python package to post-process audio without relying on a Digital Audio Workstation (DAW).
That's cool because I can create reusable scripts for specific recording conditions and forget about audio editing—say, compressing, limiting, applying noise gates, or normalizing—things you'd otherwise do in Adobe Audition.
Let me know if you'd like to see these automations in the live stream and video tutorials or shared here on Twitter at @nonoesp.
The point of writing as a human is to express ourselves. To pour words on paper (or the screen) and reflect on who we are, to learn, to evolve, and to inspire others. You can influence and inspire your future self as well.
Yesterday, I woke up and started the day writing five hundred words before I did any work. This is a practice I follow and will continue to follow. It doesn't make sense to delegate this to a machine because the whole point is to pour things out of my mind. Maybe this can turn into a conversation with an AI in the long run: I talk, we discuss, and my virtual assistant takes notes and generates a document, instead of me typing at my desk with a keyboard.
Machine intelligence is here to stay, and we'll find it harder to be original as it improves. But we must remember that these systems work because of all the knowledge humans have created before, with our mistakes and biases. And they'll only get better if we continue to produce original content. That may be a mistaken assumption, but I believe it in some way. AI originality is probably down the road, and current systems can hallucinate. But I like to think we'll do better work together with them. We must wait until everything stabilizes to identify which parts won't be done by humans anymore. Or maybe they will, just at a scary-fast pace.
Writing is a medium for creative expression, as are drawing, singing, film, photography, and many, many other forms. Get a pen and write—express yourself. Type with your fingers or thumbs. Shoot a video. Take a photo. Doodle. Tell us a story.
I've installed vnstat on my M1 MacBook Pro with Homebrew to monitor my network usage over time.
# Install vnstat on macOS with Homebrew.
brew install vnstat
Make sure you start the vnstat service with brew so vnstat can monitor your network usage.
brew services start vnstat
vnstat will be running in the background, and you'll have to wait days for it to gather statistics and be able to show you, for instance, the average monthly usage.
› vnstat -m
# gif0: Not enough data available yet.
After a few minutes, you'll see stats on vnstat.
› vnstat -5
# en0 / 5 minute
#
# time rx | tx | total | avg. rate
# ------------------------+-------------+-------------+---------------
# 2023-03-19
# 12:45 839.44 MiB | 2.60 MiB | 842.04 MiB | 23.55 Mbit/s
# 12:50 226.26 MiB | 306.00 KiB | 226.56 MiB | 46.35 Mbit/s
# ------------------------+-------------+-------------+---------------
› vnstat -h
# en0 / hourly
#
# hour rx | tx | total | avg. rate
# ------------------------+-------------+-------------+---------------
# 2023-03-19
# 12:00 1.04 GiB | 2.90 MiB | 1.04 GiB | 28.10 Mbit/s
# ------------------------+-------------+-------------+---------------
› vnstat -m
# en0 / monthly
#
# month rx | tx | total | avg. rate
# ------------------------+-------------+-------------+---------------
# 2023-03 1.04 GiB | 2.90 MiB | 1.04 GiB | 28.10 Mbit/s
# ------------------------+-------------+-------------+---------------
# estimated 3.43 TiB | 9.56 GiB | 3.44 TiB |
Puzzles embody the German word Funktionslust.
Doing for the sake of doing.
Not pursuing an outcome. Just doing.
Here are my highlights from Works Containing Material Generated by Artificial Intelligence.
One such recent development is the use of sophisticated artificial intelligence (“AI”) technologies capable of producing expressive material.[5] These technologies “train” on vast quantities of preexisting human-authored works and use inferences from that training to generate new content. Some systems operate in response to a user's textual instruction, called a “prompt.” [6] The resulting output may be textual, visual, or audio, and is determined by the AI based on its design and the material it has been trained on. These technologies, often described as “generative AI,” raise questions about whether the material they produce is protected by copyright, whether works consisting of both human-authored and AI-generated material may be registered, and what information should be provided to the Office by applicants seeking to register them.
[I]n 2018 the Office received an application for a visual work that the applicant described as “autonomously created by a computer algorithm running on a machine.” [7] The application was denied because, based on the applicant's representations in the application, the examiner found that the work contained no human authorship. After a series of administrative appeals, the Office's Review Board issued a final determination affirming that the work could not be registered because it was made “without any creative contribution from a human actor.”
In February 2023, the Office concluded that a graphic novel [9] comprised of human-authored text combined with images generated by the AI service Midjourney constituted a copyrightable work, but that the individual images themselves could not be protected by copyright.
In the Office's view, it is well-established that copyright can protect only material that is the product of human creativity. Most fundamentally, the term “author,” which is used in both the Constitution and the Copyright Act, excludes non-humans.
[I]n the current edition of the Compendium, the Office states that “to qualify as a work of 'authorship' a work must be created by a human being” and that it “will not register works produced by a machine or mere mechanical process that operates randomly or automatically without any creative input or intervention from a human author.”
Individuals who use AI technology in creating a work may claim copyright protection for their own contributions to that work.
Applicants should not list an AI technology or the company that provided it as an author or co-author simply because they used it when creating their work.
docker run -it -p HOST_PORT:CONTAINER_PORT your-image
When you run services inside of Docker on specific ports, those are internal ports of the virtual container environment. If you want to connect to those services from your machine, you need to explicitly expose ports to the outside world. In short, you need to map TCP ports in the container to ports on the Docker host, which may be your computer. Here's how to do it.
Let's imagine we have a Next.js app running inside our Docker container.
› docker run -it my-app-image
next dev
# ready - started server on 0.0.0.0:3000, url: http://localhost:3000
The site is exposed on port 3000 of the container, but we can't access it from our machine at http://localhost:3000.
Let's map the port.
› docker run -it -p 1234:3000 my-app-image
next dev
# ready - started server on 0.0.0.0:3000, url: http://localhost:3000
Now we can access the app at http://localhost:1234. When we send a request to port 1234, Docker forwards the communication to port 3000 of the container.