Apple Silicon Machine Learning GPU Acceleration with Metal Performance Shaders (MPS) PyTorch's Backend

APRIL 26, 2024

I have an Apple M3 Max 14-inch MacBook Pro with 64 GB of Unified Memory (RAM) and 16 cores (12 performance and 4 efficiency).

It's awesome that PyTorch now supports Apple Silicon's Metal Performance Shaders (MPS) backend for GPU acceleration, which makes local inference and training much, much faster. For instance, each denoising step of Stable Diffusion XL takes ~2s with the MPS backend and ~20s on the CPU.

Resize Segmentation Masks with TensorFlow

SEPTEMBER 30, 2020

When manipulating semantic segmentation datasets, I found myself having to downsize segmentation masks without adding extra colors. If the image is cleanly encoded as a PNG, only the colors representing each of the classes contained in the label map will be present, and no antialias intermediate colors will exist in the image.

When resizing, though, antialias might add artifacts to your images to soften the edges, adding new colors that don't belong to any class in the label map. We can overcome this problem loading (or decoding) input images with TensorFlow as PNG and resizing our images with TensorFlow's NEAREST_NEIGHBOR resizing method.

(You can find a list of all TensorFlow's resize methods here, and an explanation of what each of them does here.)

import tensorflow as tf

# Read image file
img = tf.io.read_file('/path/to/input/image.png')

# Decode as PNG
img = tf.io.decode_png(
  img,
  channels=3,
  dtype=tf.uint8
)

# Resize using nearest neighbor to avoid adding new colors
# For that purpose, antialias is ignored with this resize method
img = tf.image.resize(
  img,
  (128, 128), # (width, height)
  antialias=False, # Ignored when using NEAREST_NEIGHBOR
  method=tf.image.ResizeMethod.NEAREST_NEIGHBOR
)

# Save the resize image back to PNG
tf.keras.preprocessing.image.save_img(
  '/path/to/output/image.png',
  img
)

YOLOv5 Performs Object Detection at 140 fps

SEPTEMBER 11, 2020

"Less than 50 days after the release YOLOv4, YOLOv5 improves accessibility for realtime object detection." Read the Roboflow post.

How to get started with Machine Learning

LAST UPDATED JUNE 24, 2024

Here are resources that are helping me get started with machine learning, and a few that I would have loved to have known about earlier. I'll probably be updating this page with new resources from time to time.

Stanford Cheat Sheets

A summary of terms, algorithms, and equations. (I barely understand the equations.=) These sheets, developed by Afshine and Shervine Amidi, differentiate between artificial intelligence (AI), machine learning (ML), and deep learning (DL) but many concepts overlap with each other. See this Venn diagram.

CS 221 · Artificial Intelligence Cheat Sheet
CS 229 · Machine Learning Cheat Sheet
CS 230 · Deep Learning Cheat Sheet

Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

I highly recommend this book I'm going through at the moment, written by an ex-Googler who worked in YouTube's video-classification algorithm. It's dense but it introduces you to all relevant artificial intelligence, machine learning, and deep learning concepts, and guides you through preparing custom datasets to train algorithms, a bit of data science I guess. At the same time, it introduces you to three of the most-used machine learning frameworks—Sci-Kit Learn, Keras, and TensorFlow being this theone I use on my day-to-day job developing and releasing machine learning models for production. Similar frameworks are Caffe or PyTorch, this one being used by Facebook developers. (Thanks to Keith Alfaro for the recommendation.)

Books

Understanding Machine Learning: From Theory to Algorithms. (Recommended by Viral Thakar.)
Regression and Other Stories. Chapter 9 covers Prediction and Bayesian inference. (Thanks to Saeran Vasanthakumar for the recommendation.)

Open-source code and tutorials

I got started with machine learning by trying open-source algorithms. It's common to visit the GitHub repository corresponding to a paper and give it a try. Two examples are Pix2Pix (2016) and EfficientDet (2020). You try to use their code as is, then try to use a custom dataset for training and see how the model performs for your needs.

TensorFlow re-writes many of these models and makes easy-to-follow tutorials.

Pix2Pix in TensorFlow Core - Made by the Google TensorFlow team, this tutorial offers you to View the code on GitHub, Download the Jupyter Notebook (written in Python) or Run the Notebook in Google Colab (where you can press a button in the cloud and see how each piece of Python code runs to understand the different parts of setting up and training an algorithm. Reading the dataset, peparing the training and validation set, creating the model, training it, and more).
TensorFlow tutorials - This is a good place to get your hands dirty. While machine learning has a strong theoretical component you can leave that aside and start by training and testing models for image classification, object detection, semantic image segmentation, and a lot more tasks.

Friendly user interfaces

Runway - A friend of mine, Cristóbal Valenzuela, is building his own machine learning platform for creatives. It's the place for people who don't know how to code (or don't want to) to be able to use complex machine learning models, training them with custom data and deploying them to the cloud. Here's an interview where he told me about the beginnings of Runway.
Machine Learning for Designers Talk - I gave a talk about these types of interfaces, a few projects, and the role they play for designers and people who don't know how to code.

Courses

Stanford CS231n: Convolutional Neural Networks. A YouTube playlist with the entire course. Helps understand the basic operations in modern convolutional neural networks (or ConvNets). How they work; what they are useful for; and how to work with them.
Deep Learning with PyTorch - This course, by Yann LeCun and Alfredo Canziani at NYU, concerns the latest techniques in deep learning and representation learning. (Recommended by @arafaelpastrana.)
Practical Deep Learning for Coders - Haven't followed this course yet. But Adam Menges highly recommended it to me to get started. (Alex Huang pointed out there's also a print book that follows along with the course: Deep Learning for Coders with fastai and PyTorch: AI Applications Without a PhD.)

Tutorials & live streams

Machine Intelligence YouTube playlist. Here is a compilation of some of the machine-intelligence-related video tutorials I've recorded.
Live Streams YouTube playlist. Weekly hands-on coding sessions on creative coding and machine intelligence for designers. From conceptual overviews to hands-on neural network architecture, automation, training, or cloud deployment.

Other resources

TensorFlow: Tensor and Image Basics - A video with basic tensor and image operations in TensorFlow. How to use tensors to encode images and matrices and visualize them.
TensorFlow: Visualizing Convolutions - A video to visualize the filters of an image convolution, an operation known for its ability to extract image features in an unsupervised way to perform classification tasks used in convolutional neural networks.
Awesome Machine Learning - A big and frequently-updated list of machine learning resources.
Suggestive Drawing - This is my Harvard's masters thesis, in which I explore how the collaboration between human and artificial intelligences can enhance the design process.

Found this post useful?

Share it with your friends. (Short URL is nono.ma/ml.)
Let me know of resources I should add to this page.

Digital Futures 2020: A Virtual, Non-Technical Workshop on Machine Intelligence in Architecture

JUNE 27, 2020

Well, this is happening today.

Even though our workshop is listed in the North-South Americas Workshops page, I'm tuning in from Málaga, Spain, where I live and work, remotely.

Jose Luis, Nate, and guest speakers will be joining from the US. Those include Elizabeth Christoforetti & Romy El Sayah, Ao Li, Runjia Tian, Xiaoshi Wang & Yueheng Lu, and Andrew Witt.

The format of our workshop has been widely adopted by numerous organizations as an alternative to the cancelation of on-site conferences, workshops, and other gatherings.

Zoom conference rooms miss many of the nuances present in in-person events, yet I feel they enable a new kind of interaction in which people who wouldn't have been able to cross the Atlantic are now a click away from hopping into a live conference with us. (No need to book plane tickets and accommodation, and seats don't necessarily need to be limited.)

As suggested by Jose Luis, ours are a series of non-technical lectures and demos. We've organized a one-day workshop in which we'll share our views on the role of machine intelligence in architecture, art, and design, commenting on state-of-the-art projects, tools, and machine learning models that are here to stay with us.

While preparing this workshop, I recorded two technical, hands-on coding tutorials as I was building the Pix2Pix & RunwayML drawing app we'll showcase today, using Glitch, Paper.js, RunwayML, and Pix2Pix, among other technologies. (I've published Part 1 and Part 2 so far.)

Visit our workshop page to see the most up-to-date schedule.

I hope you'll join us.

Stay in touch for future events.

Machine Learning (and TensorFlow) Glossary

JUNE 8, 2020

Just came across this machine learning (and TensorFlow) glossary which "defines general machine learning terms, plus terms specific to TensorFlow."

Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition

MAY 14, 2020

I recently got Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition by Aurélien Géron as a recomendation from Keith.

This second version updates all code samples to work with TensorFlow 2, and the repository that accompanies the book—ageron/handson-ml2—is also updated frequently to catch up with the latest updates.

Just the Python notebooks on that GitHub repository are super helpful to get an overall on state-of-the-art machine learning and deep learning techniques, from the basics of machine learning and classic techniques like classification, support vector machines, or decision trees to the latest techniques to code neural networks, customizing and trained them, loading and pre-processing data, natural language processing, computer vision, autoencoders and gans, or reinforcement learning.

#Graph2Plan

MAY 13, 2020

Nice work from Shenzhen, Carleton, and Simon Fraser Universities, titled Graph2Plan: Learning Floorplan Generation from Layout Graphs, along the lines of #HouseGAN. Via @alfarok.

Our deep neural network Graph2Plan is a learning framework for automated floorplan generation from layout graphs. The trained network can generate floorplans based on an input building boundary only (a-b), like in previous works. In addition, we allow users to add a variety of constraints such as room counts (c), room connectivity (d), and other layout graph edits. Multiple generated floorplans which fulfill the input constraints are shown.

Read the paper on Arxiv.

In-Domain GAN Inversion for Real Image Editing

MAY 10, 2020

We propose In-Domain GAN inversion (IDInvert) by first training a novel domain-guided encoder which is able to produce in-domain latent code, and then performing domain-regularized optimization which involves the encoder as a regularizer to land the code inside the latent space when being finetuned. The in-domain codes produced by IDInvert enable high-quality real image editing with fixed GAN models.

RunwayML Launches Hosted Models (and a JavaScript SDK)

MAY 5, 2020

Connect directly to RunwayML models with only a few lines of code to build web apps, chatbots, plugins, and more. Hosted Models live on the web and can be used anytime, anywhere, without requiring RunwayML to be open!

[…]

We've also released a JavaScript SDK alongside the new Hosted Models feature. Use it to bring a Hosted Model to your next project in just 3 lines of code.

SketchRNN Flowchart

APRIL 19, 2020

David Ha trained SketchRNN with a flowchart dataset. You can test his live demo (mobile friendly) and his multi-prediction demo (not mobile-friendly).

The source code is available on GitHub.

#HouseGAN

APRIL 1, 2020

Meet #HouseGAN: A relational generative adversarial network for graph-constrained house layout generation by Nelson Nauata & Chin-Yi Cheng at Autodesk Research in 2019.

Generate house layouts from architectural constraint graphs.

📝Paper → https://arxiv.org/abs/2003.06988
🧠Code → Coming soon!

This paper proposes a novel graph-constrained generative adversarial network, whose generator and discriminator are built upon relational architecture. The main idea is to encode the constraint into the graph structure of its relational networks. We have demonstrated the proposed architecture for a new house layout generation problem, whose task is to take an architectural constraint as a graph (i.e., the number and types of rooms with their spatial adjacency) and produce a set of axis-aligned bounding boxes of rooms. We measure the quality of generated house layouts with the three metrics: the realism, the diversity, and the compatibility with the input graph constraint. Our qualitative and quantitative evaluations over 117,000 real floor plan images demonstrate that the proposed approach outperforms existing methods and baselines. We will publicly share all our code and data.

Machine Learning for Designers at Fresh Eyes Berlin

OCTOBER 15, 2019

A talk about machine learning, design, and creativity, at University of Arts Berlin on September 21, 2019, during the Fresh Eyes workshop at the Design Modelling Symposium conference.

Lobe: Deep Learning Made Simple

SEPTEMBER 26, 2018

Lobe is a web-based visual programming language to create and deploy machine learning models, founded in 2015 by Mike Matas, Adam Menges, and Markus Beissinger "to make deep learning accessible to everyone," recently acquired by Microsoft.

Lobe is an easy-to-use visual tool that lets you build custom deep learning models, quickly train them, and ship them directly in your app without writing code.

I saw a live demo at SmartGeometry earlier this year and I can't wait to play with it once its deployed on Microsoft's servers.

You can see a few examples at Lobe.ai. (They're looking for people to join their team.)

Watch this video to see examples of things people have built using Lobe and how to build your own custom deep learning models.

Mind Ex Machina Projects at SmartGeometry 2018

JUNE 27, 2018

Last month, Jose Luis García del Castillo y López (@garciadelcast) and myself (@nonoesp) had the opportunity to lead the Mind Ex Machina cluster at SmartGeometry¹ 2018. (Watch on YouTube.)

This talk summarizes the projects that came out of our workshop, which intended to explore the possibilities of robot-human-ai interactions with the use of machine learning libraries and the Machina² robotic control framework.

The SmartGeometry workshops and conferences were hosted in May 7–12, 2018, at the John H. Daniels Faculty of Architecture, Landscape, and Design at University of Toronto, Canada. The Mind Ex Machina cluster worked most of the time at the Autodesk Toronto Technology Office, located in the MaRS Discovery District.

I'm extremely thankful to Marc Webb for the following video, which provides a bit more insight on the things we worked on. (Watch on Vimeo.)

Shout-out to the impressive work of other clusters such as Fresh Eyes and Data Mining the City. See all of the videos here.

I think the whole group had a blast working on these projects—thanks! You can find notes and source code of the projects on GitHub (especially, in this repository).

🧠x🤖

SmartGeometry is a bi-annual workshop and conference, this year entitled sg2018: Machine Minds, at the University of Toronto, Canada, from 7th-12th May 2018. The sg2018 workshop and conference is a gathering of the global community of innovators and pioneers in the fields of architecture, design and engineering. ↩
Machina is an open-source project for action-based real-time control of mechanical actuators or, in more human terms, it allows you to talk to a robot and tell it what to do. Machina is developed and maintained by Jose Luis García del Castillo y López. ↩

It's Not Machine Learning, It's an Artist Sketching by Hand

MARCH 20, 2018

Hey there! Until April 30, 2018, you can Participate to get a free portrait of a photo of your choosing. Just sign up with your e-mail and—optionally—upload the picture you would like to get a portrait of (or you can also do this later if you happen to be the winner). Enter at lourdes.ac/contest.

This looping series of photo2portrait pairs reminds me of the training sets you would need to feed certain machine learning algorithms (such as pix2pix) for them to learn how to generate an output image from an input image. If we were to train a neural network with photo2portrait pairs, the algorithm would try to learn how to generate a pencil portrait from a picture. (A training set for this purpose would, probably, need hundreds of photo2portrait pairs though.)

Suggestive Drawing, flowers with pix2pix.

I generated multiple training sets for pix2pix for Suggestive Drawing, my thesis project at Harvard GSD. Most of the models I trained tried to generate a texture for a hand drawing. Flowers — in special daisies, sunflowers, roses, and tulips — happened to work particularly well.

It's a hand-sketched portrait.

Even though machine intelligence is enabling new modes of generating artwork, but the portraits you see here are (still) sketched on pencil by an artist—my mom—on paper. (As a curiosity, here is a looping gif of some of her many portraits of myself.)

Nono Martínez Alonso (Nono.ma) portrait by Lourdes Alonso Carrión (Lourdes.ac).

We are currently accepting registrations for a raffle; You can win a free portrait! Just Enter the competition and you'll be able to select a photo later (of yourself, a relative, a friend, or even your pet). You can See all of her portraits on my mom's website.

Thanks so much and good luck!

Visualize an Object Classification Network with Crow for Grasshopper

FEBRUARY 19, 2018

Want to see older publications? Visit the archive.

Listen to Getting Simple .