Nono.MA

MARCH 13, 2024

I was sad to see a redirect from Lobe.ai to Lobe's GitHub repository.

Thank you for the support of Lobe! The team has loved seeing what the community built with our application and appreciated your interest and feedback. We wanted to share with you that the Lobe desktop application is no longer under development.

The Lobe team open-sourced much of their tooling for using Lobe-trained models on the web and with Python, .NET, and other platforms. Yet the Lobe app and website were never open-sourced, which means they won't be usable once they're shut down.
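
As a rough sketch of what that Python tooling looks like, here's how you might run a Lobe-exported image model with the open-source lobe-python package. The model path and image file are placeholders, and the method names follow my reading of the project's README.

# pip install lobe[tf]
from lobe import ImageModel

# Load a model exported from the Lobe desktop app (TensorFlow SavedModel).
model = ImageModel.load("path/to/exported/model")

# Classify a local image and inspect the results.
result = model.predict_from_file("photo.jpg")
print(result.prediction)  # top label
for label, confidence in result.labels:
    print(f"{label}: {confidence:.2%}")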

Before it's gone, you can access Lobe's site and download the latest app at aka.ms/DownloadLobe.

Lobe takes a new humane approach to machine learning by putting your images in the foreground and receding to the background, serving as the main bridge between your ideas and your machine learning model.

Lobe also simplifies the process of machine learning into three easy steps. Collect and label your images. Train and understand your results. Then play with your model and improve it.

I'd invite you to listen to my conversation with Adam Menges on the origins of Lobe.

FEBRUARY 26, 2024


How to run Google Gemma 2B- and 7B-parameter instruct models locally on the CPU and the GPU on Apple Silicon Macs.


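For those who prefer to read, here's a minimal sketch of one way to do this with Hugging Face Transformers: load the gemma-2b-it checkpoint in half precision and generate on the MPS backend when it's available. The prompt and generation settings are arbitrary, and tools like llama.cpp or MLX are equally valid routes.

# pip install torch transformers accelerate
# Requires accepting the Gemma license on the Hugging Face Hub and logging in with huggingface-cli.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "mps" if torch.backends.mps.is_available() else "cpu"  # Apple Silicon GPU, else CPU

model_id = "google/gemma-2b-it"  # swap for "google/gemma-7b-it" if you have the memory
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to(device)

# Gemma's instruct models expect the chat template that ships with the tokenizer.
messages = [{"role": "user", "content": "Write a haiku about Apple Silicon."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))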

FEBRUARY 22, 2024

Performance Max campaigns serve across all of Google’s ad inventory, unlocking more opportunities for you to connect with customers.

[…]

[Google announced] several new features to help you scale and build high-quality assets — including bringing Gemini models into Performance Max.

[…]

Better Ad Strength and more ways to help you create engaging assets.

[A]dvertisers that use asset generation when creating a Performance Max campaign are 63% more likely to publish a campaign with Good or Excellent Ad Strength.

JANUARY 30, 2023

The Google Research team has published a paper introducing MusicLM, a machine learning model that generates high-fidelity music from text prompts, and it works extremely well. But they won't release it to the public, at least not yet.

You can browse and play through the examples to listen to results obtained by the research team for a wide variety of text-to-music tasks, including audio generation from rich captions, long generation, story mode, text and melody conditioning, painting caption conditioning, 10s audio generation from text, and generation diversity.

I'm particularly surprised by the text and melody conditioning examples, where a text prompt, say, "piano solo," "string quartet," or "tribal drums," can be combined with a melody prompt, say "bella ciao - humming," to generate accurate results.

Even though they haven't released the model, Google Research has publicly released MusicCaps to support future research, "a dataset composed of 5.5k music-text pairs, with rich text descriptions provided by human experts."
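
If you want to explore the dataset, here's a minimal sketch using the Hugging Face datasets library. It assumes the google/MusicCaps mirror on the Hub and its column names (ytid, caption), and note that it ships captions and YouTube IDs rather than the audio clips themselves.

# pip install datasets
from datasets import load_dataset

# MusicCaps: ~5.5k ten-second clips, each paired with an expert-written caption.
dataset = load_dataset("google/MusicCaps", split="train")

example = dataset[0]
print(example["ytid"])     # YouTube ID of the source clip
print(example["caption"])  # rich text description written by a musician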

DECEMBER 16, 2022

According to OpenAI, "embeddings are numerical representations of concepts converted to number sequences, which make it easy for computers to understand the relationships between those concepts."

They introduced a new text and code embeddings API endpoint on January 25, 2022, capable of measuring the relatedness of text strings.

Here's a list of common uses of text embeddings, as listed in OpenAI's documentation.

  • Search (where results are ranked by relevance to a query string)
  • Clustering (where text strings are grouped by similarity)
  • Recommendations (where items with related text strings are recommended)
  • Anomaly detection (where outliers with little relatedness are identified)
  • Diversity measurement (where similarity distributions are analyzed)
  • Classification (where text strings are classified by their most similar label)

I look forward to testing this API on my writing to see how well it recommends, classifies, and clusters my mini-essays.
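
Here's a minimal sketch of that kind of test: embed a handful of short texts and rank them against a query with cosine similarity. The snippet uses today's openai Python SDK and the text-embedding-ada-002 model rather than the original endpoint names, and the sample texts are made up.

# pip install openai numpy  (expects OPENAI_API_KEY in the environment)
import numpy as np
from openai import OpenAI

client = OpenAI()

texts = [
    "Sketching ideas with pen and paper before coding.",
    "Training an image classifier on a laptop.",
    "A recipe for slow-cooked tomato sauce.",
]
query = "machine learning experiments"

def embed(strings):
    """Return one embedding vector per input string."""
    response = client.embeddings.create(model="text-embedding-ada-002", input=strings)
    return np.array([item.embedding for item in response.data])

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

doc_vectors = embed(texts)
query_vector = embed([query])[0]

# Rank the texts by relatedness to the query.
for score, text in sorted(
    ((cosine(query_vector, v), t) for v, t in zip(doc_vectors, texts)), reverse=True
):
    print(f"{score:.3f}  {text}")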

SEPTEMBER 22, 2022

OpenAI has open-sourced Whisper, an automatic speech recognition and transcription system.

"We’ve trained and are open-sourcing a neural net called Whisper that approaches human level robustness and accuracy on English speech recognition."

JULY 23, 2022

Two months ago, Hugging Face open-sourced "state-of-the-art diffusion models for image and audio generation in PyTorch" at github.com/huggingface/diffusers.

"Diffusers provides pretrained diffusion models across multiple modalities, such as vision and audio, and serves as a modular toolbox for inference and training of diffusion models."

Here's a text-to-image example from the repository's README.

# !pip install diffusers transformers
from diffusers import DiffusionPipeline

model_id = "CompVis/ldm-text2im-large-256"

# load model and scheduler
ldm = DiffusionPipeline.from_pretrained(model_id)

# run pipeline in inference (sample random noise and denoise)
prompt = "A painting of a squirrel eating a burger"
images = ldm([prompt], num_inference_steps=50, eta=0.3, guidance_scale=6)["sample"]

# save images
for idx, image in enumerate(images):
    image.save(f"squirrel-{idx}.png")

Latent diffusion is one such architecture; diffusion models also power systems like Google's Imagen and OpenAI's DALL·E 2, which generate images from text and increase the resolution of the output images.

JULY 11, 2022


Here's a video in which I test if OpenAI's DALL-E can generate usable texture maps from an uploaded image.

This texture ships with one of Apple's example projects, and the idea of generating textures with DALL-E came from Adam Watters on Discord.

JULY 4, 2022


OpenAI's DALL-E 2 creates variations of my hand sketches.



JULY 3, 2022

I continue to play with DALL-E 2 from time to time. I've posted a few videos and live streams on the topic and plan to share more clips with tiny bits from my experiments and some of my favorite results so far. Tomorrow, a video sharing how DALL-E can copy my hand drawings will come out on YouTube.

JUNE 25, 2022


Here are my impressions of OpenAI's latest iteration of DALL·E, an AI system that generates images from text. I've generated images in different styles and variations of my drawings, experimented with public pages, mask edits, uploads, and more.


