This site with machine learning challenges (deep-ml.com) looks really promising for learning foundational concepts.
I export videos from Descript with embedded subtitles, but Descript doesn't have a way to export subtitles by chapter markers; it only exports them for an entire composition.
Here's a command that extracts the embedded subtitles from a given video. It works with any format supported by FFmpeg, such as MP4, MOV, or MKV.
ffmpeg -i video.mp4 -map 0:s:0 subtitles.srt
Here's what each part of the command does.
- -i video.mp4 is the input file.
- -map 0:s:0 maps the first subtitle track found in the video. (You can change the last digit to extract a different track, e.g., 0:s:1 for the second subtitle track.)
- subtitles.srt is the output file name and format, e.g., SRT or VTT.
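For example, to extract the second subtitle track and save it as WebVTT instead, you could run the following.
ffmpeg -i video.mp4 -map 0:s:1 subtitles.vtt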
If you found this useful, let me know!
First, you need to install Rust on your machine, which comes with Cargo. (Installing Rust with rustup also installs Cargo.)
cargo new my_package
cd my_package
cargo build
cargo run
Edit Cargo.toml
and add your package dependencies.
// Cargo.toml
[dependencies]
base64 = "0.21.7"
You can browse Cargo packages at crates.io.
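As a minimal sketch of using the dependency above (assuming base64 0.21 and the default layout created by cargo new), you could edit src/main.rs as follows and run it with cargo run.
// src/main.rs
use base64::{engine::general_purpose, Engine as _};

fn main() {
    // Encode a string as Base64 with the standard alphabet.
    let encoded = general_purpose::STANDARD.encode(b"Hello, Cargo!");
    println!("{}", encoded);
    // SGVsbG8sIENhcmdvIQ==
}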
See how to compile and run Rust programs without Cargo.
Here are two simple Rust programs and how to compile and run them on macOS with rustc.
// first.rs
fn main() {
    println!("Hello, World!");
}
You can compile this program with rustc.
rustc first.rs
Then run it.
./first
# Hello, World!
I generated this program with ChatGPT and then modified it.
// count.rs
use std::fs::File;
use std::io::{self, BufRead};
use std::path::Path;
use std::thread;
fn count_words_in_file(file_path: &Path) -> io::Result<usize> {
    let file = File::open(file_path)?;
    let reader = io::BufReader::new(file);
    let mut count = 0;
    for line in reader.lines() {
        let line = line?;
        count += line.split_whitespace().count();
    }
    Ok(count)
}

fn main() {
    let file_paths = vec!["file1.txt", "file2.txt", "file3.txt"]; // Replace with actual file paths
    let mut handles = vec![];

    for path in file_paths {
        let path = Path::new(path);
        let handle = thread::spawn(move || {
            match count_words_in_file(path) {
                Ok(count) => println!("{} has {} words.", path.display(), count),
                Err(e) => eprintln!("Error processing file {}: {}", path.display(), e),
            }
        });
        handles.push(handle);
    }

    for handle in handles {
        handle.join().unwrap();
    }
}
We then compile and run as before, assuming we have three text files (file1.txt, file2.txt, and file3.txt) in the same directory as our program.
rustc count.rs
./count
# file2.txt has 45 words.
# file3.txt has 1324 words.
# file1.txt has 93980 words.
I got a Permission denied error when trying to cargo install.
› cargo install cargo-wasm
Updating crates.io index
Downloaded cargo-wasm v0.4.1
error: failed to download replaced source registry `crates-io`
Caused by:
failed to create directory `/Users/nono/.cargo/registry/cache/index.crates.io-6f17d22bba15001f`
Caused by:
Permission denied (os error 13)
This is how I fixed it.
sudo chown -R $(whoami) /Users/nono/.cargo
I then tried to cargo wasm setup and got this error.
› cargo wasm setup
info: syncing channel updates for 'nightly-aarch64-apple-darwin'
error: could not create temp file /Users/nono/.rustup/tmp/628escjb9fzjn4mu_file: Permission denied (os error 13)
Failed to run rustup. Exit code was: 1
Again, this was solved by changing the owner of ~/.rustup to myself.
sudo chown -R $(whoami) /Users/nono/.rustup
Say we have a TypeScript interface with required and optional values.
interface NonosOptions {
  thickness: number,
  pressure?: number
}
thickness is required, and pressure is optional.
If we create an object of type NonosOptions, we can omit pressure but not thickness.
const options: NonosOptions = {
  thickness: 1.5
}
We can now destructure our options with a default pressure value, which will only be used if options doesn't define one.
const { thickness = 2, pressure = 0.75 } = options
// thickness = 1.5
// pressure = 0.75
As you can see, thickness ignores the fallback value of 2 because options sets it to 1.5. But pressure is set to 0.75 because options doesn't define a pressure value.
If pressure is defined in options, both the thickness and pressure fallback values are ignored.
const options: NonosOptions = {
  thickness: 1.5,
  pressure: 0.25
}
const { thickness = 2, pressure = 0.75 } = options
// thickness = 1.5
// pressure = 0.25
If you're trying to run a Bash script and get a Permission denied error, it's probably because you don't have the rights to execute it.
Let's check that's true.
# Get the current file permissions.
stat -f %A script.sh
# 644
With 644, the user (owner) can read and write the file but not execute it. Set the permissions to 755 to fix the issue.
chmod 755 script.sh
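Equivalently, you can add the execute bit with chmod +x and verify the change afterward.
# Add execute permission.
chmod +x script.sh
# Check the permissions again.
stat -f %A script.sh
# 755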
Vite warns about chunks larger than 500 kB after minification, but you can increase that limit. Remember, this is just a warning, not an error.
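If I recall correctly, Vite exposes this as build.chunkSizeWarningLimit, expressed in kB. Here's a minimal sketch of raising it in vite.config.js, assuming an otherwise standard Vite setup.
// vite.config.js
import { defineConfig } from 'vite'

export default defineConfig({
  build: {
    // Warn only for chunks larger than 1,000 kB after minification.
    chunkSizeWarningLimit: 1000,
  },
})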
An alternative is to split your JavaScript bundle into separate chunks, known as chunking. You can do this with the vite-plugin-chunk-split package.
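If you'd rather not add a plugin, Vite's underlying Rollup options can also split chunks manually. Here's a sketch, assuming react and react-dom are dependencies you want in their own chunk.
// vite.config.js
import { defineConfig } from 'vite'

export default defineConfig({
  build: {
    rollupOptions: {
      output: {
        // Group these modules into a chunk named 'react'.
        manualChunks: {
          react: ['react', 'react-dom'],
        },
      },
    },
  },
})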
I've been doing a lot of React and TypeScript work lately. It had been a few years since I worked with them on a daily basis, and things have improved a lot. It's a breeze to build web apps with some of these technologies, and one of the newest additions that works well is Vite.
Is anyone else working with React these days? I will cover some of my learnings on YouTube and want to get a sense of interest. (Let me know on Discord.)
What's cool is that frameworks such as ONNX and TensorFlow have wide support for exporting and running models in the browser (web workers, WebGPU, WebAssembly), so you don't even need to build microservices for certain models. (Plus, there's now support for running Node.js in the browser as well!)
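As a rough sketch of what running a model in the browser can look like with the onnxruntime-web package (the model file, input name, and shape below are placeholders, not from a real project):
// A hypothetical model.onnx with a single input named 'input' of shape [1, 4].
import * as ort from 'onnxruntime-web';

const session = await ort.InferenceSession.create('./model.onnx');
const input = new ort.Tensor('float32', Float32Array.from([1, 2, 3, 4]), [1, 4]);
// The feed keys must match the model's input names.
const results = await session.run({ input });
console.log(results);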
I just started a new daily file with my &ndaily Typinator text expansion.
This expansion archives my current daily file, an action I run whenever a daily file goes over seven thousand words.
It then creates a new file named —01_Daily_Part_94.md for Daily 94.
A few weeks ago, I paid for a Typinator 9 upgrade. The app is more modern, has light and dark modes, and promises long-term support. I'm glad they did that.
I'm a heavy user of Typinator and, someday, I'll create a list of all the things I use on a daily basis.
One of my most-used expansions—they've added usage stats (!)—is dtt, which would expand, today, to 230531.
No matter how small a piece of software is, it requires maintenance.
Except when it doesn't, which is true for certain programs without external dependencies or deprecated features.
The more code bases you rely on or develop, the more tiny efforts you'll have to put here and there to keep them running, especially if you want to keep the operating system up to date.
Here are three ways to define a React component in TypeScript which, in the end, are three ways to define a function in TypeScript—React components are JavaScript functions.
const MyComponent = ({ text }: { text: string }) => <>{text}</>

const MyComponent = ({ text }: { text: string }) => {
  return (
    <>{text}</>
  )
}

function MyComponent({ text }: { text: string }) {
  return <>{text}</>
}
import * as React from 'react'
import * as Server from 'react-dom/server'
let Greet = () => <h1>Hello, Nono!</h1>
console.log(Server.renderToString(<div><Greet /></div>))
// <div><h1>Hello, Nono!</h1></div>
The tricky part is running this code.
You first need to build it, say, with esbuild, then execute it.
# Build with esbuild.
esbuild RenderToString.jsx --bundle --outfile=RenderToString.js
# Run with Node.js.
node RenderToString.js
# <div><h1>Hello, Nono!</h1></div>
Here's how to connect and communicate with WebSocket servers from browser client applications using the WebSocket API and the WebSocket protocol.
// Create a WebSocket client in the browser.
const ws = new WebSocket("ws://localhost:1234");

// Log incoming messages to the console.
ws.onmessage = function (event) {
  // This runs when a message is received.
  console.log(event.data);
};

ws.onopen = () => {
  // This runs when we connect.
  // Submit a message to the server.
  ws.send(`Hello, WebSocket! Sent from a browser client.`);
};
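To test this client locally, here's a minimal sketch of a matching server on port 1234 using the ws npm package (an assumption on my part; any WebSocket server would do).
// server.js: run with `node server.js` after `npm install ws`.
const { WebSocketServer } = require('ws');

const wss = new WebSocketServer({ port: 1234 });

wss.on('connection', (socket) => {
  // Log messages received from the browser client.
  socket.on('message', (data) => console.log(`Received: ${data}`));
  // Greet the client on connection.
  socket.send('Hello from the server!');
});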
Note that if you restart your Droplet, you may have to manually restart services that were running in the background.
# Restart the Droplet now
shutdown -r now
In my case, if Nginx doesn't come back up automatically after the restart, I need to run the following commands.
sudo fuser -k 80/tcp && sudo fuser -k 443/tcp
sudo service nginx restart
Here's a one-liner to turn any website into dark mode.
body, img { filter: invert(0.92) }
I apply this to selected sites using Stylebot, a Chrome extension that lets you apply custom CSS to specific websites.
In a nutshell, the CSS inverts the entire website and then inverts images again to render them normally.
You can adjust the invert filter's amount parameter, which in the example is set to 0.92.
A value of 0 would apply no color inversion at all, and 1 (i.e., 100%) would be a full inversion; whites turn black, and blacks turn white.
I often prefer to stay within 90–95% to reduce the contrast.
I've installed vnstat on my M1 MacBook Pro with Homebrew to monitor my network usage over time.
# Install vnstat on macOS with Homebrew.
brew install vnstat
Make sure you start the vnstat service with brew services for vnstat to monitor your network usage.
brew services start vnstat
vnstat will run in the background, and you'll have to wait days for it to gather enough statistics to show you, for instance, the average monthly usage.
› vnstat -m
# gif0: Not enough data available yet.
After a few minutes, you'll see stats on vnstat.
› vnstat -5
# en0 / 5 minute
#
# time rx | tx | total | avg. rate
# ------------------------+-------------+-------------+---------------
# 2023-03-19
# 12:45 839.44 MiB | 2.60 MiB | 842.04 MiB | 23.55 Mbit/s
# 12:50 226.26 MiB | 306.00 KiB | 226.56 MiB | 46.35 Mbit/s
# ------------------------+-------------+-------------+---------------
› vnstat -h
# en0 / hourly
#
# hour rx | tx | total | avg. rate
# ------------------------+-------------+-------------+---------------
# 2023-03-19
# 12:00 1.04 GiB | 2.90 MiB | 1.04 GiB | 28.10 Mbit/s
# ------------------------+-------------+-------------+---------------
› vnstat -m
# en0 / monthly
#
# month rx | tx | total | avg. rate
# ------------------------+-------------+-------------+---------------
# 2023-03 1.04 GiB | 2.90 MiB | 1.04 GiB | 28.10 Mbit/s
# ------------------------+-------------+-------------+---------------
# estimated 3.43 TiB | 9.56 GiB | 3.44 TiB |
docker run -it -p HOST_PORT:CONTAINER_PORT your-image
When you run services on specific ports inside of Docker, those are internal ports of the container environment. If you want to connect to those services from your machine, you need to explicitly expose the ports to the outside world. In short, you need to map TCP ports in the container to ports on the Docker host, which may be your computer. Here's how to do it.
Let's imagine we have a Next.js app running inside our Docker container.
› docker run -it my-app-image
next dev
# ready - started server on 0.0.0.0:3000, url: http://localhost:3000
The site is exposed on port 3000 of the container, but we can't access it from our machine at http://localhost:3000.
Let's map the port.
› docker run -it -p 1234:3000 my-app-image
next dev
# ready - started server on 0.0.0.0:3000, url: http://localhost:3000
We can now access the site at http://localhost:1234. When we hit port 1234 on our machine, Docker forwards the communication to port 3000 of the container.
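You can confirm the mapping with docker ps, which lists the port forwards of running containers. (The output below is illustrative: columns are abbreviated and the container ID is made up.)
› docker ps
# CONTAINER ID   IMAGE          PORTS                    NAMES
# 1a2b3c4d5e6f   my-app-image   0.0.0.0:1234->3000/tcp   my-app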
You can upload Shorts to YouTube with the YouTube API as you would upload any other video. Simply ensure your video has an aspect ratio of 9:16 and is less than 60 seconds. YouTube will automatically set it as a Short.
Follow this guide to see how to upload videos to YouTube with the YouTube API.
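As a rough sketch of that flow (not taken from the guide), here's how a Shorts upload might look with the google-api-python-client and google-auth-oauthlib libraries; the client secret file, video file, and metadata below are placeholders you'd replace with your own.
# upload_short.py: a minimal sketch; file names and metadata are placeholders.
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload

# Authorize with an OAuth client secret created in the Google Cloud Console.
flow = InstalledAppFlow.from_client_secrets_file(
    'client_secret.json',
    scopes=['https://www.googleapis.com/auth/youtube.upload']
)
credentials = flow.run_local_server()

youtube = build('youtube', 'v3', credentials=credentials)

# Upload a vertical (9:16) video under 60 seconds; YouTube treats it as a Short.
request = youtube.videos().insert(
    part='snippet,status',
    body={
        'snippet': {'title': 'My Short', 'description': 'Uploaded with the YouTube API.'},
        'status': {'privacyStatus': 'public'},
    },
    media_body=MediaFileUpload('short.mp4', resumable=True),
)
response = request.execute()
print(response['id'])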
Here's how to define simple async functions in TypeScript.
(async (/*arguments*/) => {/*function logic*/})(/*values*/);
// Define an asynchronous function.
const helloAsync = async() => { console.log("Hey, Async!"); }
// Call it asynchronously.
helloAsync();
(async(text: string) => { console.log(text); })("Hello, Async!");
(async(text: string) => { setTimeout(() => console.log(text), 2000); })("Hello, Async!");
// Say we have an async talk() function that logs text to the console.
const talk = async(text: string) => { console.log(text); }
// And a sleep() function that uses a Promise to wait for milliseconds.
const sleep = (ms: number) => {
  return new Promise(resolve => setTimeout(resolve, ms));
};
// We can wrap calls to async functions in an async function.
// Then `await` to execute them synchronously.
(async () => {
  await talk(`Hello!`);
  await sleep(1000);
  await talk(`What's up?`);
  await sleep(2000);
  await talk(`Bye now!`);
})();
Here's how to list the commits that happened between two tags.
git log --pretty=oneline 0.8.0...0.9.0
The two tags—in this case, 0.8.0 and 0.9.0—need to exist.
You can list existing tags in a repository as below.
git tag
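Relatedly, you can list the commits made since a given tag by comparing it with HEAD.
# List commits between tag 0.9.0 and the current HEAD.
git log --pretty=oneline 0.9.0..HEAD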
You can list the packages installed globally on your system with npm -g list—shorthand for npm --global list—whereas you'd list the packages installed in an NPM project with npm list.
Let's see an example of what the command might return.
npm -g list
# /opt/homebrew/lib
# ├── cross-env@7.0.3
# ├── http-server@14.1.1
# ├── node-gyp@9.3.1
# ├── npm@9.5.0
# ├── pm2@5.2.2
# ├── spoof@2.0.4
# ├── ts-node@10.9.1
# └── typescript@4.9.5
Here are some of the commands we used during the Creative Machine Learning Live 97.
First, create an Anaconda environment or install imaginAIry in your Python installation with pip.
pip install imaginairy
Before running the commands below, I entered an interactive imaginAIry shell.
aimg
🤖🧠> # Commands here
# Upscale an image 4x with Real-ESRGAN.
upscale image.jpg
# Generate an image and animate the diffusion process.
imagine "a sunflower" --gif
# Generate an image and create a GIF comparing it with the original.
imagine "a sunflower" --compare-gif
# Schedule argument values.
edit input.jpg \
--prompt "a sunflower" \
--steps 21 \
--arg-schedule "prompt_strength[6:8:0.5]" \
--compilation-anim gif
Here's how to add NuGet packages from a local source to your Visual Studio project.
- Create a local folder to store your NuGet packages (I named mine local-nugets).
- Go to Tools > Options > NuGet Package Manager > Package Sources.
- Click the Add button (the green cross) to create a new Package Source.
- Click the browse button (...) to browse and select the folder you previously created -- local-nugets in my case -- and then click on Update.
- Copy your NuGet package into the local-nugets folder, and everything left is to install the package as follows.
- Go to Project > Manage NuGet Packages > Browse.
- Select your package and click Install.
Here's how to randomize a list of strings in bash.
On macOS, you can use Terminal or iTerm2.
The shuf command shuffles a list that is "piped" to it. (shuf is part of GNU coreutils; if it's not available on macOS, you can install coreutils with Homebrew.)
An easy way to do that is to list a directory's contents with ls and then shuffle them.
ls ~/Desktop | shuf
The easiest way to shuffle a set of strings is to define an array in bash and shuffle it with shuf.
WORDS=('Milk' 'Bread' 'Eggs'); shuf -e ${WORDS[@]}
You can use pbcopy to copy the shuffled list to your clipboard.
WORDS=('Milk' 'Bread' 'Eggs' ); shuf -e ${WORDS[@]} | pbcopy
Another way to randomize a list of strings from bash is to create a text file, in this case named words.txt, with a string value per line.
Bread
Milk
Chicken
Turkey
Eggs
You can create this file manually or from the command-line with the following command.
echo "Bread\nMilk\nChicken\nTurkey\nEggs" > words.txt
Then, we cat the contents of words.txt and shuffle the order of the lines with shuf.
cat words.txt | shuf
# Eggs
# Milk
# Chicken
# Turkey
# Bread
Again, you can save the result to the clipboard with pbcopy.
cat words.txt | shuf | pbcopy
If you found this useful, let me know!
Here's a Python class that can track and push metrics to AWS CloudWatch.
Metrics are reset to their initial values on creation and when metrics are uploaded to CloudWatch.
# metrics.py
'''
A metrics class ready to track and push metrics to AWS CloudWatch.
'''
from datetime import datetime
import os
import boto3
# CloudWatch metrics namespace.
METRICS_NAMESPACE = 'my_metrics_namespace'
# Duration to wait between metric uploads.
METRICS_UPLOAD_THRESHOLD_SECONDS = 50
class Metrics:
    '''
    Holds metrics, serializes them to CloudWatch format,
    and ingests foreign metric values.
    '''

    def __init__(self):
        self.reset()

    def reset(self):
        '''
        Resets metric values and last upload time.
        '''
        self.last_upload_time = datetime.now()
        # Your custom metrics and initial values.
        # Note that here we're using 'my_prefix' as
        # a custom prefix in case you want this class
        # to add a prefix namespace to all its metrics.
        self.my_prefix_first_metric = 0
        self.my_prefix_second_metric = 0

    def to_data(self):
        '''
        Serializes metrics and their values.
        '''
        def to_cloudwatch_format(name, value):
            return {'MetricName': name, 'Value': value}
        result = []
        for name, value in vars(self).items():
            if name != 'last_upload_time':
                result.append(to_cloudwatch_format(name, value))
        return result

    def ingest(self, metrics, prefix=''):
        '''
        Adds foreign metric values to this metrics object.
        '''
        input_metric_names = [attr for attr in dir(metrics)
                              if not callable(getattr(metrics, attr))
                              and not attr.startswith("__")]
        # Iterate through foreign keys and add metric values.
        for metric_name in input_metric_names:
            # Get value of foreign metric.
            input_metric_value = getattr(metrics, metric_name)
            # Get metric key.
            metric_key = f'{prefix}_{metric_name}'
            # Get metric value.
            metric_value = getattr(self, metric_key)
            # Add foreign values to this metrics object.
            setattr(
                self,
                metric_key,
                input_metric_value + metric_value
            )

    def upload(self, force=False):
        '''
        Uploads metrics to CloudWatch when time since last
        upload is above a duration or when forced.
        '''
        # Get time elapsed since last upload.
        seconds_since_last_upload = \
            (datetime.now() - self.last_upload_time).seconds
        # Only upload if duration is greater than threshold,
        # or when the force flag is set to True.
        if seconds_since_last_upload > METRICS_UPLOAD_THRESHOLD_SECONDS or force:
            # Upload metrics to CloudWatch.
            cloudwatch = boto3.client(
                'cloudwatch',
                os.getenv('AWS_REGION')
            )
            cloudwatch.put_metric_data(
                Namespace=METRICS_NAMESPACE,
                MetricData=self.to_data()
            )
            # Reset metrics.
            self.reset()
To use this class, we just have to instantiate a metrics object, track some metrics, and upload them.
# Create a metrics object.
metrics = Metrics()
# Add values to its metrics.
metrics.my_prefix_first_metric += 3
metrics.my_prefix_second_metric += 1
# Upload metrics to CloudWatch.
metrics.upload(force=True)
If you're processing metrics at a fast pace, you don't want to upload them every single time you increase their value, as otherwise CloudWatch will complain. In certain cases, AWS CloudWatch's limit is 5 transactions per second (TPS) per account or AWS Region. When this limit is reached, you'll receive a RateExceeded throttling error.
By calling metrics.upload(force=False), we upload at most once every METRICS_UPLOAD_THRESHOLD_SECONDS (in this example, every 50 seconds).
import time
# Create a metrics object.
metrics = Metrics()
for i in range(0, 100, 1):
    # Wait for illustration purposes,
    # as if we were doing work.
    time.sleep(1)
    # Add values to its metrics.
    metrics.my_prefix_first_metric += 3
    metrics.my_prefix_second_metric += 1
    # Only upload if more than the threshold
    # duration has passed since we last uploaded.
    metrics.upload()
# Force-upload metrics to CloudWatch once we're done.
metrics.upload(force=True)
Lastly, here's how to ingest foreign metrics with or without a prefix.
# We define a foreign metrics class.
class OtherMetrics:

    def __init__(self):
        self.reset()

    def reset(self):
        # Note that here we don't have 'my_prefix'.
        self.first_metric = 0
        self.second_metric = 0
# We instantiate both metric objects.
metrics = Metrics()
other_metrics = OtherMetrics()
# The foreign metrics track values.
other_metrics.first_metric += 15
other_metrics.second_metric += 3
# Then our main metrics class ingests those metrics.
metrics.ingest(other_metrics, prefix='my_prefix')
# Then our main metrics class has those values.
print(metrics.my_prefix_first_metric)
# Prints 15
print(metrics.my_prefix_second_metric)
# Prints 3
If you found this useful, let me know!
Take a look at other posts about code, Python, and Today I Learned(s).
Here's how to sort a list of Python dictionaries by a key (a property name) of its items. Check this post if you're looking to sort a list of lists instead.
# A list of people
people = [
{'name': 'Nono', 'age': 32, 'location': 'Spain'},
{'name': 'Alice', 'age': 20, 'location': 'Wonderland'},
{'name': 'Phillipe', 'age': 100, 'location': 'France'},
{'name': 'Jack', 'age': 45, 'location': 'Caribbean'},
]
# Sort people by age, ascending
people_sorted_by_age_asc = sorted(people, key=lambda x: x['age'])
print(people_sorted_by_age_asc)
# [
# {'name': 'Alice', 'age': 20, 'location': 'Wonderland'},
# {'name': 'Nono', 'age': 32, 'location': 'Spain'},
# {'name': 'Jack', 'age': 45, 'location': 'Caribbean'},
# {'name': 'Phillipe', 'age': 100, 'location': 'France'}
# ]
# Sort people by age, descending
people_sorted_by_age_desc = sorted(people, key=lambda x: -x['age'])
print(people_sorted_by_age_desc)
# [
# {'name': 'Phillipe', 'age': 100, 'location': 'France'},
# {'name': 'Jack', 'age': 45, 'location': 'Caribbean'},
# {'name': 'Nono', 'age': 32, 'location': 'Spain'},
# {'name': 'Alice', 'age': 20, 'location': 'Wonderland'}
# ]
# Sort people by name, ascending
people_sorted_by_name_asc = sorted(people, key=lambda x: x['name'])
print(people_sorted_by_name_asc)
# [
# {'name': 'Alice', 'age': 20, 'location': 'Wonderland'},
# {'name': 'Jack', 'age': 45, 'location': 'Caribbean'},
# {'name': 'Nono', 'age': 32, 'location': 'Spain'},
# {'name': 'Phillipe', 'age': 100, 'location': 'France'}
# ]
You can measure the time elapsed during the execution of Python commands by keeping a reference to the start time and then subtracting it from the current time at any point in your program to obtain the duration between the two points in time.
from datetime import datetime
import time
# Define the start time.
start = datetime.now()
# Run some code..
time.sleep(2)
# Get the time delta since the start.
elapsed = datetime.now() - start
# datetime.timedelta(seconds=2, microseconds=5088)
# 0:00:02.005088
# Get the seconds since the start.
elapsed_seconds = elapsed.seconds
# 2
Let's create two helper functions to get the current time (i.e., now) and the elapsed time at any moment.
# Returns current time
# (and, if provided, prints the event's name)
def now(eventName=''):
    if eventName:
        print(f'Started {eventName}..')
    return datetime.now()

# Store current time as `start`
start = now()

# Returns time elapsed since `beginning`
# (and, optionally, prints the duration in seconds)
def elapsed(beginning=start, log=False):
    duration = datetime.now() - beginning
    if log:
        print(f'{duration.seconds}s')
    return duration
With those utility functions defined, we can measure the duration of different events.
# Define time to wait
wait_seconds = 2
# Measure duration (while waiting for 2 seconds)
beginning = now(f'{wait_seconds}-second wait.')
# Wait.
time.sleep(wait_seconds)
# Get time delta.
elapsed_time = elapsed(beginning, True)
# Prints 2s and returns 0:00:02.004004
# Get seconds.
elapsed_seconds = elapsed_time.seconds
# 2
# Get microseconds.
elapsed_microseconds = elapsed_time.microseconds
# 4004
If you found this useful, you might want to join my mailing lists; or take a look at other posts about code, Python, React, and TypeScript.
Here's how to sort a Python list by a key of its items. Check this post if you're looking to sort a list of dictionaries instead.
# A list of people
# name, age, location
people = [
['Nono', 32, 'Spain'],
['Alice', 20, 'Wonderland'],
['Phillipe', 100, 'France'],
['Jack', 45, 'Caribbean'],
]
# Sort people by age, ascending
people_sorted_by_age_asc = sorted(people, key=lambda x: x[1])
# [
# ['Alice', 20, 'Wonderland'],
# ['Nono', 32, 'Spain'],
# ['Jack', 45, 'Caribbean'],
# ['Phillipe', 100, 'France']
# ]
# Sort people by age, descending
people_sorted_by_age_desc = sorted(people, key=lambda x: -x[1])
# [
# ['Phillipe', 100, 'France'],
# ['Jack', 45, 'Caribbean'],
# ['Nono', 32, 'Spain'],
# ['Alice', 20, 'Wonderland']
# ]
# Sort people by name, ascending
people_sorted_by_name_asc = sorted(people, key=lambda x: x[0])
# [
# ['Alice', 20, 'Wonderland'],
# ['Jack', 45, 'Caribbean'],
# ['Nono', 32, 'Spain'],
# ['Phillipe', 100, 'France']
# ]
Here's how to read the contents of a comma-separated values (CSV) file in Python, whether it's a CSV that already exists or one you saved from Python.
import csv
csv_file_path = 'file.csv'
with open(csv_file_path, encoding='utf-8') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    # Read all the rows into memory.
    rows = list(csv_reader)
    # Print the first five rows.
    for row in rows[:5]:
        print(row)
    # Print all the rows.
    for row in rows:
        print(row)
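If you don't have a CSV at hand, here's a quick sketch that writes a few placeholder rows to file.csv with Python's csv module so the snippet above has something to read.
import csv

# Write a small sample CSV to read back later.
with open('file.csv', 'w', encoding='utf-8', newline='') as csv_file:
    csv_writer = csv.writer(csv_file, delimiter=',')
    csv_writer.writerow(['name', 'age', 'location'])
    csv_writer.writerow(['Nono', 32, 'Spain'])
    csv_writer.writerow(['Alice', 20, 'Wonderland'])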