lecture 08
NLP101
+
Embedding
(word2vec as an example)
+
a lil workshop
(language as an interface with generative AI)
SeAts APp SEAtS ApP SEaTS APP
🧘
AI artist of this week: portrait xo
after today's lecture:
-- what is (not) NLP
-- solving NLP basic tasks with Apple Natural Language Framework
-- intuition about embedding
-- using our language to interact with generative AI
NLP:
Natural Language Processing
-- recall data modality?
data modality:
image, text, audio, sensor data, etc.
for example,
object detection is about image data
NLP?
-- look it up on wikipedia
-- there is no hard definition but it is (almost) everything about human language
-- it is an interdisciplinary subfield
Example applications of NLP:
text-to-speech 🗣️
speech-to-text 👂
machine translation 🧠
image captioning 🧑‍🏫
text-to-image generation 🧑‍🎨
etc.
also NLP:
-- Lemmatization
-- Named Entity Recognition
-- Part-of-speech tagging
etc.
❓❓❓
engaging with language is very natural to us
(it is a given; we can use it without fully understanding how our language system works),
yet it is very complex and comprises many low-level subtasks, including:
-- Language identification
-- Lemmatization
-- Named Entity Recognition
-- Part-of-speech tagging
-- Tokenization
etc.
For today's lecture, we will go through each of these basic tasks with reference to

🍎Apple Natural Language Framework🍎 solutions
There is no need to understand the model, just to know how to use it 👍
open an Xcode playground and import the frameworks:
import NaturalLanguage 
import Foundation 
import CoreML
Language identification
--1. what is it about? 🥷
--- try to answer by filling in the blanks: Given an input of __, the solution model should produce an output of __
--2. what are the possible use cases? 🧑‍🍳
--3. paste and run the example code below! 🕹️
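here is a minimal sketch of what the language identification code can look like, using NLLanguageRecognizer from the framework (the sample sentence is just a placeholder):

import NaturalLanguage

// guess the dominant language of a piece of text
let recognizer = NLLanguageRecognizer()
recognizer.processString("La vie est belle quand on apprend le NLP.")

if let language = recognizer.dominantLanguage {
    print("dominant language: \(language.rawValue)") // e.g. "fr"
}

// the recognizer can also return several hypotheses with confidences
for (language, confidence) in recognizer.languageHypotheses(withMaximum: 3) {
    print("\(language.rawValue): \(confidence)")
}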
Named Entity Recognition
--1. what is it about? 🥷
--- try to answer by filling in the blanks: Given an input of __, the solution model should produce an output of __
--2. what are the possible use cases? 🧑‍🍳
--3. paste and run the example code below! 🕹️
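a sketch of the NER code using NLTagger with the .nameType scheme (the sentence and the list of tags are just illustrative):

import NaturalLanguage

let text = "Ada Lovelace worked with Charles Babbage in London."

let tagger = NLTagger(tagSchemes: [.nameType])
tagger.string = text

let options: NLTagger.Options = [.omitWhitespace, .omitPunctuation, .joinNames]
let entityTags: [NLTag] = [.personalName, .placeName, .organizationName]

// walk through the text word by word and print any named entities
tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                     unit: .word,
                     scheme: .nameType,
                     options: options) { tag, range in
    if let tag = tag, entityTags.contains(tag) {
        print("\(text[range]) -> \(tag.rawValue)")
    }
    return true
}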
Part-of-speech tagging
--1. what is it about? 🥷
--- try to answer by filling in the blanks: Given an input of __, the solution model should produce an output of __
--2. what are the possible use cases? 🧑‍🍳
--3. paste and run the example code below! 🕹️
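a sketch of the part-of-speech code, again with NLTagger but using the .lexicalClass scheme:

import NaturalLanguage

let text = "The quick brown fox jumps over the lazy dog."

let tagger = NLTagger(tagSchemes: [.lexicalClass])
tagger.string = text

// print the part of speech (noun, verb, adjective, ...) of every word
tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                     unit: .word,
                     scheme: .lexicalClass,
                     options: [.omitWhitespace, .omitPunctuation]) { tag, range in
    if let tag = tag {
        print("\(text[range]) -> \(tag.rawValue)")
    }
    return true
}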
Tokenization
--1. what is it about? 🥷
--- try to answer by filling in the blanks: Given an input of __, the solution model should produce an output of __
--2. what are the possible use cases? 🧑‍🍳
--3. paste and run the example code below! 🕹️
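a sketch of the tokenization code with NLTokenizer (swap the unit to .sentence or .paragraph for coarser tokens):

import NaturalLanguage

let text = "Tokenization splits text into units such as words or sentences."

let tokenizer = NLTokenizer(unit: .word)
tokenizer.string = text

// print every word-level token
tokenizer.enumerateTokens(in: text.startIndex..<text.endIndex) { range, _ in
    print(text[range])
    return true
}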
a very recent text-to-audio generation model with a playable demo: AudioLDM
AI intuitions 02
embedding
recap: one-hot encoding

-- how to encode today being Thursday (day-of-the-week)?
-- and what is the size of the vector?
new info: distance between two vectors

-- Euclidean
-- Cosine
etc.
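here is a small plain-Swift sketch of both ideas: the 7-slot one-hot vector for "today is Thursday", plus the two distance measures (the helper functions euclideanDistance and cosineDistance are just names made up for this sketch):

import Foundation

// one-hot vector for "today is Thursday": 7 slots, one per weekday
// order: [Mon, Tue, Wed, Thu, Fri, Sat, Sun]
let thursday: [Double] = [0, 0, 0, 1, 0, 0, 0]
let friday:   [Double] = [0, 0, 0, 0, 1, 0, 0]

// Euclidean distance: square root of the sum of squared differences
func euclideanDistance(_ a: [Double], _ b: [Double]) -> Double {
    var sum = 0.0
    for i in 0..<a.count {
        let d = a[i] - b[i]
        sum += d * d
    }
    return sum.squareRoot()
}

// cosine distance: 1 - (a . b) / (|a| * |b|)
func cosineDistance(_ a: [Double], _ b: [Double]) -> Double {
    var dot = 0.0, normA = 0.0, normB = 0.0
    for i in 0..<a.count {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return 1 - dot / (normA.squareRoot() * normB.squareRoot())
}

print(euclideanDistance(thursday, friday)) // sqrt(2), about 1.414
print(cosineDistance(thursday, friday))    // 1.0: the two vectors are orthogonal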
Quote by Douglas Adams
β€œI've come up with a set of rules that describe our reactions to technologies:
1. Anything that is in the world when you’re born is normal and ordinary and is just a natural part of the way the world works.
2. Anything that's invented between when you’re fifteen and thirty-five is new and exciting and revolutionary and you can probably get a career in it.
3. Anything invented after you're thirty-five is against the natural order of things.”
how to encode (numberify) the first sentence?
Anything that is in the world when you’re born is normal and ordinary and is just a natural part of the way the world works.
here is one possible scheme for encoding:
step 1
tokenization
step 2
one-hot encoding:
assign a vector to each unique word (example on the board)
what is the size of each one-hot vector?
-- recall: what is the size of the one-hot vector of today being Thursday?
some issues of such a numberification scheme:
1. vector size is large
(more memory required and higher computation costs)
some issues of such a numberification scheme:
2. [IMPORTANT] relational information between words is lost:
-- the distance between different word pairs is always the same
-- but we know that in semantics, some words are similar/closer
e.g. "normal" is closer to "ordinary" than it is to "born"
me on the whiteboard:
calculate the distance between the one-hot vectors of
-- "normal" and "ordinary"
-- "normal" and "born"
same distances!
-- our semantic "relational information" is not reflected in the one-hot encoding scheme!
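roughly what this whiteboard calculation looks like in code; a toy 3-word vocabulary is enough to see the problem (real one-hot vectors would be as long as the full vocabulary), and the distance helper is the same one sketched earlier:

// toy vocabulary: "normal" -> slot 0, "ordinary" -> slot 1, "born" -> slot 2
let normal:   [Double] = [1, 0, 0]
let ordinary: [Double] = [0, 1, 0]
let born:     [Double] = [0, 0, 1]

func euclideanDistance(_ a: [Double], _ b: [Double]) -> Double {
    var sum = 0.0
    for i in 0..<a.count {
        let d = a[i] - b[i]
        sum += d * d
    }
    return sum.squareRoot()
}

// both pairs are exactly the same distance apart:
// one-hot vectors say nothing about which words are semantically related
print(euclideanDistance(normal, ordinary)) // 1.414...
print(euclideanDistance(normal, born))     // 1.414...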
word2vec (an AI model type) to the rescue
it has an ingenious training target:
-- given a word, predict its surrounding words (the context)
-- given surrounding words (the context), predict the centre word
model details in the future intuition series... stay tuned!
advantages:

-- smaller embedding vector size (pre-defined before training)
-- relations are preserved (see the sketch below)
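you can get a feel for both advantages right inside the playground: the NaturalLanguage framework ships a pre-trained English word embedding (NLEmbedding). it is not literally word2vec, but it behaves in the same spirit, so treat this as an illustrative sketch:

import NaturalLanguage

if let embedding = NLEmbedding.wordEmbedding(for: .english) {
    // advantage 1: a small, fixed vector size instead of a vocabulary-sized one-hot
    print("embedding dimension: \(embedding.dimension)")

    // advantage 2: semantic relations show up as distances
    // (the first distance should come out smaller than the second)
    print(embedding.distance(between: "normal", and: "ordinary"))
    print(embedding.distance(between: "normal", and: "born"))

    // nearest neighbours of a word in the embedding space
    print(embedding.neighbors(for: "music", maximumCount: 5))
}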
an implication:
compression brings abstraction
because the model has to discover and use "relations" to save memory space
(and abstraction seems to be crucial for intelligence)
Play around with a pre-trained word2vec model here
making your own word2vec equation:
-- make a hypothesis on analogous words
-- try verifying it (using the notebook, or the sketch below)
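the notebook is the intended tool, but if you want to stay in Swift, roughly the same "word equation" experiment can be sketched with NLEmbedding's vectors (results will differ from a real word2vec model):

import NaturalLanguage

// the classic "king - man + woman ≈ queen" style hypothesis
if let embedding = NLEmbedding.wordEmbedding(for: .english),
   let king = embedding.vector(for: "king"),
   let man = embedding.vector(for: "man"),
   let woman = embedding.vector(for: "woman") {

    // build the hypothesis vector element by element: king - man + woman
    var hypothesis = [Double](repeating: 0, count: king.count)
    for i in 0..<king.count {
        hypothesis[i] = king[i] - man[i] + woman[i]
    }

    // which words sit closest to the hypothesis vector?
    print(embedding.neighbors(for: hypothesis, maximumCount: 5))
}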
well done everyone 🎉
we have gone through MSc-level content AGAIN
lil workshop
- Language as mediator:
-- interact with text-to-image and text-to-audio generative AIs using the same/similar text prompts
text-to-image generative AIs
- SD
- this list of treasures
text-to-audio generative AIs
- AudioLDM
- bark
lil workshop
Language as mediator:
-- use the same/similar prompt to generate an image and a piece of audio
-- select any text-to-X model
-- have funnn 🤪
-- optional: combine the linked image and audio using iMovie
today we talked about:

-- introduction to NLP 🎃
-- some basic NLP tasks solved with Apple NL framework
--- Language identification
--- Lemmatization
--- Named Entity Recognition
--- Part-of-speech tagging
--- Tokenization
today we talked about:

-- intuition about embedding (how to numberify words) 🧚
--- smaller embedding vector size
--- preserved relational information
--- word2vec as an example
today we talked about:

-- using language to connect text-to-X models of different modalities 🌉