lecture 08
NLP101
+
Embedding
(word2vec as an example)
+
a lil workshop
(language as an interface with generative AI)
SeAts APp SEAtS ApP SEaTS APP
after today's lecture:
-- what is (not) NLP
-- solving basic NLP tasks with the Apple Natural Language Framework
-- intuition about embedding
-- using our language to interact with generative AI
NLP:
Natural Language Processing
-- recall data modality?
data modality:
image, text, audio, sensor data, etc.
for example,
object detection is about image data
NLP?
-- look it up on wikipedia
-- there is no hard definition but it is (almost) everything about human language
-- it is an interdisciplinary subfield
Example applications of NLP:
text-to-speech
speech-to-text
machine translation
image captioning
text-to-image generation
etc.
also NLP:
-- Lemmatization
-- Named Entity Recognition
-- Part-of-speech tagging
etc.
engaging with language is very natural to us
(it is a given; we can use it without fully understanding how our language system works)
yet it is very complex and comprises many low-level subtasks
For today's lecture, we go through each of these basic tasks with reference to
Apple Natural Language Framework solutions
There is no need to understand the models, just to know how to use them
open an Xcode playground and import the frameworks:
import NaturalLanguage
import Foundation
import CoreML
Language identification
--1. what is it about?
--- try to answer by filling in the blanks: Given an input of __, the solution model should produce an output of __
--2. what are the possible use cases?
--3. paste and run the example code!
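not sure which snippet the lecture distributes, but a minimal sketch with Apple's NLLanguageRecognizer could look like this (the German sample sentence is my own):

import NaturalLanguage

// Feed a piece of text to the recognizer and ask for its dominant language.
let recognizer = NLLanguageRecognizer()
recognizer.processString("Der schnelle braune Fuchs springt über den faulen Hund.")

if let language = recognizer.dominantLanguage {
    print("Detected language: \(language.rawValue)")  // "de"
}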
Named Entity Recognition
--1. what is it about?
--- try to answer by filling in the blanks: Given an input of __, the solution model should produce an output of __
--2. what are the possible use cases?
--3. paste and run the example code!
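a minimal sketch with NLTagger and the .nameType scheme (the sample sentence and the tag filter are my own choices, not necessarily the lecture's code):

import NaturalLanguage

let text = "Tim Cook introduced the new iPhone at Apple Park in Cupertino."

// Tag people, places and organisations in the text.
let tagger = NLTagger(tagSchemes: [.nameType])
tagger.string = text

let options: NLTagger.Options = [.omitPunctuation, .omitWhitespace, .joinNames]
let interestingTags: [NLTag] = [.personalName, .placeName, .organizationName]

tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                     unit: .word,
                     scheme: .nameType,
                     options: options) { tag, range in
    if let tag = tag, interestingTags.contains(tag) {
        print("\(text[range]) -> \(tag.rawValue)")
    }
    return true
}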
Part-of-speech tagging
--1. what is it about?
--- try to answer by filling in the blanks: Given an input of __, the solution model should produce an output of __
--2. what are the possible use cases?
--3. paste and run the example code!
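a minimal sketch with NLTagger and the .lexicalClass scheme (the sample sentence is my own):

import NaturalLanguage

let text = "The quick brown fox jumps over the lazy dog."

// Label each word with its part of speech (noun, verb, adjective, ...).
let tagger = NLTagger(tagSchemes: [.lexicalClass])
tagger.string = text

let options: NLTagger.Options = [.omitPunctuation, .omitWhitespace]

tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                     unit: .word,
                     scheme: .lexicalClass,
                     options: options) { tag, range in
    if let tag = tag {
        print("\(text[range]): \(tag.rawValue)")
    }
    return true
}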
Tokenization
--1. what is it about?
--- try to answer by filling in the blanks: Given an input of __, the solution model should produce an output of __
--2. what are the possible use cases?
--3. paste and run the example code!
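a minimal sketch with NLTokenizer, splitting text into word tokens (the sample sentence is my own):

import NaturalLanguage

let text = "All human beings are born free and equal in dignity and rights."

// Split the text into word-level tokens.
let tokenizer = NLTokenizer(unit: .word)
tokenizer.string = text

tokenizer.enumerateTokens(in: text.startIndex..<text.endIndex) { range, _ in
    print(text[range])
    return true
}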
a very recent text-to-audio generation model with playable demo:
AudioLDM
AI intuitions 02
embedding
recap: one-hot encoding
-- how to encode today being Thursday (day-of-the-week)?
-- and what is the size of the vector?
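one possible answer, as a sketch (the oneHot helper below is my own illustration, not from the lecture):

// Day-of-the-week has 7 categories, so each one-hot vector has size 7.
let days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

// Hypothetical helper: build a one-hot vector for a given category.
func oneHot(of item: String, in vocabulary: [String]) -> [Double]? {
    guard let index = vocabulary.firstIndex(of: item) else { return nil }
    var vector = [Double](repeating: 0.0, count: vocabulary.count)
    vector[index] = 1.0
    return vector
}

print(oneHot(of: "Thu", in: days)!)  // [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]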
new info: distance between two vectors
-- Euclidean
-- Cosine
etc.
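a small sketch of both measures (the helper names are my own):

import Foundation

// Euclidean distance: straight-line distance between two vectors.
func euclideanDistance(_ a: [Double], _ b: [Double]) -> Double {
    precondition(a.count == b.count)
    return sqrt(zip(a, b).map { ($0.0 - $0.1) * ($0.0 - $0.1) }.reduce(0, +))
}

// Cosine similarity: 1 when two vectors point the same way, 0 when orthogonal.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    precondition(a.count == b.count)
    let dot = zip(a, b).map { $0.0 * $0.1 }.reduce(0, +)
    let normA = sqrt(a.map { $0 * $0 }.reduce(0, +))
    let normB = sqrt(b.map { $0 * $0 }.reduce(0, +))
    return dot / (normA * normB)
}

print(euclideanDistance([1, 0, 0], [0, 1, 0]))  // 1.414...
print(cosineSimilarity([1, 0, 0], [0, 1, 0]))   // 0.0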
Quote by Douglas Adams
"I've come up with a set of rules that describe our reactions to technologies:
1. Anything that is in the world when you're born is normal and ordinary and is just a natural part of the way the world works.
2. Anything that's invented between when you're fifteen and thirty-five is new and exciting and revolutionary and you can probably get a career in it.
3. Anything invented after you're thirty-five is against the natural order of things."
how to encode (numberify) the first sentence?
Anything that is in the world when you're born is normal and ordinary and is just a natural part of the way the world works.
here is one possible scheme for encoding:
step 1
tokenization
step 2
one-hot encoding:
assign a vector to each unique word (example on the board)
what is the size of each one-hot vector?
-- recall: what is the size of the one-hot vector for today being Thursday?
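a sketch of how to check (my own illustration, not the lecture's exact code): tokenize the sentence with NLTokenizer, collect the unique words, and the vocabulary size is the size of each one-hot vector:

import NaturalLanguage

let sentence = "Anything that is in the world when you're born is normal and ordinary and is just a natural part of the way the world works."

// Tokenize into words and keep only the unique (lowercased) ones.
let tokenizer = NLTokenizer(unit: .word)
tokenizer.string = sentence

var vocabulary: Set<String> = []
tokenizer.enumerateTokens(in: sentence.startIndex..<sentence.endIndex) { range, _ in
    vocabulary.insert(sentence[range].lowercased())
    return true
}

print(vocabulary.count)  // the size of each one-hot vector for this sentence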
some issues with such a numberification scheme:
1. vector size is large
(more memory required and higher computation costs)
some issues with such a numberification scheme:
2. [IMPORTANT] relational information between words is lost:
-- distances between different word pairs are all the same
-- but we know that in semantics, some words are similar/closer
e.g. "normal" is closer to "ordinary" than it is to "born"
me on the whiteboard:
calculate the distance between the one-hot vectors of
-- "normal" and "ordinary"
-- "normal" and "born"
same distances!
-- our semantic "relational information" is not reflected in the one-hot encoding scheme!
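the same check in code, as a sketch (my own illustration): with one-hot vectors, every pair of distinct words ends up exactly the same distance apart (sqrt(2) for Euclidean):

import Foundation

// Toy vocabulary covering the words we compare on the whiteboard.
let vocabulary = ["anything", "that", "is", "in", "the", "world", "when", "you're",
                  "born", "normal", "and", "ordinary", "just", "a", "natural",
                  "part", "of", "way", "works"]

func oneHot(_ word: String) -> [Double] {
    var v = [Double](repeating: 0.0, count: vocabulary.count)
    if let i = vocabulary.firstIndex(of: word) { v[i] = 1.0 }
    return v
}

func euclidean(_ a: [Double], _ b: [Double]) -> Double {
    sqrt(zip(a, b).map { ($0.0 - $0.1) * ($0.0 - $0.1) }.reduce(0, +))
}

print(euclidean(oneHot("normal"), oneHot("ordinary")))  // 1.414...
print(euclidean(oneHot("normal"), oneHot("born")))      // 1.414... (identical)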
word2vec (an AI model type) to the rescue
it has an ingenious training target:
-- given a word, predict its surrounding words (the context)
-- given surrounding words (the context), predict the centre word
model details in the future intuition series... stay tuned!
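a rough sketch (my own illustration, not the actual word2vec implementation) of the training data behind both targets: slide a window over the tokens and pair each centre word with its context words:

// Toy tokens and a small context window.
let tokens = ["anything", "that", "is", "in", "the", "world"]
let windowSize = 2

for (i, centre) in tokens.enumerated() {
    let lower = max(0, i - windowSize)
    let upper = min(tokens.count - 1, i + windowSize)
    for j in lower...upper where j != i {
        // Each (centre, context) pair is one training example.
        print("(\(centre), \(tokens[j]))")
    }
}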
advantages:
-- smaller embedding vector size (pre-defined before training)
-- relations are preserved
an implication:
compression brings abstraction
because it has to discover and use "relations" to save some memory space
(and abstraction seems to be crucial for intelligence)
Play around with a pre-trained word2vec model
here
making your own word2vec equation:
-- make a hypothesis on analogous words
-- try verifying it (using the notebook)
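to make the "word2vec equation" idea concrete, a toy sketch with made-up 3-d vectors (NOT a real pre-trained model; the word list and values are purely illustrative, the real check belongs in the notebook):

import Foundation

// Toy embeddings: vector("king") - vector("man") + vector("woman") should land closest to vector("queen").
let embeddings: [String: [Double]] = [
    "king":  [0.8, 0.6, 0.1],
    "queen": [0.8, 0.1, 0.6],
    "man":   [0.2, 0.7, 0.1],
    "woman": [0.2, 0.2, 0.6],
    "apple": [0.1, 0.9, 0.3],
]

func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    let dot = zip(a, b).map { $0.0 * $0.1 }.reduce(0, +)
    let normA = sqrt(a.map { $0 * $0 }.reduce(0, +))
    let normB = sqrt(b.map { $0 * $0 }.reduce(0, +))
    return dot / (normA * normB)
}

// king - man + woman, element by element.
let query = zip(zip(embeddings["king"]!, embeddings["man"]!).map { $0.0 - $0.1 },
                embeddings["woman"]!).map { $0.0 + $0.1 }

// Find the closest remaining word by cosine similarity.
let best = embeddings
    .filter { !["king", "man", "woman"].contains($0.key) }
    .max { cosineSimilarity($0.value, query) < cosineSimilarity($1.value, query) }

print(best?.key ?? "none")  // with these toy vectors: "queen"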
well done everyone
we have gone through MSc-level content AGAIN
lil workshop
- Language as mediator:
-- interact with text-to-image and text-to-audio generative AIs using the same/similar text prompts
text-to-image generative AIs
- SD
- this list of treasures
lil workshop
Language as mediator:
-- use the same/similar prompt to generate a piece of image and audio
-- select any text-to-X model
-- have funnn
-- optional: combine the generated image and audio using iMovie
today we talked about:
-- introduction to NLP
-- some basic NLP tasks solved with the Apple NL framework
--- Language identification
--- Lemmatization
--- Named Entity Recognition
--- Part-of-speech tagging
--- Tokenization
today we talked about:
-- intuition about embedding (how to numberify words)
--- smaller embedding vector size
--- preserved relational information
--- word2vec as an example
today we talked about:
-- using language to connect text-to-X models of different modalities