Further with ML/DL
NYU K12 STEM Education: Machine Learning (Day 8)
Unsupervised learning
- What if we don’t have labelled data for the given task?
- The dataset still holds structure; we just don’t have access to it
- Or what if there is a need to create data?
- Examples: Clustering, Generative AI, etc.
- Let’s look at some unsupervised models
K-Means Clustering
- Works by selecting ’k’ arbitrary centroids for clusters
- Euclidean distance is used to assign each point to its nearest centroid’s cluster
- (We can use other measures as well)
- Centroids are updated and points are reassigned till convergence
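The assign-and-update loop described above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the toy points, and the choice to pass the starting centroids in explicitly (so the run is reproducible) are all assumptions made for this example.

```python
import math

def kmeans(points, initial_centroids, max_iters=100):
    """Minimal K-Means sketch: assign points, update centroids, repeat."""
    centroids = list(initial_centroids)
    k = len(centroids)
    assignments = None
    for _ in range(max_iters):
        # Assignment step: each point joins the cluster of its nearest
        # centroid, measured with Euclidean distance.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so we have converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centroids, assignments

# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, assignments = kmeans(points, initial_centroids=[points[0], points[3]])
print(assignments)
print(centroids)
```

Starting from one centroid in each group, the first three points end up in cluster 0 and the last three in cluster 1. Starting both centroids inside the same group can give a different, worse split, which is one reason the choice of initial centroids matters.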
K-Means Drawbacks
- What drawbacks can it have?
- k matters a lot!
- The algorithm depends heavily on the initial centroids
- Categorical data doesn’t have a natural notion of distance or similarity
K-Means Evaluation
- How do we evaluate the model when we don’t have any labels?
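- Inertia (\(J\)) measures the sum of squared distances between data points (\(x_i\)) and their assigned cluster centroids (\(\mu_k\)): \(J = \sum_{k} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^2\), where \(C_k\) is the set of points assigned to cluster \(k\).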
- Goal: Have low inertia!
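As a rough sketch of how inertia could be computed by hand, the snippet below sums the squared Euclidean distance from every point to its assigned centroid. The function name `inertia`, the four toy points, and the hand-picked centroids are assumptions made for illustration.

```python
def inertia(points, centroids, assignments):
    """Sum of squared distances from each point to its assigned centroid."""
    total = 0.0
    for point, cluster in zip(points, assignments):
        centroid = centroids[cluster]
        total += sum((p - c) ** 2 for p, c in zip(point, centroid))
    return total

# Toy data (hypothetical): two pairs of points, far apart from each other.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

# k = 1: a single centroid at the mean of all four points.
j_k1 = inertia(points, [(5.0, 1.0)], [0, 0, 0, 0])

# k = 2: one centroid at the mean of each pair.
j_k2 = inertia(points, [(0.0, 1.0), (10.0, 1.0)], [0, 0, 1, 1])

print(j_k1, j_k2)  # 104.0 4.0
```

Inertia drops sharply once k matches the number of real groups. It keeps shrinking as k grows, reaching zero when every point is its own cluster, so low inertia should be read together with the choice of k rather than on its own.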
Next Steps
What we have completed
- Looked at foundational steps and models
- Regression tasks: (fish weight and housing prices)
- Classification tasks: (cancer prediction and Iris)
- Deep Neural Networks
- Convolutional Neural Networks
Building on these Topics
- The goal with engineering is to take what we know and try to build on it.
- In the case of machine learning: can we use a CNN as a backbone to solve more complicated tasks with more complicated models?
- The best way to do this is to independently learn from various sources.
- What are some resources that we can use?
Resources to build on your learning
- Finding Code: GitHub and Machine Learning Mastery
- Finding Papers: Conferences such as NeurIPS and ICML constantly publish papers across topics and fields. Papers can also be found on arXiv and SciHub.
- Theory: StatQuest, Computerphile and 3Blue1Brown on YouTube.
Supervised Learning
Object Detection
- Faster-RCNN
- YOLO
- Divides the image into nxn grid-cells
- For each grid cell,
- predicts B bounding boxes, each with a box confidence score
- Each box will have its class probabilities
- The class probabilities are combined with the box confidence scores to detect one object
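One way to picture the last step for a single grid cell is to multiply each box confidence by each class probability and keep the highest score. The sketch below follows the slide’s description, where every box carries its own class probabilities (in the original YOLO paper the class probabilities are shared by all boxes in a cell). The function `best_detection`, the dictionary layout, and all numbers are hypothetical.

```python
def best_detection(boxes, class_names):
    """Pick the (box, class) pair with the highest class-specific score.

    Class-specific score = box confidence * class probability.
    """
    best = None
    for box in boxes:
        for name, prob in zip(class_names, box["class_probs"]):
            score = box["confidence"] * prob
            if best is None or score > best["score"]:
                best = {"class": name, "score": score, "bbox": box["bbox"]}
    return best

class_names = ["cat", "dog"]

# B = 2 predicted boxes for one grid cell; bbox is (x, y, width, height).
cell_boxes = [
    {"bbox": (0.5, 0.5, 0.2, 0.3), "confidence": 0.9, "class_probs": [0.8, 0.2]},
    {"bbox": (0.4, 0.6, 0.5, 0.5), "confidence": 0.3, "class_probs": [0.1, 0.9]},
]

detection = best_detection(cell_boxes, class_names)
print(detection["class"], round(detection["score"], 2))  # cat 0.72
```

The second box is very sure about "dog" (0.9) but has low box confidence (0.3), so its best score is only 0.27 and the confident "cat" box wins.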
Semantic Segmentation
- Every Pixel is associated with a class
- Encoder-decoder structure
- Decode using transposed convolution or deconvolution
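To make the decoding step more concrete, here is a small sketch of a transposed convolution that upsamples a 2x2 feature map to 4x4. Each input value "stamps" a scaled copy of the kernel onto the larger output. The function name, the all-ones kernel, and the stride of 2 are assumptions chosen to keep the arithmetic easy to follow. A real decoder would learn the kernel values.

```python
def transposed_conv2d(feature_map, kernel, stride=2):
    """Upsample a 2-D feature map with a transposed convolution (no padding)."""
    in_h, in_w = len(feature_map), len(feature_map[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (in_h - 1) * stride + k_h
    out_w = (in_w - 1) * stride + k_w
    output = [[0.0] * out_w for _ in range(out_h)]
    for i in range(in_h):
        for j in range(in_w):
            # Stamp a copy of the kernel, scaled by the input value,
            # at the output position that corresponds to (i, j).
            for di in range(k_h):
                for dj in range(k_w):
                    output[i * stride + di][j * stride + dj] += (
                        feature_map[i][j] * kernel[di][dj]
                    )
    return output

small = [[1.0, 2.0],
         [3.0, 4.0]]
kernel = [[1.0, 1.0],
          [1.0, 1.0]]

upsampled = transposed_conv2d(small, kernel)
for row in upsampled:
    print(row)
```

With this kernel and stride, each input value simply fills its own 2x2 block of the output, so the low-resolution map is stretched back toward the resolution needed for per-pixel class predictions.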
Instance Segmentation
Autoencoders
- Encoder-Decoder structure
- Encoder helps in creating latent representations
- Decoder helps in generating outputs from the latent representation
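The encoder-decoder idea can be sketched with a tiny linear autoencoder that squeezes 2-D points into a single latent number and then reconstructs them. Everything here is an illustrative assumption: the toy data (points on the line y = 2x), the starting weights, the learning rate, and the hand-written gradient descent. Real autoencoders use neural network layers and a deep learning library.

```python
# Toy data: points on the line y = 2x, so one latent number per point suffices.
data = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.5, 1.0)]

def encode(x, w):
    """Encoder: compress a 2-D point into a single latent value."""
    return w[0] * x[0] + w[1] * x[1]

def decode(z, v):
    """Decoder: expand the latent value back into a 2-D point."""
    return (v[0] * z, v[1] * z)

def reconstruction_loss(w, v):
    """Mean squared distance between each point and its reconstruction."""
    total = 0.0
    for x in data:
        x_hat = decode(encode(x, w), v)
        total += (x_hat[0] - x[0]) ** 2 + (x_hat[1] - x[1]) ** 2
    return total / len(data)

def train(steps=300, lr=0.05):
    w = [0.1, 0.1]  # encoder weights
    v = [0.1, 0.1]  # decoder weights
    for _ in range(steps):
        grad_w = [0.0, 0.0]
        grad_v = [0.0, 0.0]
        for x in data:
            z = encode(x, w)
            x_hat = decode(z, v)
            err = (x_hat[0] - x[0], x_hat[1] - x[1])
            # Backpropagate the squared error through the decoder,
            # then through the encoder.
            dz = 2.0 * (err[0] * v[0] + err[1] * v[1])
            for j in range(2):
                grad_v[j] += 2.0 * err[j] * z / len(data)
                grad_w[j] += dz * x[j] / len(data)
        for j in range(2):
            w[j] -= lr * grad_w[j]
            v[j] -= lr * grad_v[j]
    return w, v

initial_loss = reconstruction_loss([0.1, 0.1], [0.1, 0.1])
w, v = train()
final_loss = reconstruction_loss(w, v)
print(initial_loss, final_loss)
```

After training, the reconstruction loss falls from roughly 3 to nearly zero: the single latent value has learned to capture where each point sits along the line.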
Autoencoder for denoising
Variational Autoencoders
Generative Models
Generate images, art, speech. Generation architectures can be modified based on the task at hand.
Benefits and Use Cases
- When dataset collection is difficult or expensive. (For example MRI scans).
- When there is a limit on available data. (With rare cancers, there may not be many positive cases.) (Start of COVID with few recorded cases.)
- Various novel applications. (Generation of art)
GANs: Generative Adversarial Networks
- Invented in 2014 by Ian Goodfellow
- Goal: generate samples never seen before
- How: game between two networks
- Generator Network
- Discriminator Network
- Goal of Generator: generate fake samples indistinguishable from real samples
- Goal of Discriminator: be able to tell apart real and fake samples
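The two competing goals can be written as two loss functions. In this sketch the discriminator outputs a probability that a sample is real. The helper names and the example probabilities are hypothetical, and the generator loss shown is the commonly used "non-saturating" variant, which is one option among several.

```python
import math

def bce(prediction, target):
    """Binary cross-entropy for a single probability and a 0/1 target."""
    eps = 1e-12  # avoid log(0)
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def discriminator_loss(d_real, d_fake):
    """Low when real samples score near 1 and fake samples score near 0."""
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Low when the discriminator scores fake samples near 1 (it is fooled)."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)

# Discriminator outputs (hypothetical) on real samples.
d_real = [0.9, 0.8]
# Early in training the fakes are easy to spot; later they fool the discriminator.
d_fake_obvious = [0.1, 0.2]
d_fake_convincing = [0.9, 0.8]

print(generator_loss(d_fake_obvious), generator_loss(d_fake_convincing))
print(discriminator_loss(d_real, d_fake_obvious),
      discriminator_loss(d_real, d_fake_convincing))
```

When the fakes become convincing, the generator’s loss goes down and the discriminator’s loss goes up, which is the "game" between the two networks.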
Applications of GANs
- Cats that Don’t Exist
- Image Colorization
- Image Synthesis
- Image Super Resolution
Social Impact of Machine Learning
How Would You Use ML/DL?
- Think about potential applications with deep learning.
- Discuss its social implications.
Can AI/ML be Biased?
- At the start, a Neural Network just has randomly initialized weights.
- It then trains and backpropagates on a given dataset.
- Do our nodes harbor any racism, sexism, homophobia or transphobia?
- No!
- Neural Networks aren’t sentient.
- Neural Networks have no understanding of human emotions, biases or anything else.
Biased model outputs
- PULSE is a face depixelizing algorithm, but…
- Biases Inherent in Data: CheXpert
- CheXpert is a dataset of medical images in the form of Chest X-Rays. The dataset is inherently biased as when looking for rare diseases, most patients would test negative.
- More than 90% of samples are negative cases. As a result, a model can assume every patient is disease-free and still have an accuracy of 90%.
- As a result, models aren’t incentivized to learn about underrepresented classes in a dataset.
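The accuracy trap can be checked with a few lines of arithmetic. The numbers below are made up for illustration (exactly 10 positive and 90 negative cases, not the real CheXpert counts), and the "model" is simply a rule that always predicts negative.

```python
# Hypothetical imbalanced dataset: 1 = has the disease, 0 = disease free.
labels = [1] * 10 + [0] * 90

# A "model" that ignores its input and always predicts disease free.
predictions = [0] * len(labels)

correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Recall: of the patients who really have the disease, how many were found?
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = true_positives / sum(labels)

print(accuracy)  # 0.9
print(recall)    # 0.0
```

The accuracy looks good, yet the model never finds a single sick patient. Metrics such as recall expose the problem that accuracy hides on imbalanced data.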
- Biases not Inherent in Data: Celeb A
- In the case of CheXpert, when studying rare diseases, a patient is more likely not to have the disease than to have it.
- But sometimes, data in the real world isn’t biased but our dataset might be.
- Celeb-A is a great case for such a problem.
- Celeb-A: ”Traditionally attractive”. Predominantly white and cis. Heavy make-up. Potential Photoshop. 4K cameras.
- In the real world: Most people not models. People of Colour. Trans and Non-binary. Images aren’t taken on professional cameras with professional makeup.
Real World Biases leaking into Machine Learning
- Keep in mind, the bias comes from Biased Data!! Not the model having any bigotry.
- Bigotry and under-represented data in the real world can leak into machine learning.
- Biases and Racism in Law Enforcement can leak into model predictions. This only furthers existing inequity. AI in law enforcement
- Datasets are often predominantly white and cishet masculine, because the data sources and the engineers building these datasets may not realize the importance of diversity in datasets.
Insidious Effects on Machine Learning performance
(From the article Design AI so that it’s fair)
- When Google Translate converts news articles written in Spanish into English, phrases referring to women often become ‘he said’ or ‘he wrote’.
- Software designed to warn people using Nikon cameras when the person they are photographing seems to be blinking tends to interpret Asians as always blinking.
- Google misclassifying people as gorillas.
- ”Tay”, a chatbot trained on data from tweets, learns to be racist and sexist as a result of the sheer number of bigoted Twitter users.
How do we solve these issues?
Safety of AI
Boston Dynamics Parkour Atlas: What machine learning algorithms might have been used here?
- The same model can have drastically different performance for different hyper-parameters.
- 100% accuracy is rarely achieved on unseen data.
- Should we let a medical robot with CNN-based vision system perform surgery autonomously?
- If a self-driving car crashes and hurts people, who should be responsible for it?
Carbon Footprint of Deep Learning
Course Takeaway
- ML is the combination of math and computer science. We’ve only shown you a subset.
- Supervised Learning: Linear/Logistic Regression and Neural Networks
- Deep learning has wide applications, but we are also responsible for its consequences. The greater the power, the greater the responsibility!