Gradient Dissent: Exploring Machine Learning, AI, Deep Learning, Computer Vision

Gradient Dissent is a machine learning podcast hosted by Lukas Biewald that takes you behind the scenes to learn how industry leaders are putting deep learning models into production at Facebook, Google, Lyft, OpenAI, and more.

Available Episodes: 10

On this episode, we’re joined by Brandon Duderstadt, Co-Founder and CEO of Nomic AI. Both of Nomic AI’s products, Atlas and GPT4All, aim to improve the explainability and accessibility of AI.

We discuss:

- (0:55) What GPT4All is and its value proposition.

- (6:56) The advantages of using smaller LLMs for specific tasks. 

- (9:42) Brandon’s thoughts on the cost of training LLMs. 

- (10:50) Details about the current state of fine-tuning LLMs. 

- (12:20) What quantization is and what it does. 

- (21:16) What Atlas is and what it allows you to do.

- (27:30) Training code models versus language models.

- (32:19) Details around evaluating different models.

- (38:34) The opportunity for smaller companies to build open-source models. 

- (42:00) Prompt chaining versus fine-tuning models.

Resources mentioned:

Brandon Duderstadt - https://www.linkedin.com/in/brandon-duderstadt-a3269112a/

Nomic AI - https://www.linkedin.com/company/nomic-ai/

Nomic AI Website - https://home.nomic.ai/

Thanks for listening to the Gradient Dissent podcast, brought to you by Weights & Biases. If you enjoyed this episode, please leave a review to help get the word out about the show. And be sure to subscribe so you never miss another insightful conversation.

#OCR #DeepLearning #AI #Modeling #ML

On this episode, we’re joined by Soumith Chintala, VP/Fellow at Meta and Co-Creator of PyTorch. The design decisions Soumith and his colleagues made for their open-source framework shaped both the development process and the end-user experience of what would become PyTorch.

We discuss:

- The history of PyTorch’s development and TensorFlow’s impact on development decisions.

- How a symbolic execution model affects the implementation speed of an ML compiler.

- The strengths of different programming languages in various development stages.

- The importance of customer engagement as a measure of success instead of hard metrics.

- Why community-guided innovation offers an effective development roadmap.

- How PyTorch’s open-source nature cultivates an efficient development ecosystem.

- The role of community building in consolidating assets for more creative innovation.

- How to protect community values in an open-source development environment.

- The value of an intrinsic organizational motivation structure.

- The ongoing debate between open-source and closed-source products, especially as it relates to AI and machine learning.



Resources:

- Soumith Chintala

https://www.linkedin.com/in/soumith/

- Meta | LinkedIn

https://www.linkedin.com/company/meta/

- Meta | Website

https://about.meta.com/

- PyTorch

https://pytorch.org/





On this episode, we’re joined by Andrew Feldman, Founder and CEO of Cerebras Systems. Andrew and the Cerebras team are responsible for building the largest-ever computer chip and the fastest AI-specific processor in the industry.

We discuss:

- The advantages of using large chips for AI work.

- Cerebras Systems’ process for building chips optimized for AI.

- Why traditional GPUs aren’t the optimal machines for AI work.

- Why efficiently distributing computing resources is a significant challenge for AI work.

- How much faster Cerebras Systems’ machines are than other processors on the market.

- Reasons why some ML-specific chip companies fail and what Cerebras does differently.

- Unique challenges for chip makers and hardware companies.

- Cooling and heat-transfer techniques for Cerebras machines.

- How Cerebras approaches building chips that will fit the needs of customers for years to come.

- Why the strategic vision for what data to collect for ML needs more discussion.

Resources:

Andrew Feldman - https://www.linkedin.com/in/andrewdfeldman/

Cerebras Systems - https://www.linkedin.com/company/cerebras-systems/

Cerebras Systems | Website - https://www.cerebras.net/


On this episode, we’re joined by Harrison Chase, Co-Founder and CEO of LangChain. Harrison and his team at LangChain are on a mission to make the process of creating applications powered by LLMs as easy as possible.

We discuss:

- What LangChain is and examples of how it works. 

- Why LangChain has gained so much attention. 

- When LangChain started and what sparked its growth. 

- Harrison’s approach to community-building around LangChain. 

- Real-world use cases for LangChain.

- What parts of LangChain Harrison is proud of and which parts can be improved.

- Details around evaluating effectiveness in the ML space.

- Harrison's opinion on fine-tuning LLMs.

- The importance of detailed prompt engineering.

- Predictions for the future of LLM providers.


Resources:


Harrison Chase - https://www.linkedin.com/in/harrison-chase-961287118/

LangChain | LinkedIn - https://www.linkedin.com/company/langchain/

LangChain | Website - https://docs.langchain.com/docs/





On this episode, we’re joined by Jean Marc Alkazzi, Applied AI at idealworks. Jean focuses his attention on applied AI, leveraging the use of autonomous mobile robots (AMRs) to improve efficiency within factories and more.

We discuss:

- Use cases for autonomous mobile robots (AMRs) and how to manage a fleet of them. 

- How AMRs interact with humans working in warehouses.

- The challenges of building and deploying autonomous robots.

- Computer vision vs. other types of localization technology for robots.

- The purpose and types of simulation environments for robotic testing.

- The importance of aligning a robotic fleet’s workflow with concrete business objectives.

- What the update process looks like for robots.

- The importance of avoiding your own biases when developing and testing AMRs.

- The challenges associated with troubleshooting ML systems.

Resources: 

Jean Marc Alkazzi - https://www.linkedin.com/in/jeanmarcjeanazzi/

idealworks | LinkedIn - https://www.linkedin.com/company/idealworks-gmbh/

idealworks | Website - https://idealworks.com/


On this episode, we’re joined by Stella Biderman, Executive Director at EleutherAI and Lead Scientist - Mathematician at Booz Allen Hamilton.

EleutherAI is a grassroots collective that enables open-source AI research and focuses on the development and interpretability of large language models (LLMs).

We discuss:

- How EleutherAI got its start and where it's headed.

- The similarities and differences between various LLMs.

- How to decide which model to use for your desired outcome.

- The benefits and challenges of reinforcement learning from human feedback.

- Details around pre-training and fine-tuning LLMs.

- Which types of GPUs are best when training LLMs.

- What separates EleutherAI from other companies training LLMs.

- Details around mechanistic interpretability.

- Why understanding what and how LLMs memorize is important.

- The importance of giving researchers and the public access to LLMs.

Resources:

Stella Biderman - https://www.linkedin.com/in/stellabiderman/

EleutherAI | LinkedIn - https://www.linkedin.com/company/eleutherai/

EleutherAI | Website - https://www.eleuther.ai/


On this episode, we’re joined by Aidan Gomez, Co-Founder and CEO at Cohere. Cohere develops and releases a range of innovative AI-powered tools and solutions for a variety of NLP use cases.

We discuss:

- What “attention” means in the context of ML.

- Aidan’s role in the “Attention Is All You Need” paper.

- What state-space models (SSMs) are, and how they could be an alternative to transformers. 

- What it means for an ML architecture to saturate compute.

- Details around data constraints for when LLMs scale.

- Challenges of measuring LLM performance.

- How Cohere is positioned within the LLM development space.

- Insights around scaling down an LLM into a more domain-specific one.

- Concerns around synthetic content and AI changing public discourse.

- The importance of raising money at healthy milestones for AI development.

Resources:

Aidan Gomez - https://www.linkedin.com/in/aidangomez/

Cohere | LinkedIn - https://www.linkedin.com/company/cohere-ai/

Cohere | Website - https://cohere.ai/

- “Attention Is All You Need”





Jonathan Frankle, Chief Scientist at MosaicML and Assistant Professor of Computer Science at Harvard University, joins us on this episode. With comprehensive infrastructure and software tools, MosaicML aims to help businesses train complex machine-learning models using their own proprietary data.

We discuss:

- Details of Jonathan’s Ph.D. dissertation, which explores his “Lottery Ticket Hypothesis.”

- The role of neural network pruning and how it impacts the performance of ML models.

- Why transformers will be the go-to way to train NLP models for the foreseeable future.

- Why the process of speeding up neural net learning is both scientific and artisanal. 

- What MosaicML does, and how it approaches working with clients.

- The challenges for developing AGI.

- Details around ML training policy and ethics.

- Why data brings the magic to customized ML models.

- The many use cases for companies looking to build customized AI models.

Resources:

Jonathan Frankle - https://www.linkedin.com/in/jfrankle/

MosaicML | Website - https://mosaicml.com/

- “The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks”




About This Episode

Shreya Shankar is a computer scientist, PhD student in databases at UC Berkeley, and co-author of "Operationalizing Machine Learning: An Interview Study", an ethnographic interview study with 18 machine learning engineers across a variety of industries on their experience deploying and maintaining ML pipelines in production.

Shreya explains the high-level findings of "Operationalizing Machine Learning": the variables that indicate a successful deployment (velocity, validation, and versioning), common pain points, and a grouping of the MLOps tool stack into four layers. Shreya and Lukas also discuss examples of data challenges in production, Jupyter Notebooks, and reproducibility.

Show notes (transcript and links): http://wandb.me/gd-shreya

---

💬 *Host:* Lukas Biewald

---

*Subscribe and listen to Gradient Dissent today!*

👉 Apple Podcasts: http://wandb.me/apple-podcasts

👉 Google Podcasts: http://wandb.me/google-podcasts

👉 Spotify: http://wandb.me/spotify

About This Episode

In this episode of Gradient Dissent, Lukas interviews Dave Rogenmoser (CEO & Co-Founder) and Saad Ansari (Director of AI) of Jasper AI, a generative AI company focused on text generation for content like blog posts, articles, and more. The company has seen impressive growth since its launch at the start of 2021.

Lukas talks with Dave and Saad about how Jasper AI has so successfully sold the capabilities of large language models as a product, and how the team continually improves that product and takes advantage of advances in the AI industry at large.

They also discuss how they keep the business ahead of the competition, where they focus their R&D efforts, and how they keep the insights they've learned over the years relevant as the company grows in headcount and valuation.

Other topics include potential uses of generative AI in domains it hasn't yet reached, as well as the role that community and user feedback plays in the continual tweaking and tuning of machine learning models.

Connect with Dave & Saad:

Find Dave on Twitter and LinkedIn.

Find Saad on LinkedIn.

---

💬 Host: Lukas Biewald

---

Subscribe and listen to Gradient Dissent today!

👉 Apple Podcasts: http://wandb.me/apple-podcasts

👉 Google Podcasts: http://wandb.me/google-podcasts

👉 Spotify: http://wandb.me/spotify