
Data Science

53 articles in this category (Page 3 of 3)

AI News, AI, Open Source

Embedding Atlas: Apple’s Open-Source Tool for Exploring Large-Scale Embeddings Locally

Apple introduces Embedding Atlas, an open-source browser-based tool for visualizing and analyzing large-scale embeddings without backend infrastructure, enabling interactive exploration of high-dimensional data.

AI News, Artificial Intelligence, Data Science

How Can We Build Scalable and Reproducible Machine Learning Experiment Pipelines Using Meta Research Hydra?

This article explains how to use Meta's Hydra framework to create scalable and reproducible ML experiments through structured configurations, overrides, and multirun simulations.

AI News, Big Data, Data Science

Building an End-to-End Data Engineering and Machine Learning Pipeline with PySpark in Google Colab

A step-by-step guide to using PySpark in Google Colab for data transformations, SQL analytics, feature engineering, and machine learning model training.

AI News, Data Science, Open Source

Hugging Face AI Sheets Adds Vision Capabilities for Image-Based Data Analysis

Hugging Face releases a significant update to AI Sheets, introducing vision support to extract data from images, generate visuals from text, and edit images directly within a spreadsheet environment, powered by open-source AI models.

AI News, Data Science, Machine Learning

Hugging Face Enhances Dataset Streaming for 100x Efficiency

Hugging Face has improved dataset streaming in its 'datasets' and 'huggingface_hub' libraries, enabling faster and more efficient training on large datasets. Key improvements include fewer API requests, faster data resolution, and finer control over streaming pipelines.
