Powering Enterprise AI Applications with Data and Open Source Software

These articles are AI-generated summaries. Please check the original sources for full details.

Transcript

Francisco Javier Arceo discussed the challenges of building production artificial intelligence systems, highlighting the importance of data as the key differentiator and the complexities of managing it effectively. He presented Feast, an open-source feature store, as a solution to address issues like training-serving skew, data consistency, and efficient data serving at scale.

The presentation emphasized that while model development receives significant attention, the majority of effort in production AI lies in the “plumbing” – the data pipelines and infrastructure needed to reliably deliver data to models. Failure to address these data-related challenges can lead to projects failing to deliver business value, resulting in lost investment and a reluctance to fund further AI initiatives.

Why This Matters

Idealized AI workflows assume readily available, clean, and consistent data; real production systems rarely have it. When features are computed differently during training and serving (training-serving skew), the model receives inputs at inference time that differ from those it learned on, which can produce inaccurate predictions and costly errors, particularly in sensitive domains like finance. The 87% failure rate for data science projects cited in the presentation underscores the need for robust data infrastructure and tools to bridge the gap between experimentation and production.

Key Insights

  • 87% of data science/AI projects fail: The presentation references this widely cited figure to illustrate how hard it is to bring AI projects to production.
  • Training-serving skew: Inconsistent data transformations between model training and inference can lead to significant performance degradation in production.
  • Feast: An open-source feature store designed to manage data consistency, efficiency, governance, and reliability for machine learning applications, used by companies like Robinhood and NVIDIA.
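
To make the training-serving skew point concrete, here is a minimal sketch in plain Python. It is not taken from the presentation: the income feature, the two transform functions, and the sample values are all hypothetical, chosen only to show how two implementations of "the same" transformation can quietly diverge.

```python
import math


def training_transform(income: float) -> float:
    """Training pipeline (hypothetical): natural log of (1 + income)."""
    return math.log1p(income)


def serving_transform(income: float) -> float:
    """Serving re-implementation (hypothetical): mistakenly uses a base-10 log."""
    return math.log10(1.0 + income)


def max_skew(raw_values) -> float:
    """Largest absolute gap between the two transforms over raw_values."""
    return max(abs(training_transform(v) - serving_transform(v)) for v in raw_values)


# Illustrative raw inputs; not real data.
sample_incomes = [0.0, 1_000.0, 50_000.0, 250_000.0]
gap = max_skew(sample_incomes)
print(f"max feature gap between training and serving: {gap:.3f}")
```

Both functions look like a reasonable "log-scale the income" step, yet the model would be trained on one scale and served another. A check like `max_skew` over a sample of raw inputs is one simple way to surface that kind of mismatch before deployment.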

Working Example

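# Example Feast feature view definitions (simplified).
# Assumes a recent Feast release (Entity/Field/schema API); the Parquet path,
# column names, and TTL below are illustrative placeholders.
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource
from feast.types import Int32, String

# Define an entity representing a user, joined on the user_id column
user = Entity(name="user", join_keys=["user_id"])

# Batch source holding the raw feature values (hypothetical file)
user_source = FileSource(
    path="data/user_features.parquet",
    timestamp_field="event_timestamp",
)

# Define a feature view for user age
user_age = FeatureView(
    name="user_age",
    entities=[user],
    ttl=timedelta(days=1),
    schema=[Field(name="age", dtype=Int32)],
    source=user_source,
    description="User's age",
)

# Define a feature view for user city
user_city = FeatureView(
    name="user_city",
    entities=[user],
    ttl=timedelta(days=1),
    schema=[Field(name="city", dtype=String)],
    source=user_source,
    description="User's city",
)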

Practical Applications

  • Risk Modeling (Financial Institutions): Feast enables consistent feature serving for fraud detection and credit scoring, so that risk models see the same feature values in production that they were trained on.
  • Pitfall: Re-implementing data transformation logic in different languages (e.g., Python for training, Java for serving) leads to inconsistencies and increased maintenance overhead.
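
One way to avoid the pitfall above is to keep feature logic in a single function that both the offline (training) and online (serving) paths call. The sketch below is plain Python rather than Feast, and the `RawUser` record, feature names, and helper functions are hypothetical, introduced only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RawUser:
    """Raw input record (hypothetical schema)."""
    user_id: int
    income: float
    age: int


def build_features(user: RawUser) -> dict:
    """Single definition of the feature logic, shared by both paths."""
    return {
        "log_income": math.log1p(user.income),
        "is_adult": int(user.age >= 18),
    }


def make_training_rows(users) -> list:
    """Offline path: transform a batch of historical records."""
    return [build_features(u) for u in users]


def serve_features(user: RawUser) -> dict:
    """Online path: transform one record at request time."""
    return build_features(user)


# Illustrative records; not real data.
history = [RawUser(1, 42_000.0, 31), RawUser(2, 0.0, 17)]
training_rows = make_training_rows(history)
online_row = serve_features(history[0])
print(online_row == training_rows[0])
```

Because both paths go through `build_features`, a change to the transformation is made once and applies everywhere. A feature store such as Feast aims for a similar property at the infrastructure level, by defining features once and serving them to both training and inference.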
