Applications
12 articles in this category
A Coding Implementation to Automating LLM Quality Assurance with DeepEval, Custom Retrievers, and LLM-as-a-Judge Metrics
A code walkthrough for automating LLM quality assurance with DeepEval, custom retrievers, and LLM-as-a-judge metrics.
Google DeepMind’s WeatherNext 2 Uses Functional Generative Networks For 8x Faster Probabilistic Weather Forecasts
Google DeepMind’s WeatherNext 2 achieves a 6.5% CRPS improvement over GenCast, delivering faster and more accurate probabilistic weather forecasts.
Baidu Releases ERNIE-4.5-VL-28B-A3B-Thinking: An Open-Source and Compact Multimodal Reasoning Model Under the ERNIE-4.5 Family
Baidu’s ERNIE-4.5-VL-28B-A3B-Thinking activates 3B parameters per token out of 30B total parameters, outperforming larger models on multimodal benchmarks.
Creating AI-Ready APIs: Best Practices for Enhancing AI Performance and Reliability
Explore Postman's checklist for building AI-ready APIs, emphasizing machine-readable metadata, error semantics, and consistency to ensure AI agents interact reliably with your systems.
Liquid AI Releases LFM2-ColBERT-350M: A Compact Late Interaction Model for Multilingual Cross-Lingual Retrieval
Liquid AI introduces LFM2-ColBERT-350M, a 350M-parameter late interaction retriever optimized for multilingual and cross-lingual search, offering high accuracy and fast inference.