Computer Vision
32 articles in this category (Page 1 of 2)
AI News | Computer Vision | Machine Learning
Meta AI's EUPE: A <100M Parameter Universal Vision Encoder Rivaling Specialists
Meta AI introduces EUPE, a compact vision encoder with under 100M parameters that matches domain-expert models in classification and dense prediction and runs with 55.2 ms latency on an iPhone 15 Pro.
AI News | Computer Vision | Artificial Intelligence
Salesforce AI Introduces FOFPred: A Language-Driven Future Optical Flow Prediction Framework
FOFPred, a new framework from Salesforce AI, achieves state-of-the-art results on robot manipulation benchmarks, reaching a 78.7% Task 5 success rate on CALVIN.
AI News | Multimodal AI | Computer Vision
Meta AI Open-Sourced Perception Encoder Audiovisual (PE-AV): The Audiovisual Encoder Powering SAM Audio And Large Scale Multimodal Retrieval
Meta AI released PE-AV, a multimodal encoder that achieves state-of-the-art performance on audio and video benchmarks, including a 10.4-point R@1 improvement on AudioCaps.
AI News | Language Model | Computer Vision
Zhipu AI Releases GLM-4.6V: A 128K Context Vision Language Model with Native Tool Calling
Zhipu AI launched GLM-4.6V, a 106B-parameter multimodal model with a 128K-token context window and native multimodal function calling for improved agent capabilities.