This course delves into the complexities of assessing the quality of large language model outputs. It examines the challenges enterprises face due to the subjective and sometimes incorrect nature of LLM responses, including hallucinations and inconsistent results. The course introduces...
Build with Vertex AI: Evaluating Model and Agent Performance
With the Vertex AI Gen AI evaluation service, enterprises can evaluate any generative model, agent, or application and benchmark the results against their own judgement, using their own evaluation criteria. This learning path gives you real-world, hands-on experience using the Gen AI evaluation service to evaluate single models, compare models, and assess Gen AI agent performance for your specific use cases.
This path is part of the curriculum to earn the Build with Vertex Technical Expert Badge. To earn your Technical Expert Badge, you need to hold a valid Professional Google Cloud Certification, earn the Skill Badge at the end of this path, and earn the other four Skill Badges to showcase your Build with Vertex AI skills.
You can earn the other required Skill Badges in these paths: Build with Vertex AI: Working with Gemini; Build with Vertex AI: Generating, Editing, and Responding to Media; and Build with Vertex AI: Building RAG Applications with Vertex AI.
This course equips machine learning practitioners with the essential tools, techniques, and best practices for evaluating both generative and predictive AI models. Model evaluation is a critical discipline for ensuring that ML systems deliver reliable, accurate, and high-performing results in...

Complete the Evaluate Gen AI model and agent performance skill badge to demonstrate your ability to use the Gen AI evaluation service. You will evaluate models to select the best model for a given task, compare models against each other...