January 5, 2024

Google’s Gemini: new State-of-the-art multimodal giant

Google has made significant strides in AI with the introduction of the highly anticipated Gemini family of models. Following the success of the PaLM models, Google DeepMind has unveiled a new generation of high-end generative models, this time with multimodal capabilities.

What is Gemini AI?

Gemini is a family of generative AI models developed by Google DeepMind and designed for multimodal use cases. It posts remarkable benchmark results on image, audio, video, text, and code understanding. Google reports that it outperforms state-of-the-art models such as GPT-4 on some benchmarks, and on one of them (MMLU) it even exceeds the score of human experts.

According to Google, Gemini Ultra, the largest model in the family, achieves state-of-the-art results on 30 out of 32 widely used LLM benchmarks.

Sizes

The Gemini family comes in three sizes, Ultra, Pro, and Nano, suitable for applications ranging from complex reasoning tasks to memory-constrained on-device use cases.

Ultra for highly complex tasks such as reasoning and multimodal tasks
Pro for enhanced performance and deployability at scale
Nano for on-device applications

Each size is tailored to different computational limits and application requirements.

Comparison of SOTA

Gemini surpasses OpenAI’s GPT models in multiple benchmark evaluations, setting a new state of the art across a wide range of text, image, audio, and video benchmarks.

On the MMLU benchmark, Gemini Ultra is reported to outperform all existing models, including GPT-4, with an accuracy of 90.04%. MMLU is a popular benchmark that measures knowledge across 57 subjects, including advanced science, technology, engineering, and mathematics (STEM) subjects. Human experts score 89.8% on MMLU, and Gemini Ultra is the first model to exceed this threshold.

Gemini Ultra also surpasses GPT-4 Vision on the MMMU benchmark, scoring 59% against 56% for GPT-4 Vision, which ranks second.

MMMU evaluates a model’s multimodal capabilities on questions that demand advanced perception and deliberate reasoning.

AI Safety

LLM safety is the ability of an LLM to avoid causing harm to its users. Without safety precautions, an LLM cannot be relied on in the long run. Safety filters should be enabled to screen both prompts and responses for toxic language and hate speech.

Google is one of the forerunners in AI safety policy, and the Gemini models are trained in accordance with Google’s AI Principles (2023). The Gemini API has built-in protections against core harms, such as content that endangers child safety.

The adjustable safety filters in Gemini cover the following categories:

Harassment
Hate speech
Sexually explicit
Dangerous
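
To make the four adjustable categories above concrete, here is a minimal sketch of how a per-category safety configuration could be assembled for a Gemini API request. The category and threshold strings follow the names published in the Gemini API reference as far as we know, but treat them as assumptions and confirm them against the current documentation. The helper `build_safety_settings` is a hypothetical name used only for illustration, and nothing here contacts the API.

```python
import json

# Article category -> API enum name. These strings are assumptions taken
# from the public Gemini API reference; verify them before relying on them.
ADJUSTABLE_CATEGORIES = {
    "Harassment": "HARM_CATEGORY_HARASSMENT",
    "Hate speech": "HARM_CATEGORY_HATE_SPEECH",
    "Sexually explicit": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "Dangerous": "HARM_CATEGORY_DANGEROUS_CONTENT",
}

# Blocking thresholds, from most permissive to most restrictive (assumed names).
ALLOWED_THRESHOLDS = (
    "BLOCK_NONE",
    "BLOCK_ONLY_HIGH",
    "BLOCK_MEDIUM_AND_ABOVE",
    "BLOCK_LOW_AND_ABOVE",
)


def build_safety_settings(overrides=None, default="BLOCK_MEDIUM_AND_ABOVE"):
    """Return one {"category", "threshold"} entry per adjustable category.

    `overrides` maps an article category label (e.g. "Harassment") to a
    threshold; categories without an override use `default`.
    """
    overrides = overrides or {}
    unknown = set(overrides) - set(ADJUSTABLE_CATEGORIES)
    if unknown:
        raise ValueError(f"Unknown categories: {sorted(unknown)}")

    settings = []
    for label, category in ADJUSTABLE_CATEGORIES.items():
        threshold = overrides.get(label, default)
        if threshold not in ALLOWED_THRESHOLDS:
            raise ValueError(f"Unknown threshold: {threshold}")
        settings.append({"category": category, "threshold": threshold})
    return settings


if __name__ == "__main__":
    # Relax only the harassment filter; keep the default for the rest.
    print(json.dumps(build_safety_settings({"Harassment": "BLOCK_ONLY_HIGH"}), indent=2))
```

The resulting list is shaped like the `safetySettings` field of a request body. Validating labels and thresholds up front means a typo fails locally instead of being silently ignored or rejected by the service.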


How to access Gemini AI?

Currently, Google offers free access through Google AI Studio to Gemini Pro, which takes text input, and Gemini Pro Vision, which takes text and image input. Gemini Pro can also be tried through the Bard chatbot, which now runs a fine-tuned version of Gemini Pro in place of PaLM 2.

Gemini Nano is meant exclusively for on-device use. At present, the Pixel 8 Pro is the smartphone engineered to run it, and it powers new features such as Summarize in the Recorder app and Smart Reply in Gboard.

Gemini Ultra is undergoing extensive trust and safety checks and refinement with reinforcement learning from human feedback (RLHF), and it will be available in Bard Advanced in 2024.
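
For the AI Studio route described above, a text prompt can also be sent to Gemini Pro programmatically. The following is a minimal sketch using only the Python standard library. It assumes an API key created in AI Studio, the `v1beta` REST endpoint, the `x-goog-api-key` header, and the model name `gemini-pro` as it was offered at the time of writing; all of these may have changed, so check the current API reference. The function names are hypothetical.

```python
import json
import urllib.request

# Assumed base URL of the Gemini REST API; verify against the current docs.
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_generate_request(prompt, api_key, model="gemini-pro"):
    """Build (but do not send) a text-only generateContent request."""
    url = f"{API_ROOT}/models/{model}:generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # key generated in AI Studio
        },
        method="POST",
    )


def generate(prompt, api_key):
    """Send the request and return the first candidate's text.

    Requires network access and a valid key. A blocked or empty response
    may not contain any candidates, in which case this raises KeyError or
    IndexError.
    """
    request = build_generate_request(prompt, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return payload["candidates"][0]["content"]["parts"][0]["text"]
```

Calling `generate("Summarize the Gemini model sizes.", api_key)` with a real key would return the model's reply as a string. Keeping request construction separate from sending makes the payload easy to inspect without spending quota.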


Limitations

While Gemini dazzles with its capabilities, it’s not without limitations.

Gemini is not open source, unlike Meta’s Llama, so we cannot download the weights and fine-tune the model on our own dataset.
Gemini Ultra, the new SOTA model, needs substantial GPU or TPU capacity to run, which is quite expensive.

Conclusion

Google’s new Gemini models are expected to be among the most powerful and flexible LLMs in the near future. They mark a big leap forward in how we use and understand AI. This multimodal giant from Google is likely to change the game, opening up exciting possibilities for creativity and innovation. It is exciting to see what Gemini can do and how it can make a positive impact on the world!

CodeStax.Ai
