News

Google just combined DeepMind and Google Brain into one big AI team, and on Wednesday, the new Google DeepMind shared details on how one of its visual language models (VLMs) is being used to ...
On Monday, a group of AI researchers from Google and the Technical University of Berlin unveiled PaLM-E, a multimodal embodied visual language model (VLM) with 562 billion parameters that ...
... a multimodal model that can reportedly analyze images for content, solve visual puzzles, perform visual text recognition, pass visual IQ tests, and understand natural language instructions.
For instance, in a zero-shot captioning test on the COCO dataset, both the 232M- and 771M-parameter versions of Florence outperformed DeepMind's 80B-parameter Flamingo visual language model with scores of 133 ...