#VisionLanguageModel

Latest posts tagged with #VisionLanguageModel on Bluesky


HiViS Boosts Vision-Language Model Speed with Visual Token Hiding

HiViS cuts the drafter’s prefill sequence to just 0.7%‑1.3% of the original input, delivering up to 2.65× faster inference without quality loss in multimodal AI. Read more: getnews.me/hivis-boosts-vision-lang... #hivis #visionlanguagemodel


#MistralAI Document #AI: Advanced #OCR solution for complex document processing 📄

📺 www.youtube.com/watch?v=yrx...

🔧 Fine-tuned #VisionLanguageModel specifically designed for document understanding beyond traditional #OCR limitations that plague most business workflows

🧵 👇


A Pan-Organ Vision-Language Model for Generalizable 3D CT Representations
Beeche, C. A., Chen, T. et al.
Paper
Details
#3DCTRepresentations #VisionLanguageModel #MedicalImagingInnovation


To try Qwen2.5-VL:
1. Install Ollama: https://ollama.com/download
2. Download and run the model: ollama run qwen2.5vl:7b
3. Example prompt: Describe this picture /path/to/file.png
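Beyond the interactive CLI, the same model can be reached over Ollama's local REST API. A minimal sketch of building the request payload, assuming the default local endpoint and the documented `/api/generate` schema (the image travels as a base64 string in the `images` list):

```python
import base64

# Ollama's default local endpoint (assumption: default install, default port).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_vision_request(model: str, prompt: str, image_path: str) -> dict:
    """Build the JSON payload for a vision prompt: the image file is read
    and base64-encoded into the `images` field of the request body."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "model": model,            # e.g. "qwen2.5vl:7b"
        "prompt": prompt,          # e.g. "Describe this picture"
        "images": [image_b64],     # one or more base64-encoded images
        "stream": False,           # single JSON response instead of a stream
    }
```

POSTing this dict as JSON to `OLLAMA_URL` (with `requests` or `urllib`) returns the model's description in the response's `response` field.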

#OpenSource #VLM #LLM #VisionLanguageModel

Original post on mastodon.ronandev.ovh

I just tried Qwen2.5-VL:7b, the latest vision model from Qwen (Alibaba).

- it is multilingual, but it works better with instructions in English, and perhaps even in Chinese
- I was able to use the interactive chat to have it analyze an image
- the 7B version is quite good […]


Can you ask questions about an image?

A 'Vision Language Model' links your question with the visual content of an image. It will generate a full response to your question.
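The linking step can be illustrated with a toy sketch: the image becomes a sequence of patch embeddings, the question becomes token embeddings, and both are concatenated into one sequence the language model attends over. All dimensions here are made up for illustration; a real VLM uses a pretrained image encoder and LLM.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions for illustration only).
n_patches, n_tokens, d_model = 16, 5, 32

# 1. The vision encoder turns the image into a sequence of patch embeddings.
patch_embeddings = rng.normal(size=(n_patches, d_model))

# 2. The question is tokenized and embedded like ordinary text.
question_embeddings = rng.normal(size=(n_tokens, d_model))

# 3. Both sequences are concatenated, so when generating the answer the
#    language model attends to image patches and question tokens jointly.
decoder_input = np.concatenate([patch_embeddings, question_embeddings], axis=0)

print(decoder_input.shape)  # one joint (patches + tokens) sequence
```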

#learnAI #VisionLanguageModel


#UITARS Desktop: The Future of Computer Control through Natural Language 🖥️

🎯 #ByteDance introduces GUI agent powered by #VisionLanguageModel for intuitive computer control

Code: lnkd.in/eNKasq56
Paper: lnkd.in/eN5UPQ6V
Models: lnkd.in/eVRAwA-9

#ai

🧵 ↓

OmniVision-968M: World's Smallest Vision Language Model. A pocket-size multimodal model with 9x token reduction for on-device deployment.

Sub-1B #Vision Language Model: Introducing OmniVision-968M 🔍

#NexaAI introduces #OmniVision, 968M #VisionLanguageModel for edge devices with 9x token reduction & enhanced accuracy via #DPO. Based on #Qwen & #SigLIP architecture. Try demo on #HuggingFace

nexa.ai/blogs/omni-v...

#ai

Visualizing Promptable and Open-Vocabulary Segmentation Across Multiple Datasets

Explore a collection of visualizations demonstrating the effectiveness of promptable and open-vocabulary segmentation across various datasets. #visionlanguagemodel

Evaluating Promptable Segmentation with Uniform Point Grids and Bounding Boxes on Diverse Datasets

This evaluation explores promptable segmentation using uniform point grids and ground-truth bounding boxes across various datasets. #visionlanguagemodel

Advanced Open-Vocabulary Segmentation with Uni-OVSeg

Uni-OVSeg combines CLIP, multi-scale pixel decoders, and visual prompts for effective open-vocabulary segmentation, boosting weakly-supervised learning. #visionlanguagemodel

Uni-OVSeg: A Step Towards Efficient and Bias-Resilient Vision Systems

Uni-OVSeg advances open-vocabulary segmentation, benefiting sectors like medical imaging and autonomous vehicles while addressing the risk of AI bias in datasets. #visionlanguagemodel

Uni-OVSeg Outperforms Weakly-Supervised and Fully-Supervised Methods in Open-Vocabulary Segmentation

Uni-OVSeg outperforms weakly-supervised and fully-supervised methods in open-vocabulary segmentation, showing superior results on datasets like PASCAL and COCO. #visionlanguagemodel

Uni-OVSeg: Weakly-Supervised Open-Vocabulary Segmentation with Cutting-Edge Performance

Uni-OVSeg offers a breakthrough in open-vocabulary segmentation, reducing reliance on triplets and achieving superior performance, surpassing current models.
#visionlanguagemodel

The Baseline and Uni-OVSeg Framework for Open-Vocabulary Segmentation

The baseline for open-vocabulary segmentation uses image-text and image-mask pairs with the CLIP model for feature extraction. #visionlanguagemodel
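The core open-vocabulary step in such a CLIP-style baseline can be sketched as matching mask-region embeddings against class-name text embeddings by cosine similarity. The function and dimensions below are illustrative assumptions, not the paper's actual code:

```python
import numpy as np

def classify_masks(mask_feats: np.ndarray, text_feats: np.ndarray) -> np.ndarray:
    """Assign each mask embedding the index of its most similar text
    embedding: CLIP-style zero-shot classification of segment regions.

    mask_feats: (n_masks, d) embeddings of mask regions from the image encoder
    text_feats: (n_classes, d) embeddings of class names from the text encoder
    """
    # L2-normalize so the dot product is cosine similarity, as in CLIP.
    m = mask_feats / np.linalg.norm(mask_feats, axis=1, keepdims=True)
    t = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    sims = m @ t.T               # (n_masks, n_classes) similarity matrix
    return sims.argmax(axis=1)   # best class index per mask
```

Because the class names are only text at inference time, the same code handles categories never seen during training, which is what makes the vocabulary "open".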

Datasets and Evaluation Methods for Open-Vocabulary Segmentation Tasks

Explore the datasets, including SA-1B and image-text pairs, used for training open-vocabulary segmentation. #visionlanguagemodel

The Future of Segmentation: Low-Cost Annotation Meets High Performance

Explore the evolution of segmentation techniques, from semantic to open-vocabulary segmentation, and the role of vision-language models in improving performance #visionlanguagemodel

Defining Open-Vocabulary Segmentation: Problem Setup, Baseline, and the Uni-OVSeg Framework

Get familiar with the open-vocabulary segmentation problem, where the aim is to segment images into masks associated with unseen semantic categories. #visionlanguagemodel


Google DeepMind’s PaliGemma: A Small But Mighty Open-Source Vision-Language Model.

See here - techchilli.com/news/google-...

#GoogleDeepMind #PaliGemma #VisionLanguageModel #AI #TechInnovation #OpenSource #MachineLearning #AIEfficiency #TechTrends #FutureOfAI #ArtificialIntelligence
