Artificial Intelligence l l AI Updates
News about AI & DL & ML!!! Admin: @Gayrat_Tangriberganov
Subscribers: 1,615
Posts Archive
GitHub_Link
Distributed Swarm Trajectory Optimization for Formation Flight in Dense Environments
Join my channel:
https://t.me/Artificial_Intelligence_Updates
GitHub_Link
PDNet: Toward Better One-Stage Object Detection With Prediction Decoupling
#ObjectDetection
GitHub_Link
I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models
#DiffusionModels
GitHub_Link
Fine-tune SAM (Segment Anything Model) for computer vision tasks such as semantic segmentation, matting, detection ... in specific scenarios
#SAM
GitHub_Link
Prompt Segment Anything
#SAM
GitHub_Link
Use open-source LLMs in PostgreSQL with Ollama and the new open-source extension pgai
What is Ollama?
Ollama is an easy and popular way to use open-source language models like Llama 3, Mistral, Phi 3, and Gemma. Unlike proprietary models, open-source models can run locally, which keeps your data private, and they are free (hardware costs aside) and customizable.
What is pgai?
Pgai is an open-source PostgreSQL extension that integrates AI models with PostgreSQL data, simplifying AI engineering for developers familiar with PostgreSQL and facilitating RAG and search.
What can I do with pgai and Ollama?
Create embeddings on PostgreSQL data using models like BERT and Llama 3, storing them in pgvector for easy search and RAG. Perform RAG and LLM reasoning tasks using models like Llama 3, Mistral, and Gemma, enabling summarization, categorization, and data enrichment via SQL queries.
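To make the retrieval step concrete, here is a minimal Python sketch of what "embed, search, then prompt" looks like. It does not call pgai, pgvector, or Ollama: the rows, the 3-dimensional embeddings, and the helper names (`cosine_distance`, `top_k`, `build_prompt`) are all made up for illustration. The only thing borrowed from the real stack is the metric, since cosine distance is what pgvector's `<=>` operator computes.

```python
import math

def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity), the metric behind pgvector's `<=>`."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def top_k(query_vec, rows, k=2):
    """Return the k rows whose embeddings are closest to the query vector."""
    return sorted(rows, key=lambda row: cosine_distance(query_vec, row["embedding"]))[:k]

def build_prompt(question, context_rows):
    """Assemble a RAG-style prompt from the retrieved rows (hypothetical template)."""
    context = "\n".join(f"- {row['text']}" for row in context_rows)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Toy table: in a real setup these would be PostgreSQL rows, and the
# embeddings would come from a model rather than being typed in by hand.
ROWS = [
    {"id": 1, "text": "pgvector stores embeddings in PostgreSQL.", "embedding": [0.9, 0.1, 0.0]},
    {"id": 2, "text": "Ollama runs open-source LLMs locally.", "embedding": [0.1, 0.9, 0.0]},
    {"id": 3, "text": "Bananas are rich in potassium.", "embedding": [0.0, 0.1, 0.9]},
]

query_vec = [0.8, 0.3, 0.0]  # made-up embedding of the question
hits = top_k(query_vec, ROWS, k=2)
prompt = build_prompt("Where are embeddings stored?", hits)
print(prompt)
```

The resulting prompt is what you would hand to a local model such as Llama 3. With pgai and pgvector, the embedding, the nearest-neighbour search, and the model call would be expressed in SQL instead.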
#LLMs
GitHub_Link
TF-ID: Table/Figure IDentifier for academic papers
Seeing the open-source community develop small, cost-effective customized vision-language models (VLMs) that outperform the much larger closed-source APIs is really impressive.
One of them is the TF-ID model by Yifei Huang. It's a fine-tuned version of Florence-2, the small but very powerful VLM by Microsoft. TF-ID (Table/Figure IDentifier) is a family of object detection models fine-tuned to extract tables and figures from academic papers. Interestingly, the author labeled 4,600 images by hand, ensuring high data quality!
#VLMs
GitHub_Link
ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild
It's really cool to see the open-source community creating small and cheap customized vision-language models (VLMs) that outperform the much larger closed-source APIs.
ChartGemma is a fine-tuned version of PaliGemma created by Megh Thakkar and team, which excels at answering questions regarding charts and plots. The idea is pretty simple: first use a closed-source API like Gemini 1.5 Flash to collect training data, then fine-tune the open PaliGemma model on it. You end up with a model that is much smaller and cheaper to run for this specific niche task, and it outperforms the closed-source APIs!
#VLMs
GitHub_Link
Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training
#NeRF
GitHub_Link
OmniDrive: LLM-Agent for Autonomous Driving with 3D Perception, Reasoning and Planning
#LLMs
GitHub_Link
Shape of Motion: 4D Reconstruction from a Single Video
GitHub_Link
DataComp for Language Models
Apple has entered the game! Apple just released a 7B open-source LLM along with its weights, training code, and dataset!
7B base model, trained on 2.5T tokens from open datasets
Primarily English data and a 2048-token context window
Trained on combined DCLM-BASELINE, StarCoder, and ProofPile2 data
MMLU 0.6372: above Mistral, below Llama 3
Open license: Apple Sample Code License
Matches closed-dataset models like Mistral
Trained using PyTorch with the OpenLM framework
Available on Hugging Face and in Transformers
#LLMs
GitHub_Link
DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting
#SceneTextSpotting
GitHub_Link
MetaSeg: Packaged version of Segment Anything
#SAM
GitHub_Link
Robotic Transformer 2 (RT-2): The Vision-Language-Action Model
#Robotics
GitHub_Link
HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation
#Robotics
GitHub_Link
OmniTokenizer: one model and one weight for image-video joint tokenization
#VideoGeneration
GitHub_Link
Big news: Andrej Karpathy is launching a new AI education company called Eureka Labs. Its first product will be LLM101n, billed as the world's best AI course.
#LLMs
GitHub_Link
UW-Madison-GI-Tract-Segmentation-Data-Tools
#MedicalAi
GitHub_Link
PowerPaint: A Versatile Image Inpainting Model
