AI with Papers - Artificial Intelligence & Deep Learning

All the AI with papers. Every day fresh updates about #DeepLearning #MachineLearning #LLM & #ComputerVision Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/ #AI #chatGPT

Malaysia: 2 198 | Technology & Miscellaneous: 7 629...

📈 Analytics for the Telegram channel AI with Papers - Artificial Intelligence & Deep Learning

AI with Papers - Artificial Intelligence & Deep Learning (@ai_deeplearning) is an active channel in the English-language segment. The community currently has 17 055 subscribers and ranks 7 629th in the Technology & Miscellaneous category and 2 198th in the Malaysia region.

📊 Audience metrics and dynamics

Since its start (date unknown), the project has grown quickly to 17 055 subscribers.

According to the latest data, from 14 July 2026, the channel shows steady activity. The subscriber count changed by -138 over the last 30 days and by -1 over the last 24 hours, while overall reach remains high.

Verification status: Not verified
Engagement (ER): On average, 18.73% of the audience is engaged. In the first 24 hours after publication, content typically collects reactions amounting to 7.49% of the total subscriber count.
Post reach: Each post is viewed 3 195 times on average; about 1 278 of those views typically accumulate in the first day.
Reactions and interaction: The audience is active: each post receives 16 reactions on average.
Topic areas: Content centers on key topics such as framework, object, dataset, tba, depth.
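The page does not state how it computes the engagement figures. As a rough check, and only under the assumption that each percentage is an average view count divided by the subscriber count, the quoted numbers can be reproduced as below. The helper name `share_of_subscribers` is hypothetical, not part of any analytics API.

```python
# Figures quoted in the analytics on this page.
subscribers = 17_055
avg_views_per_post = 3_195
avg_views_first_24h = 1_278


def share_of_subscribers(count: int, total_subscribers: int) -> float:
    """Return count as a percentage of total_subscribers, rounded to 2 decimals."""
    return round(100 * count / total_subscribers, 2)


# Assumption: ER = average views per post / subscribers.
er_overall = share_of_subscribers(avg_views_per_post, subscribers)
# Assumption: first-day figure = average first-24h views / subscribers.
er_first_day = share_of_subscribers(avg_views_first_24h, subscribers)

print(f"ER (all views): {er_overall}%")
print(f"ER (first 24 hours): {er_first_day}%")
```

Under this assumption the results match the 18.73% and 7.49% shown on the page, which suggests that both percentages are view-based ratios rather than reaction counts.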

📝 Description and content policy

The author describes the resource as a space for expressing a personal view:
“All the AI with papers. Every day fresh updates about #DeepLearning #MachineLearning #LLM & #ComputerVision Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/ #AI #chatGPT”

Owing to its high update frequency (latest data collected on 15 July 2026), the channel stays current and wide-reaching. The analytics indicate that the audience actively engages with the content, making the channel a significant point of influence in the Technology & Miscellaneous category.

Subscribers: 17 055
24 hours: -1
7 days: -17
30 days: -138

Post views: 3 195
24 hours: ~1 278
48 hours: ~1 590

Engagement rate: 18.73%

Posts per day: No data

Post archive

🦧 MonkeyOCRv2 is out! 🦧 👉MonkeyOCRv2 is a text-centric visual foundation model that unifies fine-grained text modeling, cross-task representation learning, and cross-lingual generalization in a single encoder. Released for academic research and non-commercial use💙 👉Review https://t.ly/yicEK 👉Paper https://arxiv.org/pdf/2607.11562 👉Repo https://github.com/Yuliang-Liu/MonkeyOCRv2

🎂REMIND: long-term MOT re-ID🎂 👉REMIND by CVAR-UPM is a novel online tracker designed for long-term multi-object re-ID of generic indoor objects from monocular RGB, requiring neither camera pose nor depth. Repo under MIT💙 👉Review https://t.ly/AkQoI 👉Paper https://lnkd.in/dm58mkCv 👉Project https://lnkd.in/dZrAZqFe 👉Repo https://lnkd.in/dbidrwxU

🌔Foundation Global SfM🌔 👉Glob3R is a global SfM-style reconstruction built on 3D foundation models. Key idea: explicitly optimize feed-forward geometric predictions. Repo TBA💙 👉Review https://t.ly/Z_4C7 👉Paper https://arxiv.org/pdf/2607.09225 👉Project https://junyuandeng.github.io/Glob3r/ 👉Repo TBA

💋SAM-MT: Real-Time Multi-Target VOS💋 👉Fudan & Shanghai unveil SAM-MT, an efficient interactive multi-target video segmentation framework that keeps near-single-object efficiency (FPS/VRAM) as the target count increases while maintaining robust video segmentation performance. Repo available💙 👉Review https://t.ly/Z_4C7 👉Paper https://lnkd.in/dvS-iyBD 👉Project https://lnkd.in/daQ8na8T 👉Repo https://lnkd.in/dgbX2tZv

🔥ZipDepth: Depth on Any Device🔥 👉ZipDepth from UniBO is a compact monocular depth network that bridges this gap by combining an efficient reparameterizable encoder-decoder with large-scale knowledge distillation from a foundation model. Repo under MIT💙 👉Review https://t.ly/qYrLZ 👉Paper https://arxiv.org/pdf/2607.08771 👉Project https://zipdepth.github.io/ 👉Repo https://github.com/fabiotosi92/ZipDepth

🏵️SoccerNet 2026 Results🏵️ 👉The SoccerNet 2026 Challenges constitute the sixth annual edition of the SoccerNet open benchmarking effort, dedicated to advancing computer vision research in sports video understanding💙 👉Review https://t.ly/sfD4T 👉Paper https://lnkd.in/dSBgW_3s 👉Project https://lnkd.in/dfdmuvG8

🐈‍⬛Spatial-perception native ViT🐈‍⬛ 👉LingBot-Vision, a vision foundation model pretrained to be spatial-perception native. Better than 7x bigger foundational models. Repo under Apache💙 👉Review https://t.ly/9xIso 👉Paper https://arxiv.org/pdf/2607.05247 👉Project https://technology.robbyant.com/lingbot-vision 👉Repo https://github.com/robbyant/lingbot-vision

🏯Worldwide Semantic Facade🏯 👉Centimeter-accurate, cross-continental facade point clouds with fine-grained semantic segmentation of architectural elements and a hierarchical facade taxonomy. 2.7B Dataset💙 👉Review https://t.ly/PpyFD 👉Paper https://arxiv.org/pdf/2607.02018 👉Project jiangyuanwangyi.github.io/UnderOneFacade_official 👉Data drive.google.com/drive/folders/1Yzz7PmyeK1qeOtkTFCfkbw7IEHXcMJo8

🔥Nvidia SpatialClaw is out🔥 👉From Nvidia a novel training-free framework for spatial reasoning that adopts code as the action interface. SpatialClaw lets a VLM-backed agent write Python in a persistent kernel, composing perception modules, inspecting intermediate results, and revising its strategy across steps. Impressive: +11.2 points on 20 benchmarks💙 👉Review https://t.ly/7JB0x 👉Paper https://arxiv.org/pdf/2606.13673 👉Project https://spatialclaw.github.io/ 👉Repo https://github.com/NVlabs/SpatialClaw

🌒LUNA: Universal 3D Human Animation🌔 👉LUNA by HKUST + META is a novel LBS-free universal neural animation model that directly maps multiple 2D controls like images, keypoints, sketch and unseen characters into 3D-G deformations, bypassing explicit body fitting. 👉Review https://t.ly/ZX9Ex 👉Paper https://arxiv.org/pdf/2606.31981 👉Project https://penghtyx.github.io/LUNA/ 👉Repo N/A 🥲

🛸PriorEye: Geospatial Self-Driving🛸 👉MRG (Oxford) introduces geospatial visual priors to leverage the street-level images in autonomous driving. Consistent improvement in performance. Repo under Apache💙 👉Review https://t.ly/7Jgav 👉Paper https://lnkd.in/dYeD2m7n 👉Project https://lnkd.in/dWJvNemr 👉Repo https://lnkd.in/dNExGGtx

🍀OctoSense: Open Sensing🍀 👉OctoSense is an open-source sensor platform with stereo RGB and event cameras, LiDAR, a thermal camera, an inertial measurement unit, RTK-corrected global positioning system, and proprioception. 👉Review https://t.ly/oFN8L 👉Paper https://lnkd.in/dM3zpyju 👉Project https://lnkd.in/ddrQ3uJ6 👉Repo https://lnkd.in/dhSDjSfG

👋 Hi everyone! Over the past few weeks, the number of join requests has increased dramatically, which unfortunately also means a much higher number of spam accounts and bots (in the last few days around five hundred have been cut off). To help me distinguish real people from fake profiles - and avoid rejecting genuine requests by mistake - I'd really appreciate it if your profile includes: 📷 A real profile photo 👤 Your full name (or something reasonably identifiable) 💬 If you contact me, please use English if possible. I don't speak Russian, Arabic, or Chinese, so if your profile and messages are only in those languages, it's very difficult for me to tell whether you're a real person or an automated account. Thank you for your understanding and for helping keep this damn community welcoming and spam-free! With love, Alessandro 😈

🔊VolHuMe - Volumetric Human Meshes🔊 👉VolHuMe (H/T @Martinella_94) is a novel, high-resolution large-scale dataset of volumetric human meshes with complete 4D GT: multi-view RGB-D, textured meshes, dense point clouds, normal maps, rigged assets, garment segmentation, and SMPL-X fittings in one dataset. Insane💙 👉Review https://t.ly/b5vxy 👉Paper https://arxiv.org/pdf/2606.23062 👉Project giuli13.github.io/volhume-website/# 👉Repo TBA soon

🕷️Human Universal Grasping🕷️ 👉HUG is a flow-matching model that generates diverse human grasps for any user-specified object in a single RGB-D image captured from a stereo camera. 👉Review https://t.ly/VG1Eu 👉Paper https://arxiv.org/pdf/2606.17054 👉Repo https://github.com/KevinyWu/hug 👉Project https://grasping.io/

🔍 Nvidia Locate Anything 🔍 👉Diverse localization tasks under a unified vision-language model, including document understanding, GUI grounding, dense detection, and OCR. Repo released💙 👉Review https://t.ly/PvwFo 👉Paper https://lnkd.in/dWfNpzPZ 👉Project https://lnkd.in/dM89BX-8 👉Repo https://lnkd.in/dC4KCQSM

🪔Latent Decoding with Pixel Diffusion🪔 👉PiD by Nvidia is a plug-and-play diffusion decoder that replaces VAE/RAE decoders, turning latent representations directly into super-resolved pixels in a single pass. Repo under Apache 2.0💙 👉Review https://t.ly/y19mA 👉Paper https://lnkd.in/duVC25C2 👉Project https://lnkd.in/dW6TkzCB 👉Repo https://lnkd.in/dnGdgKRr

🍒Count Anything, Any Granularity🍒 👉Open-world counting as multi-grained counting, where visual exemplars specify target appearance and fine-grained text specifies the intended semantic granularity across five explicit levels. Repo/Data under Apache💙 👉Review https://t.ly/nqz80 👉Paper https://lnkd.in/dp7khTRU 👉Project https://lnkd.in/d_jfX_Yn 👉Repo https://lnkd.in/dkTRGZkG 👉Data https://lnkd.in/dB83jRyT

🦄Unified Correspondence Transformer🦄 👉UniCorrn is the first correspondence model with shared weights that unifies 2D-2D, 2D-3D, and 3D-3D geometric matching with an end-to-end transformer architecture. Repo under CC BY-NC-SA 4.0💙 👉Review https://t.ly/2OBdq 👉Paper https://arxiv.org/pdf/2605.04044 👉Project https://neu-vi.github.io/UniCorrn/ 👉Repo https://github.com/neu-vi/UniCorrn

About the frequency of posting in the channel:

Anonymous voting