AI with Papers - Artificial Intelligence & Deep Learning

All the AI with papers. Every day fresh updates about #DeepLearning #MachineLearning #LLM & #ComputerVision Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/ #AI #chatGPT

📈 Analytical overview of the Telegram channel AI with Papers - Artificial Intelligence & Deep Learning

The channel AI with Papers - Artificial Intelligence & Deep Learning (@ai_deeplearning) is an active player in the English-language segment. The community currently has 17,168 subscribers, ranking 7,718th in the Technologies & Applications category and 2,234th in the Malaysia region.

📊 Audience and activity metrics

Since its founding (date unknown), the project has grown quickly and gathered 17,168 subscribers.

According to the latest data, dated June 20, 2026, the channel maintains stable activity. Over the last 30 days the number of members changed by -169, and over the last 24 hours by 0, while overall reach remains high.

  • Verification status: not verified
  • Engagement rate (ER): average audience engagement is 22.86%. Within the first 24 hours after publication, content typically collects N/A% of reactions relative to total subscribers.
  • Post reach: each post gets an average of 3,926 views. During the first day it typically collects 0 views.
  • Reactions and response: the audience engages regularly; the average number of reactions per post is 26.
  • Topical interests: the content focuses on key topics such as framework, object, dataset, tba, depth.

📝 Description and content policy

The author describes the channel as a space for expressing personal opinions:
All the AI with papers. Every day fresh updates about #DeepLearning #MachineLearning #LLM & #ComputerVision Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/ #AI #chatGPT

Thanks to a high update frequency (latest data dated June 21, 2026), the channel stays current and maintains high reach. Analytics show active audience engagement, making it an important point of influence within the Technologies & Applications category.

Subscribers: 17,168
24 hours: no data
7 days: -35
30 days: -169
Posts archive
🦗Character Mixing Generation🦗 👉MBZUAI unveils the first ever video-gen system able to preserve character ID, behavior & original style while generating plausible interactions between characters that have never coexisted - from cartoons (We Bare Bears, Tom & Jerry) to realistic humans (Mr. Bean, Young Sheldon) 👉Review https://t.ly/tN84a 👉Paper https://lnkd.in/dhKMwukv 👉Project https://lnkd.in/dBkJs48h 👉Repo https://lnkd.in/dw_uzgAk

🐠ITTO: Protocol for Dynamic Tracking🐠 👉ITTO by Caltech is a novel long-range tracking benchmark suite for evaluating and diagnosing tracking methods on complex and long-range motions. Repo under CC BY-NC 4.0💙 👉Review https://t.ly/tN84a 👉Paper https://arxiv.org/pdf/2510.19819 👉Project https://glab-caltech.github.io/ITTO/ 👉Repo https://github.com/ilonadem/itto

🏜️Omni Driving Navigation Models🏜️ 👉OmniNWM is a unified panoramic navigation world model that advances autonomous driving by jointly generating multi-modal states (RGB, semantics, depth, 3D occupancy), enabling precise action control & facilitating closed-loop evaluation through occupancy-based dense rewards. Repo under Apache 2.0💙 👉Review https://t.ly/ktXvz 👉Paper https://lnkd.in/eFKSZnrc 👉Project https://lnkd.in/eSDfccv8 👉Repo https://lnkd.in/efCSvjtp

🔥 SAM 2++: Track Anything 🔥 👉SAM 2++ is a novel unified model towards tracking at any granularity, including masks, boxes, and points. Impressive results but no code announced yet 😢 👉Review https://t.ly/I392_ 👉Paper arxiv.org/pdf/2510.18822 👉Project tracking-any-granularity.github.io/ 👉Repo :(

🌵All-in-One Dense Keypoints🌵 👉DeepDetect is a novel all-in-one, dense keypoints detector that unifies the strengths of SIFT, ORB, BRISK, FAST, AGAST, Harris, Shi-Tomasi, Canny & Sobel into a neural net. DAMN ROMANTIC. Repo under MIT💙 👉Review https://t.ly/VKGct 👉Paper https://arxiv.org/pdf/2510.17422 👉Repo https://github.com/saktx/DeepDetect

🦄 City-Tour -> Simulation 🦄 👉UrbanVerse is a novel system to convert real-world urban scenes from city-tour videos into physics-aware, interactive simulation environments, enabling scalable robot learning in urban spaces with real-world generalization. Repo & Data announced 💙 👉Review https://t.ly/UvXNS 👉Paper https://arxiv.org/pdf/2510.15018 👉Project https://urbanverseproject.github.io/ 👉Repo TBA

🫙Universal Feature Up-Sampling🫙 👉AnyUp is a novel method for feature up-sampling that can be applied to ANY vision feature at ANY resolution, without encoder-specific training: inference-time feature-agnostic up-sampling architecture to improve up-sampling quality. Repo under CC-4.0💙 👉Review https://t.ly/HvEw9 👉Paper https://arxiv.org/pdf/2510.12764 👉Project https://wimmerth.github.io/anyup/ 👉Repo https://github.com/wimmerth/anyup

🫧🫧 Detect Anything via MLLM 🫧🫧 👉Rex-Omni is a 3B-multimodal model that unifies visual perception tasks, including object detection, OCR, pointing, key-pointing & visual prompting into a single next point prediction framework. Impressive results. Full repo under IDEA License 1.0💙 👉Review https://t.ly/DCTk_ 👉Paper https://lnkd.in/d4VDD-9j 👉Project https://lnkd.in/d6unEyvq 👉Repo https://lnkd.in/dkYJFe-x

↗️ TrackVLA++ Visual Tracking↘️ 👉TrackVLA++ is a novel Vision-Language-Action model that incorporates spatial reasoning and target identification memory, enabling SOTA performance in both long-horizon and highly crowded tracking scenarios. Model announced💙 👉Review https://t.ly/ruYzc 👉Paper https://arxiv.org/pdf/2510.07134 👉Project pku-epic.github.io/TrackVLA-plus-plus-Web/ 👉Repo TBA

💄Pixel-Perfect Depth (SOTA)💄 👉Pixel-Perfect Depth is a mono-depth estimation model with pixel-space diffusion transformers. New SOTA. Repo under Apache 2.0💙 👉Review https://t.ly/75PGo 👉Paper https://lnkd.in/d8wxFpyY 👉Project https://lnkd.in/dV5HhsqH 👉Repo https://lnkd.in/d9JKFBJq 👉Demo https://lnkd.in/d3wBkKJ9

🎺Visual Grounding RVOS🎺 👉ReferDINO is a strong RVOS model that inherits region-level vision-language alignment from foundational visual grounding models, and is further endowed with pixel-level dense perception & cross-modal spatio-temporal reasoning. Code, Demo & checkpoints released💙 👉Review https://t.ly/rOdkP 👉Paper https://lnkd.in/efuAFQdE 👉Project https://lnkd.in/dK3wMZqv 👉Repo https://lnkd.in/d3i2PsNF

👉 Proof that I'm not a bot... My (short) interview with one of the biggest Italian media outlets: AI in 2016, HPC / Quantum and how I created my startup: https://www.linkedin.com/posts/visionarynet_ai-itw25-ai-activity-7381215486115643392-t7an Thanks for the support (and of course a new paper is coming in a few hours)

🎷🎷 Clink! Chop! Thud! 🎷🎷 👉Sounding Object Detection: while an environment may contain many objects, only a few are directly involved in producing sound during an interaction. This model detects the sounding object given a video of an object interaction. Code/Data announced💙 👉Review https://t.ly/VK_1h 👉Paper https://lnkd.in/depNjVXm 👉Project https://lnkd.in/dF63EZFG 👉Repo TBA

🔩Code-Centric Agentic Education🔩 👉Show Lab unveils Code2Video: an agentic, code-centric framework that generates HQ educational videos from knowledge points. Unlike pixel-based text-to-video models, this approach leverages executable Manim code to ensure clarity, coherence & reproducibility. Repo under MIT💙 👉Review https://t.ly/Fv4LJ 👉Paper https://arxiv.org/pdf/2510.01174 👉Repo https://github.com/showlab/Code2Video/ 👉Project https://showlab.github.io/Code2Video/

👩‍🦱Physical-Hair Diffusion👩‍🦱 👉CONTROLHAIR is a novel hybrid framework that integrates a physics simulator with conditional video diffusion to enable controllable dynamic hair rendering. Repo announced💙 👉Review https://t.ly/78LHr 👉Paper https://lnkd.in/epm-A9Fq 👉Project https://lnkd.in/evsjz298 👉Repo TBA

👔 Universal Image Restoration 👔 👉LucidFlux by HKUSTGZ is a universal image restoration framework built on a large-scale diffusion transformer that delivers photorealistic restorations of real-world low-quality (LQ) images, outperforming SOTA diffusion-based models across diverse degradations. Repo under custom Non-Commercial License💙 👉Review https://t.ly/Z5cA3 👉Paper https://arxiv.org/pdf/2509.22414 👉Project https://w2genai-lab.github.io/LucidFlux/ 👉Repo https://github.com/W2GenAI-Lab/LucidFlux

🤖 Real-time Interactive Long Video 🤖 👉LONGLIVE by #Nvidia is a frame-level autoregressive framework for real-time & interactive long video generation. LONGLIVE accepts sequential user prompts and generates corresponding videos in real time. Repo under non-commercial license💙 👉Review https://t.ly/jJkdY 👉Paper arxiv.org/pdf/2509.22622 👉Project nvlabs.github.io/LongLive/ 👉Repo github.com/NVlabs/LongLive 🤗huggingface.co/Efficient-Large-Model/LongLive-1.3B

🔥SOTA Detection w/ DINOv3🔥 👉DEIMv2 is the evolution of the DEIM framework, leveraging DINOv3. It comes in various model sizes, from an ultra-light version up to S, M, L, & X, for a wide range of scenarios. Across these variants, DEIMv2 achieves SOTA. Repo under Apache 2.0💙 👉Review https://t.ly/P7jEH 👉Paper arxiv.org/pdf/2509.20787 👉Repo github.com/Intellindust-AI-Lab/DEIMv2 👉Project intellindust-ai-lab.github.io/projects/DEIMv2