AI with Papers - Artificial Intelligence & Deep Learning

All the AI with papers. Every day fresh updates about #DeepLearning #MachineLearning #LLM & #ComputerVision Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/ #AI #chatGPT

📈 Analytical overview of Telegram channel AI with Papers - Artificial Intelligence & Deep Learning

The channel AI with Papers - Artificial Intelligence & Deep Learning (@ai_deeplearning) is an active English-language channel. It currently has 17 168 subscribers and ranks 7 718 in the Technologies & Applications category and 2 234 in the Malaysia region.

📊 Audience metrics and dynamics

The creation date is unknown. Since its creation, the channel has gathered an audience of 17 168 subscribers.

According to the latest data from 20 June 2026, the channel's activity is stable. The subscriber count fell by 169 over the last 30 days and did not change over the last 24 hours, but overall reach remains high.

Verification status: Not verified
Engagement rate (ER): The average audience engagement rate is 22.86%. The share of subscribers who react within the first 24 hours after publication is not available (N/A).
Post reach: On average, each post receives 3 926 views. First-day views are reported as 0.
Reactions and interaction: The audience actively supports content: the average number of reactions per post is 26.
Thematic interests: Content is focused on key topics such as framework, object, dataset, tba, depth.
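
As a rough check on the figures above, the engagement rate can be approximated as average post views divided by subscribers. This is an assumption: the page does not say how ER is computed, and the function name below is hypothetical. A minimal Python sketch:

```python
def engagement_rate(avg_views: float, subscribers: int) -> float:
    """Return average views per post as a percentage of subscribers.

    Assumption: ER = average views / subscribers * 100. The page does
    not document its formula, so this is only one plausible reading.
    """
    if subscribers <= 0:
        raise ValueError("subscribers must be positive")
    return avg_views / subscribers * 100


# Figures reported on this page: 3 926 average views, 17 168 subscribers.
er = engagement_rate(3926, 17168)
print(f"{er:.2f}%")
```

With the page's figures this comes out at roughly 22.87%, close to the reported 22.86%. The small gap may come from truncation or from unrounded inputs on the analytics side.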

📝 Description and content policy

The author describes the channel as follows:
“All the AI with papers. Every day fresh updates about #DeepLearning #MachineLearning #LLM & #ComputerVision Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/ #AI #chatGPT”

Thanks to the high frequency of updates (latest data received on 21 June, 2026), the channel maintains relevance and a high level of publication reach. Analytics show that the audience actively interacts with content, making it an important point of influence in the Technologies & Applications category.

Subscribers: 17 168 (24 hours: no data; 7 days: -35; 30 days: -169)
Post views: 3 926 (24 hours: no data; 48 hours: no data)
Engagement rate: 22.86%
Posts per day: no data
Ads index (beta)

Posts Archive

🦗Character Mixing Generation🦗
👉MBZUAI unveils the first ever video-gen system able to preserve character ID, behavior & original style while generating plausible interactions between characters that have never coexisted, from cartoons (We Bare Bears, Tom & Jerry) to realistic humans (Mr. Bean, Young Sheldon)
👉Review https://t.ly/tN84a
👉Paper https://lnkd.in/dhKMwukv
👉Project https://lnkd.in/dBkJs48h
👉Repo https://lnkd.in/dw_uzgAk

🐠ITTO: Protocol for Dynamic Tracking🐠
👉ITTO by Caltech is a novel long-range tracking benchmark suite for evaluating and diagnosing tracking methods on complex and long-range motions. Repo under CC BY-NC 4.0💙
👉Review https://t.ly/tN84a
👉Paper https://arxiv.org/pdf/2510.19819
👉Project https://glab-caltech.github.io/ITTO/
👉Repo https://github.com/ilonadem/itto

🏜️Omni Driving Navigation Models🏜️
👉OmniNWM is a unified panoramic navigation world model that advances autonomous driving by jointly generating multi-modal states (RGB, semantics, depth, 3D occupancy), enabling precise action control & facilitating closed-loop evaluation through occupancy-based dense rewards. Repo under Apache 2.0💙
👉Review https://t.ly/ktXvz
👉Paper https://lnkd.in/eFKSZnrc
👉Project https://lnkd.in/eSDfccv8
👉Repo https://lnkd.in/efCSvjtp

Repo (pretty empty) now online: https://github.com/OatmealLiu/UrbanVerse

🔥 SAM 2++: Track Anything 🔥
👉SAM 2++ is a novel unified model towards tracking at any granularity, including masks, boxes, and points. Impressive results but no code announced yet 😢
👉Review https://t.ly/I392_
👉Paper arxiv.org/pdf/2510.18822
👉Project tracking-any-granularity.github.io/
👉Repo :(

🌵All-in-One Dense Keypoints🌵
👉DeepDetect is a novel all-in-one, dense keypoints detector that unifies the strengths of SIFT, ORB, BRISK, FAST, AGAST, Harris, Shi-Tomasi, Canny & Sobel into a neural net. DAMN ROMANTIC. Repo under MIT💙
👉Review https://t.ly/VKGct
👉Paper https://arxiv.org/pdf/2510.17422
👉Repo https://github.com/saktx/DeepDetect

🦄 City-Tour -> Simulation 🦄
👉UrbanVerse is a novel system to convert real-world urban scenes from city-tour videos into physics-aware, interactive simulation environments, enabling scalable robot learning in urban spaces with real-world generalization. Repo & Data announced 💙
👉Review https://t.ly/UvXNS
👉Paper https://arxiv.org/pdf/2510.15018
👉Project https://urbanverseproject.github.io/
👉Repo TBA

🫙Universal Feature Up-Sampling🫙
👉AnyUp is a novel method for feature up-sampling that can be applied to ANY vision feature at ANY resolution, without encoder-specific training. It is an inference-time, feature-agnostic up-sampling architecture that improves up-sampling quality. Repo under CC-4.0💙
👉Review https://t.ly/HvEw9
👉Paper https://arxiv.org/pdf/2510.12764
👉Project https://wimmerth.github.io/anyup/
👉Repo https://github.com/wimmerth/anyup

🫧🫧 Detect Anything via MLLM 🫧🫧
👉Rex-Omni is a 3B-multimodal model that unifies visual perception tasks, including object detection, OCR, pointing, key-pointing & visual prompting into a single next point prediction framework. Impressive results. Full repo under IDEA License 1.0💙
👉Review https://t.ly/DCTk_
👉Paper https://lnkd.in/d4VDD-9j
👉Project https://lnkd.in/d6unEyvq
👉Repo https://lnkd.in/dkYJFe-x

↗️ TrackVLA++ Visual Tracking↘️
👉TrackVLA++ is a novel Vision-Language-Action model that incorporates spatial reasoning and target identification memory, enabling SOTA performance in both long-horizon and highly crowded tracking scenarios. Model announced💙
👉Review https://t.ly/ruYzc
👉Paper https://arxiv.org/pdf/2510.07134
👉Project pku-epic.github.io/TrackVLA-plus-plus-Web/
👉Repo TBA

💄Pixel-Perfect Depth (SOTA)💄
👉Pixel-Perfect Depth is a mono-depth estimation model with pixel-space diffusion transformers. New SOTA. Repo under Apache 2.0💙
👉Review https://t.ly/75PGo
👉Paper https://lnkd.in/d8wxFpyY
👉Project https://lnkd.in/dV5HhsqH
👉Repo https://lnkd.in/d9JKFBJq
👉Demo https://lnkd.in/d3wBkKJ9

🎺Visual Grounding RVOS🎺
👉ReferDINO is a strong RVOS model that inherits region-level vision-language alignment from foundational visual grounding models, and is further endowed with pixel-level dense perception & cross-modal spatio-temporal reasoning. Code, Demo & checkpoints released💙
👉Review https://t.ly/rOdkP
👉Paper https://lnkd.in/efuAFQdE
👉Project https://lnkd.in/dK3wMZqv
👉Repo https://lnkd.in/d3i2PsNF

👉 Proof I'm not a bot... My (short) interview with one of the biggest Italian media outlets: AI in 2016, HPC / Quantum and how I created my startup: https://www.linkedin.com/posts/visionarynet_ai-itw25-ai-activity-7381215486115643392-t7an Thanks for the support (and of course a new paper is coming in a few hours)

🎷🎷 Clink! Chop! Thud! 🎷🎷
👉Sounding Object Detection: while an environment may contain many objects, only a few are directly involved in producing sound during an interaction. This model detects the sounding object given a video of an object interaction. Code/Data announced💙
👉Review https://t.ly/VK_1h
👉Paper https://lnkd.in/depNjVXm
👉Project https://lnkd.in/dF63EZFG
👉Repo TBA

🔩Code-Centric Agentic Education🔩
👉Show Lab unveils Code2Video: an agentic, code-centric framework that generates HQ educational videos from knowledge points. Unlike pixel-based text-to-video models, this approach leverages executable Manim code to ensure clarity, coherence & reproducibility. Repo under MIT💙
👉Review https://t.ly/Fv4LJ
👉Paper https://arxiv.org/pdf/2510.01174
👉Repo https://github.com/showlab/Code2Video/
👉Project https://showlab.github.io/Code2Video/

👩‍🦱Physical-Hair Diffusion👩‍🦱
👉CONTROLHAIR is a novel hybrid framework that integrates a physics simulator with conditional video diffusion to enable controllable dynamic hair rendering. Repo announced💙
👉Review https://t.ly/78LHr
👉Paper https://lnkd.in/epm-A9Fq
👉Project https://lnkd.in/evsjz298
👉Repo TBA

👔 Universal Image Restoration 👔
👉LucidFlux by HKUSTGZ is a universal image restoration framework built on a large-scale diffusion transformer that delivers photorealistic restorations of real-world low-quality (LQ) images, outperforming SOTA diffusion-based models across diverse degradations. Repo under custom Non-Commercial License💙
👉Review https://t.ly/Z5cA3
👉Paper https://arxiv.org/pdf/2509.22414
👉Project https://w2genai-lab.github.io/LucidFlux/
👉Repo https://github.com/W2GenAI-Lab/LucidFlux

🤖 Real-time Interactive Long Video 🤖
👉LONGLIVE by #Nvidia is a frame-level autoregressive framework for real-time & interactive long video generation. LONGLIVE accepts sequential user prompts and generates corresponding videos in real time. Repo under non-commercial license💙
👉Review https://t.ly/jJkdY
👉Paper arxiv.org/pdf/2509.22622
👉Project nvlabs.github.io/LongLive/
👉Repo github.com/NVlabs/LongLive
🤗huggingface.co/Efficient-Large-Model/LongLive-1.3B

🔥SOTA Detection w/ DINOv3🔥
👉DEIMv2 is the evolution of the DEIM framework, leveraging DINOv3. It comes in various model sizes, from an ultra-light version up to S, M, L, & X, for a wide range of scenarios. Across these variants, DEIMv2 achieves SOTA. Repo Apache2.0💙
👉Review https://t.ly/P7jEH
👉Paper arxiv.org/pdf/2509.20787
👉Repo github.com/Intellindust-AI-Lab/DEIMv2
👉Project intellindust-ai-lab.github.io/projects/DEIMv2