AI with Papers - Artificial Intelligence & Deep Learning

All the AI with papers. Every day fresh updates about #DeepLearning #MachineLearning #LLM & #ComputerVision Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/ #AI #chatGPT

📈 Telegram channel analytics: AI with Papers - Artificial Intelligence & Deep Learning

The channel AI with Papers - Artificial Intelligence & Deep Learning (@ai_deeplearning) is active in the English-language segment. The community currently has 17 168 subscribers and ranks 7 718 in the Technology & Apps category and 2 234 in the Malaysia region.

📊 Audience metrics and dynamics

Since its creation (date unknown), the channel has attracted 17 168 subscribers.

According to the latest data, from June 20, 2026, the channel shows steady activity. The subscriber count changed by -169 over the past 30 days and by 0 over the past 24 hours.

  • Verification status: not verified
  • Engagement rate (ER): average audience engagement is 22.86%. The share of subscribers reacting within the first 24 hours after publication is N/A.
  • Post reach: each post receives 3 926 views on average. The first day typically accounts for 0 views.
  • Reactions and engagement: the average number of reactions per post is 26.
  • Topical interests: content focuses on key topics such as framework, object, dataset, tba, depth.

📝 Description and content policy

The author describes the channel as follows:
All the AI with papers. Every day fresh updates about #DeepLearning #MachineLearning #LLM & #ComputerVision Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/ #AI #chatGPT

The channel is updated frequently (latest data from June 21, 2026). The analytics indicate that the audience actively engages with the content within the Technology & Apps category.

Subscribers: 17 168
24 hours: no data
7 days: -35
30 days: -169
Post archive
🔥 Diffusion Model <-> Depth 🔥 👉ETH & CMU on how to turn a single-image latent diffusion model (LDM) into the SOTA video depth estimator: video depth without video models. Repo released under Apache 2.0 and HF demo available💙 👉Review https://t.ly/sP9ma 👉Paper arxiv.org/pdf/2411.19189 👉Project rollingdepth.github.io/ 👉Repo github.com/prs-eth/rollingdepth 🤗Demo huggingface.co/spaces/prs-eth/rollingdepth

🔥 S3MOT: SOTA 3D MOT 🔥 👉Wuhan University unveils S3MOT, a Selective-State-Space model-based MOT that efficiently infers 3D motion and object associations from 2D images through three core components. New SOTA on KITTI with 76.86 HOTA at 31 FPS! Code & Weights to be released under MIT license💙 👉Review https://t.ly/H_JPv 👉Paper https://arxiv.org/pdf/2504.18068 👉Repo https://github.com/bytepioneerX/s3mot

🍏#Nvidia Dynamic Pose 🍏 👉Nvidia unveils DynPose-100K, the largest dataset of dynamic Internet videos annotated with camera poses. Dataset released under Nvidia license💙 👉Review https://t.ly/wrcb0 👉Paper https://lnkd.in/dycGjAyy 👉Project https://lnkd.in/dDZ2Ej_Q 🤗Data https://lnkd.in/d8yUSB7m

🌼SOTA Textured 3D-Guided VTON🌼 👉#ALIBABA unveils 3DV-TON, a novel diffusion model for HQ and temporally consistent video. It generates animatable textured 3D meshes as explicit frame-level guidance, alleviating the issue of models over-focusing on appearance fidelity at the expense of motion coherence. Code & benchmark to be released💙 👉Review https://t.ly/0tjdC 👉Paper https://lnkd.in/dFseYSXz 👉Project https://lnkd.in/djtqzrzs 👉Repo TBA

📍Moving Points -> Depth📍 👉KAIST & Adobe propose Seurat, a novel method that infers relative depth by examining the spatial relationships and temporal evolution of a set of tracked 2D trajectories (via off-the-shelf point tracking models). Repo & Demo to be released💙 👉Review https://t.ly/qA2P5 👉Paper https://lnkd.in/dpXDaQtM 👉Project https://lnkd.in/d9qWYsjP 👉Repo https://lnkd.in/dZEMDiJh

🦧 #Nvidia Describe Anything 🦧 👉Nvidia unveils Describe Anything Model (DAM) the new SOTA in generating detailed descriptions for user-specified regions in images/videos, marked by points, boxes, scribbles, or masks. Repo under Apache, Dataset available and live demo on 🤗 👉Review https://t.ly/la4JD 👉Paper https://lnkd.in/dZh82xtV 👉Project https://lnkd.in/dcv9V2ZF 👉Repo https://lnkd.in/dJB9Ehtb 🤗Demo https://lnkd.in/dXDb2MWU

🧊TAP in Persistent 3D Geometry🧊 👉TAPIP3D is the novel SOTA for long-term 3D point tracking in mono-RGB/RGB-D. Videos as camera-stabilized spatio-temporal feature clouds, leveraging depth & motion to lift 2D video feats into a 3D world space where camera motion is effectively canceled. Code under Apache💙 👉Review https://t.ly/oooMy 👉Paper https://lnkd.in/d8uqjdE4 👉Project https://tapip3d.github.io/ 👉Repo https://lnkd.in/dsvHP_8u

🔥 #Apple Co-Motion is out! 🔥 👉Apple unveils a novel approach for detecting & tracking detailed 3D poses of multiple people from single monocular stream. Temporally coherent predictions in crowded scenes with hard poses & occlusions. New SOTA, 10x faster! Code & Models released only for research💙 👉Review https://t.ly/-86CO 👉Paper https://lnkd.in/dQsVGY7q 👉Repo https://lnkd.in/dh7j7N89

🔍Event Blurry Super-Resolution🔍 👉USTC unveils Ev-DeblurVSR: event signals into BVSR for a novel event-enhanced network. Blurry Video Super-Resolution (BVSR) aiming at generating HR videos from low-resolution and blurry inputs. Pretrained models and test released under Apache💙 👉Review https://t.ly/x6hRs 👉Paper https://lnkd.in/dzbkCJMh 👉Repo https://lnkd.in/dmvsc-yS

🔥General attention-based object🔥 👉GATE3D is a novel framework designed specifically for generalized monocular 3D object detection via weak supervision. GATE3D effectively bridges domain gaps by employing consistency losses between 2D and 3D predictions. 👉Review https://t.ly/O7wqH 👉Paper https://lnkd.in/dc5VTUj9 👉Project https://lnkd.in/dzrt-qQV

🐯UniAnimate-DiT: Human Animation🐯 👉UniAnimate-DiT is a novel n' effective framework based on Wan2.1 for consistent human image animation. LoRAs to finetune the model parameters -reducing memory- maintaining the original model’s generative skills. Training and inference code released💙 👉Review https://t.ly/1I50N 👉Paper https://arxiv.org/pdf/2504.11289 👉Repo https://github.com/ali-vilab/UniAnimate-DiT

🍏PartField #3D Part Segmentation🍏 👉#Nvidia unveils PartField, a FFW approach for learning part-based 3D features, which captures the general concept of parts and their hierarchy. Suitable for single-shape decomposition, co-segm., correspondence & more. Code & Models released under Nvidia License💙 👉Review https://t.ly/fGb2O 👉Paper https://lnkd.in/dGeyKSzG 👉Code https://lnkd.in/dbe57XGH 👉Project https://lnkd.in/dhEgf7X2

🍄 4D Mocap Human-Object 🍄 👉#Adobe unveils HUMOTO, HQ dataset of human-object interactions for motion generation, computer vision, and robotics: 700+ sequences (7,875 seconds @ 30FPS), interactions with 63 precisely modeled objects and 72 articulated parts 👉Review https://t.ly/lCof3 👉Paper https://lnkd.in/dVVBDd_c 👉Project https://lnkd.in/dwBcseDf

💥Geo4D: VideoGen 4D Scene💥 👉The Oxford VGG unveils Geo4D: video diffusion for monocular 4D reconstruction. Only synthetic data for training, but strong generalization to real world: point maps, depth & ray maps for the new SOTA in dynamic reconstruction. Code released💙 👉Review https://t.ly/X55Uj 👉Paper arxiv.org/pdf/2504.07961 👉Project geo4d.github.io/ 👉Code github.com/jzr99/Geo4D

🥊 Pose in Combat Sports 🥊 👉The novel SOTA framework for accurate physics-based #3D human pose estimation in combat sports w/ a sparse multi-camera setup. Dataset to be released soon💙 👉Review https://t.ly/EfcGL 👉Paper https://lnkd.in/deMMrKcA 👉Project https://lnkd.in/dkMS_UrH

🧊BoxDreamer Object Pose🧊 👉BoxDreamer is a generalizable RGB-based approach for #3D object pose estimation in the wild, specifically designed to address challenges in sparse-view settings. Code coming, demo released💙 👉Review https://t.ly/e-vX9 👉Paper arxiv.org/pdf/2504.07955 👉Project https://lnkd.in/djz8jqn9 👉Repo https://lnkd.in/dfuEawSA 🤗Demo https://lnkd.in/dVYaWGcS

💛 Unified Scalable SVG Generator 💛 👉OmniSVG is the first family of e2e multimodal generators that leverages pre-trained VLMs to create detailed SVGs. Code, models & dataset to be released under MIT💙 👉Review https://t.ly/JcR3I 👉Paper https://arxiv.org/pdf/2504.06263 👉Project https://omnisvg.github.io/ 👉Repo github.com/OmniSVG/OmniSVG 👉Dataset https://huggingface.co/OmniSVG

🐈 TTT Long Video Generation🐈 👉A novel architecture for video generation adapting the CogVideoX 5B model by incorporating Test-Time Training layers. Adding TTT layers into a pre-trained Transformer -> one-minute clip from text storyboards. Videos, code & annotations released💙 👉Review https://t.ly/mhlTN 👉Paper arxiv.org/pdf/2504.05298 👉Project test-time-training.github.io/video-dit/ 👉Repo github.com/test-time-training/ttt-video-dit

⛽ VoRA: Vision as LoRA ⛽ 👉#ByteDance unveils Vision as LoRA (VoRA), a novel paradigm converting LLMs into Multimodal Large Language Models (MLLMs) by integrating vision-specific LoRA layers. All training data, codes, and model weights available💙 👉Review https://t.ly/guNVN 👉Paper arxiv.org/pdf/2503.20680 👉Repo github.com/Hon-Wong/VoRA 👉Project georgeluimmortal.github.io/vora-homepage.github.io/

🌳 Compose Anything is out 🌳 👉Skywork AI unveils SkyReels-A2, a controllable video generation framework capable of assembling arbitrary visual elements (e.g., characters, objects, backgrounds) into synthesized videos based on textual prompts. Code, models, & evaluation benchmark released💙 👉Review https://t.ly/MEjzL 👉Paper https://arxiv.org/pdf/2504.02436 👉Project skyworkai.github.io/skyreels-a2.github.io/ 👉Repo github.com/SkyworkAI/SkyReels-A2 🤗Models https://huggingface.co/Skywork/SkyReels-A2