AI with Papers - Artificial Intelligence & Deep Learning

All the AI with papers. Every day fresh updates about #DeepLearning #MachineLearning #LLM & #ComputerVision Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/ #AI #chatGPT

Malaysia: 2 240 · Technology & Applications: 7 726...

📈 Analytics overview of the Telegram channel AI with Papers - Artificial Intelligence & Deep Learning

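The channel AI with Papers - Artificial Intelligence & Deep Learning (@ai_deeplearning) is an active participant in the English-language segment. The community currently has 17 151 subscribers, ranking 7 726th in the Technology & Applications category and 2 240th in the Malaysia region.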

📊 Audience metrics and growth dynamics

Since its creation (date unknown), the channel has grown to 17 151 subscribers.

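According to the latest data, from 21 June 2026, the channel is operating steadily. The subscriber count changed by -166 over the past 30 days and by -6 over the past 24 hours, and overall reach remains considerable.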

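Verification status: not verified
Engagement rate (ER): the average audience engagement rate is 23.63%. Within 24 hours of publication, content typically draws a response equal to 6.86% of total subscribers.
Post reach: each post receives 4 057 views on average and typically accumulates 1 177 views on its first day.
Engagement and feedback: the audience participates actively, with an average of 26 reactions per post.
Topic focus: content centers on core topics such as framework, object, dataset, tba, depth.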
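
As a rough check on how the percentages above relate to the raw counts, the sketch below assumes that each rate is simply average views divided by subscribers. The analytics page does not state its formula, so this is only an assumption, and the helper name `rate_percent` is hypothetical. Under that assumption the first-day figure reproduces the reported 6.86%, while the overall figure comes out at 23.65%, close to but not identical to the reported 23.63%, so the page's actual calculation may differ.

```python
def rate_percent(views: int, subscribers: int) -> float:
    """Return views as a percentage of subscribers, rounded to two decimals.

    Assumption: the rate is views / subscribers. The source page does not
    document how it computes its engagement figures.
    """
    if subscribers <= 0:
        raise ValueError("subscribers must be positive")
    return round(100 * views / subscribers, 2)


# Figures quoted in the channel overview.
SUBSCRIBERS = 17_151
AVG_VIEWS_PER_POST = 4_057
AVG_VIEWS_FIRST_24H = 1_177

overall = rate_percent(AVG_VIEWS_PER_POST, SUBSCRIBERS)
first_day = rate_percent(AVG_VIEWS_FIRST_24H, SUBSCRIBERS)

print(f"overall: {overall}%  (reported: 23.63%)")
print(f"first 24 hours: {first_day}%  (reported: 6.86%)")
```

The small gap on the overall figure could come from the subscriber count and view averages being sampled at different times, but that is a guess.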

📝 Description and content strategy

The author positions the channel as a platform for expressing personal views:
“All the AI with papers. Every day fresh updates about #DeepLearning #MachineLearning #LLM & #ComputerVision Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/ #AI #chatGPT”

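With frequent updates (latest data collected on 22 June 2026), the channel stays fresh and maintains high reach. The analysis shows an actively engaged audience, which makes the channel a key point of influence in the Technology & Applications category.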

Subscribers: 17 151 (24 hours: -6; 7 days: -27; 30 days: -166)
Post views: 4 057 (24 hours: ~1 177; 48 hours: ~1 353)
Engagement rate: 23.63%
Posts per day: no data

Post archive

💦 ObjectDrop: automagical objects removal 💦 👉#Google unveils ObjectDrop, the new SOTA in photorealistic object removal and insertion. Focus on shadows and reflections, impressive! 👉Review https://t.ly/ZJ6NN 👉Paper https://arxiv.org/pdf/2403.18818.pdf 👉Project https://objectdrop.github.io/

🏀 MAVOS Object Segmentation 🏀 👉MAVOS is a transformer-based VOS that introduces a novel, optimized and dynamic long-term modulated cross-attention memory. Code & Models announced (coming soon under BSD 3-Clause)💙 👉Review https://t.ly/SKaRG 👉Paper https://lnkd.in/dQyifKa3 👉Project github.com/Amshaker/MAVOS 👉Code/Demo (announced)

☔ AiOS: All-in-One-Stage Humans ☔ 👉All-in-one-stage framework for SOTA multiple expressive pose and shape recovery without additional human detection step. 👉Review https://t.ly/ekNd4 👉Paper https://arxiv.org/pdf/2403.17934.pdf 👉Project https://ttxskk.github.io/AiOS/ 👉Code/Demo (announced)

💄TinyBeauty: 460 FPS Diffusion Make-up💄 👉TinyBeauty: only 80K parameters to achieve the SOTA in virtual makeup without intricate face prompts. Up to 460 FPS on mobile! 👉Review https://t.ly/LG5ok 👉Paper https://arxiv.org/pdf/2403.15033.pdf 👉Project https://tinybeauty.github.io/TinyBeauty/

💄💄TinyBeauty: 460 FPS Diffusion Make-up💄💄 👉TinyBeauty needs only 80K parameters to achieve the SOTA in virtual makeup without intricate face prompts. Up to 460 FPS on mobile! Authors: Jiao Tong University, Alibaba, USC-SJTU. Highlights: ✅DAL, Data Amplify Learning: novel learning framework ✅Diffusion-based Data Amplifier for better training ✅Only 80K parameters to achieve the previous SOTA ✅Insane inference speed (460 fps) on iPhone 13 ✅Highly competitive using only FIVE image pairs #artificialintelligence #machinelearning #ml #AI #deeplearning #computervision #AIwithPapers #metaverse 👉Discussion https://lnkd.in/dMgakzWm 👉Paper https://arxiv.org/pdf/2403.15033.pdf 👉Project https://tinybeauty.github.io/TinyBeauty/

🦖 T-Rex 2: a new SOTA is out! 🦖 👉A novel (VERY STRONG) open-set object detector model. Strong zero-shot capabilities, suitable for various scenarios with only one set of weights. Demo and Source Code released💙 👉Review https://t.ly/fYw8D 👉Paper https://lnkd.in/dpmRh2zh 👉Project https://lnkd.in/dnR_jPcR 👉Code https://lnkd.in/dnZnGRUn 👉Demo https://lnkd.in/drDUEDYh

🦕 DINO-based Video Tracking 🦕 👉The Weizmann Institute announced the new SOTA in point-tracking via pre-trained DINO features. Source code announced (not yet released)💙 👉Review https://t.ly/_GIMT 👉Paper https://lnkd.in/dsGVDcar 👉Project dino-tracker.github.io/ 👉Code (announced)

🪼FaceXFormer: Unified Face-Transformer🪼 👉FaceXFormer, the first unified transformer for facial analysis: face parsing, landmark detection, head pose estimation, attributes recognition, and estimation of age, gender, race, and landmarks visibility. 👉Review https://t.ly/MfAFI 👉Paper https://arxiv.org/pdf/2403.12960.pdf 👉Project kartik-3004.github.io/facexformer_web/ 👉Code github.com/Kartik-3004/facexformer

🏷️ Face Foundation Model 🏷️ 👉Arc2Face, the first foundation model for human faces. Large dataset of high-resolution faces with consistent ID / intra-class variability, and an ID-conditioned face model trained on it. Source Code released 💙 👉Review https://t.ly/MfAFI 👉Paper https://lnkd.in/dViE_tCd 👉Project https://lnkd.in/d4MHdEZK 👉Code https://lnkd.in/dv9ZtDfA

🏷️🏷️Arc2Face: Face Foundation Model🏷️🏷️ 👉Arc2Face, the first foundation model for human faces. Large dataset of high-resolution faces with consistent ID / intra-class variability, and an ID-conditioned face model trained on it. Source Code released 💙 #artificialintelligence #machinelearning #ml #AI #deeplearning #computervision #AIwithPapers #metaverse 👉Discussion https://lnkd.in/dMgakzWm 👉Paper https://lnkd.in/dViE_tCd 👉Project https://lnkd.in/d4MHdEZK 👉Code https://lnkd.in/dv9ZtDfA

🪖RT Humanoid from Head-Mounted Sensors🪖 👉#META (+CMU) announced SimXR, a method for controlling a simulated avatar from info obtained from AR/VR headsets 👉Review https://t.ly/Si2Mp 👉Paper arxiv.org/pdf/2403.06862.pdf 👉Project www.zhengyiluo.com/SimXR/

👺 Can GPT-4 play DOOM? 👺 👉Apparently yes, GPT-4 can play the game to a passable degree: it is able to manipulate doors, combat enemies, and perform pathing. Code (with licensing restrictions) released 👉Review https://t.ly/W8-0F 👉Paper https://lnkd.in/dmsB7bjA 👉Project https://lnkd.in/ddDPwjQB

🏛️ PIXART-Σ: 4K Generation 🏛️ 👉PixArt-Σ is a novel Diffusion Transformer model (DiT) capable of directly generating images at 4K resolution. Authors: #Huawei, Dalian, HKU & HKUST. Demos available, code announced 💙 👉Review https://t.ly/Cm2Qh 👉Paper arxiv.org/pdf/2403.04692.pdf 👉Project pixart-alpha.github.io/PixArt-sigma-project/ 👉Repo (empty) github.com/PixArt-alpha/PixArt-sigma 🤗-Demo https://huggingface.co/spaces/PixArt-alpha/PixArt-alpha

🦁StableDrag: Point-based Editing🦁 👉#Tencent unveils StableDrag, a novel point-based image editing framework that combines a discriminative point-tracking method with a confidence-based latent enhancement strategy for motion supervision. Source Code announced but still no repo. 👉Review https://t.ly/eUI05 👉Paper https://lnkd.in/dz8-ymck 👉Project stabledrag.github.io/

🧵E-LoFTR: new Feats-Matching SOTA🧵 👉A novel LoFTR-inspired algorithm for efficiently producing semidense matches across images: up to 2.5× faster than LoFTR, superior to previous SOTA pipeline (SuperPoint + LightGlue). Code announced. 👉Review https://t.ly/7SPmC 👉Paper https://arxiv.org/pdf/2403.04765.pdf 👉Project https://zju3dv.github.io/efficientloftr/ 👉Repo https://github.com/zju3dv/efficientloftr

🔥 SOTA: Stable Diffusion 3 is out! 🔥 👉Stable Diffusion 3 is the new SOTA in text-to-image generation (based on human preference evaluations). New Multimodal Diffusion Transformer (MMDiT) architecture uses separate sets of weights for image & language, improving text understanding/spelling capabilities. Weights & Source Code released 💙 👉Review https://t.ly/a1koo 👉Paper https://lnkd.in/d4i-9Bte 👉Blog https://lnkd.in/d-bEX-ww

💥 MM-AU: Accident Understanding 💥 👉MM-AU - Multi-Modal Accident Video Understanding: 11,727 videos with temporally aligned text descriptions, 2.23M+ bounding boxes, and 58,650 pairs of video-based accident reasons. Dataset & Code released 💙 👉Review https://t.ly/a-jKI 👉Paper https://arxiv.org/pdf/2403.00436.pdf 👉Dataset http://www.lotvsmmau.net/MMAU/demo

💌 Multi-LoRA Composition 💌 👉Two novel training-free image composition methods, LoRA Switch and LoRA Composite, for integrating any number of elements in an image through multi-LoRA composition. Source Code released 💙 👉Review https://t.ly/GFy3Z 👉Paper arxiv.org/pdf/2402.16843.pdf 👉Code github.com/maszhongming/Multi-LoRA-Composition

🎷EMO: talking/singing Gen-AI 🎷 👉#Alibaba announced EMO: audio-driven portrait-video generation. Vocal avatar videos with expressive facial expressions, and various head poses. Input: 1 single frame, video duration according to the length of input audio 👉Review https://t.ly/4IYj5 👉Paper https://lnkd.in/dGPX2-Yc 👉Project https://lnkd.in/dyf6p_N3 👉Repo (empty) github.com/HumanAIGC/EMO
