AI & ML Papers
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers. Admin: @HusseinSheikho || @Hussein_Sheikho
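📈 Analytics overview of the Telegram channel AI & ML Papers
The channel AI & ML Papers (@papernexus) is an active participant in the English-language segment. The community currently has 32,887 subscribers, ranks 4,158th in the Technology & Applications category, and ranks 12,536th in the India region.
📊 Audience metrics and growth dynamics
Since its creation (date unknown), the project has kept growing and has attracted 32,887 subscribers.
According to the latest data from June 23, 2026, the channel is operating steadily. The subscriber count changed by +407 over the past 30 days and by +24 over the past 24 hours, and overall reach remains substantial.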
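- Verification status: not verified
- Engagement rate (ER): the average audience engagement rate is 1.18%. Within 24 hours of publication, a post typically draws a response equal to 0.81% of total subscribers.
- Post reach: each post averages 389 views and typically accumulates 266 views on its first day.
- Interaction and feedback: the audience takes part, with an average of 1 reaction per post.
- Topic focus: content centers on core topics such as summary, apr, huggingface, github, and framework.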
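The percentages in the list above are consistent with dividing average views by the subscriber count. That relationship is an inference from the numbers, not a formula stated by the analytics page, so the sketch below should be read as a sanity check under that assumption. The function name `rate_percent` is illustrative.

```python
# Figures reported in the overview above.
SUBSCRIBERS = 32_887
AVG_VIEWS_PER_POST = 389
AVG_VIEWS_FIRST_DAY = 266


def rate_percent(views: int, subscribers: int) -> float:
    """Views as a percentage of subscribers, rounded to two decimals.

    Assumption: the page derives its rates this way; it does not say so.
    """
    return round(100 * views / subscribers, 2)


er = rate_percent(AVG_VIEWS_PER_POST, SUBSCRIBERS)
er_24h = rate_percent(AVG_VIEWS_FIRST_DAY, SUBSCRIBERS)
print(er, er_24h)  # 1.18 0.81
```

Both results match the reported 1.18% and 0.81%, which suggests that "engagement" here tracks views rather than reactions.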
📝 Description and content strategy
The author positions the channel as a platform for expressing subjective views:
“Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.
Admin: @HusseinSheikho || @Hussein_Sheikho”
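With frequent updates (latest data collected on June 24, 2026), the channel stays fresh and maintains high reach. The analysis shows that the audience engages actively, making the channel a key point of influence in the Technology & Applications category.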
32,887 subscribers
+24 in 24 hours
+36 in 7 days
+407 in 30 days
Post archive
🔥 AOHP: An Open-Source OS-Level Agent Harness for Personalized, Efficient and Secure Interaction
💡 The paper introduces AOHP, an open-source operating system framework built on the Android Open Source Project, designed to support AI agents as first-class entities. The motivation behind AOHP is to address the limitations of existing operating systems, which are application-centric and do not provide native support for AI agents, resulting in execution overhead and safety risks. AOHP treats agents as first-class OS actors, enabling adaptive user interfaces and agent-friendly runtime environments. The framework introduces three agent-oriented system mechanisms: personalized service composition, efficient agent interfaces, and secure information flow. The authors evaluated AOHP through preliminary experiments on challenging tasks and found that it shows significant advantages in task completion rate, execution cost, and security policy compliance. Specifically, AOHP achieved a 21.12 percent increase in task completion rate and a 51.55 percent reduction in token cost. The paper contributes to the research community by providing an open testbed to explore the architectural primitives desired for agent-mediated interaction, and demonstrates the potential of AOHP to enhance the efficiency, security, and adoption of AI agents in operating systems.
📅 Published on Jun 22
🔗 Links:
• GitHub: https://github.com/huggingface
• Project Page: https://huggingface.co/papers?q=Android%20Open%20Source%20Project
• arXiv: https://arxiv.org/abs/2606.23449
• PDF: https://arxiv.org/pdf/2606.23449
━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.me/PaperNexus
#OperatingSystemSecurity #AIIntegrationInOS #AgentOrientedProgramming #OpenSourceAndroid #ArtificialIntelligenceInOS
🔥 DiffusionBench: On Holistic Evaluation of Diffusion Transformers
💡 The paper introduces a unified framework called NanoGen for training and evaluating diffusion transformers, which are used in image generation tasks. The current evaluation setup for diffusion transformers is limited to class-conditional generation on ImageNet, which may not reflect real progress in generative modeling. The authors argue that text-to-image generation is a more comprehensive task, but it is often skipped due to perceived high costs and inconvenience. However, the authors show that with NanoGen, training and evaluating text-to-image models requires comparable compute to ImageNet. The NanoGen framework supports various diffusion methods and can be easily configured to train models on both ImageNet and text-to-image tasks. The authors trained 21 latent diffusion models using NanoGen and found that the ranking of methods on ImageNet and text-to-image tasks shows no strong correlation. This suggests that a method that improves performance on ImageNet may not necessarily improve performance on text-to-image generation. To address this issue, the authors propose a holistic benchmark called DiffusionBench, which summarizes results on both ImageNet and text-to-image tasks. The authors recommend reporting DiffusionBench in place of ImageNet alone, as methods that improve DiffusionBench are more likely to reflect broader progress in generative modeling. The main contribution of the paper is the introduction of NanoGen and DiffusionBench, which provide a more comprehensive evaluation setup for diffusion transformers and can help to advance research in generative modeling.
📅 Published on Jun 23
🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2606.24888
• PDF: https://arxiv.org/pdf/2606.24888
• Project Page: https://end2end-diffusion.github.io/diffusion-bench/
━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.me/PaperNexus
#DiffusionTransformers #ImageGenerationTasks #TextToImageGeneration #GenerativeModeling #DiffusionBasedArchitectures
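One common way to quantify whether two benchmarks rank methods similarly is Spearman's rank correlation. The summary does not say which statistic the authors used, so the following is only an illustrative sketch. The five scores per benchmark are made-up placeholders, not results from the paper, and the formula used assumes there are no tied scores.

```python
def ranks(scores, lower_is_better=True):
    """Return the 1-based rank of each entry (assumes no ties)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i],
                   reverse=not lower_is_better)
    result = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two tie-free rankings of the same items."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical scores for five methods (illustration only).
imagenet_fid = [2.1, 2.5, 3.0, 3.4, 4.0]        # lower is better
t2i_score = [0.60, 0.50, 0.70, 0.65, 0.55]      # higher is better

rank_imagenet = ranks(imagenet_fid, lower_is_better=True)
rank_t2i = ranks(t2i_score, lower_is_better=False)
rho = spearman_rho(rank_imagenet, rank_t2i)
print(rank_imagenet, rank_t2i, round(rho, 2))
```

A value of rho near 1 would mean the two benchmarks agree on the ordering of methods. A value near 0, as with these placeholder numbers, corresponds to the "no strong correlation" situation the summary describes.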
🔥 NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?
💡 The paper introduces NatureBench, a benchmark of 90 scientific tasks derived from Nature publications, to assess the ability of AI coding agents to achieve discovery rather than just reproduction. The benchmark is built on NatureGym, an automated pipeline that constructs a standardized environment for each task, addressing the environment fragmentation problem that has limited the credibility of prior benchmarks. The authors evaluate ten frontier agent configurations under a strict protocol and find that the strongest model surpasses the state of the art on only 17.8 percent of tasks. The analysis reveals that agents succeed primarily through methodological translation, converting scientific tasks into familiar supervised prediction problems, rather than through genuine scientific innovation. The main reasons for failure are wrong method choice and insufficient compute budget, rather than task misunderstanding. The benchmark, pipeline, and a public leaderboard are released to facilitate further research. The paper contributes to the understanding of the limitations of current AI coding agents in achieving discovery and highlights the need for genuine scientific innovation in AI research.
📅 Published on Jun 23
🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2606.24530
• PDF: https://arxiv.org/pdf/2606.24530
• Project Page: https://frontisai.github.io/NatureBench/
📊 Datasets citing this paper:
• https://huggingface.co/datasets/FrontisAI/NatureBench
━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.me/PaperNexus
#ArtificialIntelligenceInScience #NatureBench #AICodingAgents #ScientificDiscoveryWithAI #BenchmarkingAIAgents
🔥 MobileForge: Annotation-Free Adaptation for Mobile GUI Agents with Hierarchical Feedback-Guided Policy Optimization
💡 The paper introduces MobileForge, a system for adapting mobile graphical user interface agents to real target apps without requiring manual annotations. The problem addressed is that current mobile GUI agents require costly and time-consuming human-written tasks, demonstrations, or reward labels to adapt to new apps. Existing annotation-free GUI learning methods lack a unified approach to connect target-app exploration, curriculum mining, rollout execution, and feedback, and policy optimization often relies on isolated rollouts and coarse rewards. MobileForge consists of two main components: MobileGym, which generates tasks and evaluates rollouts based on real mobile app interaction, and Hierarchical Feedback-Guided Policy Optimization, which uses trajectory outcomes, step-level process feedback, and corrective hints to update the policy. This approach allows for efficient adaptation of mobile GUI agents to new apps without requiring manual annotations. The results show that MobileForge can adapt a mobile GUI agent to achieve 67.2 percent Pass@3 on AndroidWorld, which is close to the performance of a specialized model trained on closed data. Further adaptation using MobileForge reaches 77.6 percent Pass@3 on AndroidWorld and 41.0 percent success on the out-of-domain MobileWorld GUI-only split, establishing the strongest open-data mobile GUI agent in the evaluation. 
Overall, MobileForge provides a unified and efficient approach to adapting mobile GUI agents to new apps without requiring manual annotations, making it a significant contribution to the field.
📅 Published on Jun 18
🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2606.19930
• PDF: https://arxiv.org/pdf/2606.19930
• Project Page: https://mobile-forge.github.io
📊 Datasets citing this paper:
• https://huggingface.co/datasets/lgy0404/mobileforge-exploration-trajectories
• https://huggingface.co/datasets/lgy0404/mobileforge-training-data
• https://huggingface.co/datasets/lgy0404/mobileforge-benchmark-results
━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.me/PaperNexus
#MobileGUIAgents #HierarchicalFeedbackGuidedPolicyOptimization #AnnotationFreeLearning #MobileGraphicalUserInterface #PolicyOptimizationForMobileApps
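The Pass@3 figures quoted for MobileForge can be read as "the share of tasks solved in at least one of three attempts". That reading is an assumption here, since the summary does not define the metric. Under it, a minimal sketch looks as follows. The task names and outcomes are invented for illustration and are not AndroidWorld data.

```python
def pass_at_k(results: dict, k: int) -> float:
    """Fraction of tasks with at least one success among the first k attempts.

    `results` maps a task name to a list of per-attempt booleans.
    Assumed definition; the paper's exact protocol may differ.
    """
    solved = sum(any(attempts[:k]) for attempts in results.values())
    return solved / len(results)


# Hypothetical rollout outcomes: three attempts per task.
rollouts = {
    "set_alarm":      [False, True, False],
    "send_message":   [True, True, True],
    "rename_file":    [False, False, False],
    "add_contact":    [False, False, True],
}

print(pass_at_k(rollouts, 1))  # 0.25
print(pass_at_k(rollouts, 3))  # 0.75
```

The gap between the two printed values shows why Pass@3 is typically higher than single-attempt success: a task counts as solved if any one of the three rollouts succeeds.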
🔥 MemGUI-Agent: An End-to-End Long-Horizon Mobile GUI Agent with Proactive Context Management
💡 The paper introduces MemGUI-Agent, a mobile GUI agent designed to address the limitations of existing agents on long-horizon tasks. Current agents struggle with retaining intermediate facts across many steps and app transitions, leading to unreliable performance. This limitation is attributed to the ReAct-style prompting approach, which passively accumulates per-step records, causing prompt explosion and dilution of critical cross-app facts. To address this issue, the authors propose MemGUI-Agent, which uses proactive context management through Context-as-Action, or ConAct. ConAct casts context management as first-class actions emitted by the same policy that selects UI actions. This approach maintains three structured context fields: folded action history, folded UI state, and recent step record, preserving critical UI facts while keeping context compact. The authors also introduce MemGUI-3K, a dataset with 2,956 trajectories and full ConAct annotations for supervised training and offline analysis. Training an 8B model on MemGUI-3K results in MemGUI-8B-SFT, an 8B MemGUI-Agent that achieves the best open-data 8B performance on MemGUI-Bench and generalizes to the out-of-distribution MobileWorld benchmark. The contributions of the paper are threefold. Firstly, it identifies the limitations of existing mobile GUI agents on long-horizon tasks and attributes them to the ReAct-style prompting approach. Secondly, it proposes MemGUI-Agent with proactive context management through ConAct, which addresses the limitations of existing agents. Finally, it introduces MemGUI-3K, a dataset for supervised training and offline analysis, and demonstrates the effectiveness of MemGUI-8B-SFT, an 8B MemGUI-Agent trained on this dataset. 
The code, data, and trained models will be released to facilitate further research and development.
📅 Published on Jun 18
🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2606.19926
• PDF: https://arxiv.org/pdf/2606.19926
• Project Page: https://memgui-agent.github.io/
🤖 Models citing this paper:
• https://huggingface.co/lgy0404/MemGUI-8B-SFT
📊 Datasets citing this paper:
• https://huggingface.co/datasets/lgy0404/MemGUI-3K
━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.me/PaperNexus
#MobileGUIAutomation #LongHorizonTaskLearning #ProactiveContextManagement #ContextAsAction #EndToEndGUIAgents
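To make the contrast between ReAct-style accumulation and ConAct's three context fields concrete, here is a small sketch. Only the three field names come from the summary. The `PolicyOutput` type, the `apply` function, and the scripted steps are hypothetical stand-ins, not the paper's interface or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    # The three structured fields named in the summary.
    folded_action_history: str = ""
    folded_ui_state: str = ""
    recent_step_record: str = ""


@dataclass(frozen=True)
class PolicyOutput:
    # Hypothetical: one policy emits the UI action and the context updates.
    ui_action: str
    fold_history: str
    fold_ui_state: str


def apply(out: PolicyOutput, observation: str) -> Context:
    """Build the next context from the policy's own context-management output."""
    return Context(
        folded_action_history=out.fold_history,
        folded_ui_state=out.fold_ui_state,
        recent_step_record=f"{out.ui_action} -> {observation}",
    )


# Scripted three-step, cross-app episode (illustration only).
steps = [
    (PolicyOutput("open(Mail)", "Opened Mail.",
                  "Mail inbox; order code is A17."), "inbox shown"),
    (PolicyOutput("open(Shop)", "Read code in Mail; opened Shop.",
                  "Shop home; order code is A17."), "shop home shown"),
    (PolicyOutput("type(A17)", "Read code in Mail; entered it in Shop.",
                  "Shop search filled; order code is A17."), "results shown"),
]

ctx = Context()
react_log = []  # ReAct-style baseline: one appended record per step
for out, obs in steps:
    ctx = apply(out, obs)
    react_log.append(f"{out.ui_action} -> {obs}")

print(len(react_log), ctx)
```

The ReAct-style log grows by one record per step, while the structured context always holds exactly three fields and still carries the cross-app fact (the order code) forward.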
🔥 Lift4D: Harmonizing Single-View 3D Estimation for 4D Reconstruction In-the-Wild
💡 The paper presents Lift4D, a test-time optimization framework for reconstructing dynamic non-rigid objects from monocular video. The problem addressed is the difficulty in reconstructing 4D representations of dynamic objects from single-view video due to the scarcity of 4D training data and the limitations of prior approaches that either directly predict 4D representations or initialize a 3D representation and refine it based on video evidence. The method involves adapting a single-view 3D reconstruction model to yield temporally consistent per-frame predictions, which provides a coherent initialization for a deformable 3D Gaussian Splatting representation. This representation is then optimized to match the input video through an occlusion-aware optimization that recovers visible surface details and completes unobserved regions using a view-conditioned diffusion prior. The results show that Lift4D improves over prior 4D reconstruction methods, particularly on challenging in-the-wild sequences with severe occlusions and non-rigid motion. The framework effectively handles complex scenarios by integrating visual cues from direct observations with data-driven priors over geometry and appearance, making it a significant contribution to the field of 4D reconstruction from monocular video.
📅 Published on Jun 22
🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2606.23688
• PDF: https://arxiv.org/pdf/2606.23688
• Project Page: https://lift4d.github.io/
━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.me/PaperNexus
#4DReconstruction #SingleView3DEstimation #MonocularVideoAnalysis #DynamicNonRigidObjectReconstruction #TestTimeOptimizationFrameworks
🔥 OpenRath: Session-Centered Runtime State for Agent Systems
💡 The paper introduces OpenRath, a programming model for multi-agent systems that addresses the issue of fragmented runtime state. In current agent systems, various aspects such as transcripts, tool effects, and memory events are recorded separately, making it difficult to inspect or reproduce the system's behavior. OpenRath solves this problem by introducing a central runtime abstraction called Session, which is a first-class value that can be passed between agents and workflows. The Session abstraction is designed to be branchable, inspectable, replayable, backend-aware, and composable, allowing it to record various execution state information such as conversation chunks, sandbox placement, and tool evidence. This enables explicit fork, merge, and replay operations as runtime operations rather than reconstructing states from external traces. OpenRath also defines other key concepts such as Sandbox, Tool, Agent, Memory, Workflow, and Selector, which work together to provide a comprehensive programming model for multi-agent systems. The Selector is particularly important as it turns control flow into runtime-routed decisions. The paper presents the programming model, architecture, and evidence protocol of OpenRath, and claims that the Session abstraction provides agent systems with a first-class runtime value for auditable composition. The results of this work are limited to controlled runtime properties, and further evaluation is needed to compare the performance of OpenRath with other systems and to assess its availability and quality. 
Overall, OpenRath contributes a novel programming model for multi-agent systems that provides a unified and explicit way to manage runtime state, making it easier to inspect, reproduce, and debug the behavior of these systems.
📅 Published on Jun 17
🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2606.19409
• PDF: https://arxiv.org/pdf/2606.19409
• Project Page: https://docs.openrath.com
━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.me/PaperNexus
#MultiAgentSystems #RuntimeStateManagement #AgentOrientedProgramming #SessionCenteredArchitecture #DistributedSystemDesign
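The idea of a session as a first-class value with explicit fork, merge, and replay operations can be sketched with an immutable event log. This is not the OpenRath API. The class names, method names, and the merge rule below are all hypothetical, chosen only to illustrate the general pattern.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Event:
    kind: str      # e.g. "message", "tool"
    payload: str


@dataclass(frozen=True)
class Session:
    """Hypothetical immutable session: every operation returns a new value."""
    events: Tuple[Event, ...] = ()

    def record(self, kind: str, payload: str) -> "Session":
        return Session(self.events + (Event(kind, payload),))

    def fork(self) -> "Session":
        # A branch starts as a copy; later records never touch the parent.
        return Session(self.events)

    def merge(self, other: "Session") -> "Session":
        # Illustrative rule: keep the shared prefix once, then append
        # the events that only `other` has.
        shared = 0
        for mine, theirs in zip(self.events, other.events):
            if mine != theirs:
                break
            shared += 1
        return Session(self.events + other.events[shared:])

    def replay(self):
        return [f"{e.kind}: {e.payload}" for e in self.events]


base = Session().record("message", "plan trip")
flights = base.fork().record("tool", "search flights")
hotels = base.fork().record("tool", "search hotels")
merged = flights.merge(hotels)
print(merged.replay())
```

Because each operation returns a new value, the two branches diverge without altering `base`, and the merged history can be replayed directly instead of being rebuilt from external traces.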
🔥 Efficient Guided Generation for Large Language Models
💡 The paper presents an efficient method for guiding large language model text generation using regular expressions and context-free grammars. The problem addressed is that guided generation can be impractical due to significant overhead. The authors propose an approach that adds minimal overhead to the token sequence generation process. This method makes guided generation feasible in practice. The approach is implemented in the open source Python library Outlines, providing a practical solution for efficient guided generation. The results indicate that the method is effective, allowing for guided generation with little to no overhead, which is a significant contribution to the field of natural language processing.
📅 Published on Jul 19, 2023
🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2307.09702
• PDF: https://arxiv.org/pdf/2307.09702
━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://t.me/PaperNexus
#LargeLanguageModels #GuidedTextGeneration #RegularExpressions #ContextFreeGrammars #EfficientGenerationMethods
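A toy version of regex-guided generation shows where the low overhead can come from. The pattern is compiled into a finite-state machine, an index from each state to its allowed tokens is built once, and each decoding step only needs a lookup and a mask. The sketch below does not use the Outlines API. The hand-written automaton for `-?[0-9]+`, the tiny vocabulary, and the per-step logits are all made up for illustration.

```python
DIGITS = "0123456789"
ACCEPTING = {2}            # state 2: at least one digit has been read
VOCAB = ["-", "7", "42", "4a", "abc", "--", "<eos>"]


def step(state, ch):
    """Hand-written DFA for -?[0-9]+ ; returns None on a dead transition."""
    if ch in DIGITS:
        return 2
    if ch == "-" and state == 0:
        return 1
    return None


def walk(state, token):
    """Run a whole (possibly multi-character) token through the DFA."""
    for ch in token:
        state = step(state, ch)
        if state is None:
            return None
    return state


def build_index(vocab, states):
    """Precompute, once, state -> {allowed token id: next state}."""
    index = {}
    for s in states:
        allowed = {}
        for tok_id, tok in enumerate(vocab):
            if tok == "<eos>":
                if s in ACCEPTING:      # may stop only on a full match
                    allowed[tok_id] = s
                continue
            nxt = walk(s, tok)
            if nxt is not None:
                allowed[tok_id] = nxt
        index[s] = allowed
    return index


def generate(logits_per_step, index):
    """Greedy decoding restricted to the tokens allowed in each state."""
    state, out = 0, []
    for logits in logits_per_step:
        allowed = index[state]                        # lookup, no rescan
        tok_id = max(allowed, key=lambda i: logits[i])
        if VOCAB[tok_id] == "<eos>":
            break
        out.append(VOCAB[tok_id])
        state = allowed[tok_id]
    return "".join(out)


INDEX = build_index(VOCAB, states=[0, 1, 2])
# Made-up logits: unconstrained argmax would pick "abc" at every step.
FAKE_LOGITS = [
    [1.0, 0.5, 0.9, 2.0, 3.0, 1.5, 0.2],
    [2.5, 0.4, 0.8, 2.0, 3.0, 1.5, 0.9],
    [0.1, 0.2, 0.3, 2.0, 3.0, 1.5, 0.9],
]
result = generate(FAKE_LOGITS, INDEX)
print(result)  # -42
```

Tokens such as "4a" or "--" never appear in the index because they would drive the automaton into a dead state, so the output always matches the pattern regardless of what the model prefers.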
