Data science research papers
SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals
Publication date: 28 May 2024
Topic: Contrastive Learning
Paper: https://arxiv.org/pdf/2405.17766v1.pdf
GitHub: https://github.com/rthapa84/sleepfm-codebase
Description:
We show that a novel leave-one-out approach for contrastive learning significantly improves downstream task performance compared to representations from standard pairwise contrastive learning. A logistic regression model trained on SleepFM's learned embeddings outperforms an end-to-end trained convolutional neural network (CNN) on sleep stage classification (macro AUROC 0.88 vs 0.72 and macro AUPRC 0.72 vs 0.48) and sleep-disordered breathing detection (AUROC 0.85 vs 0.69 and AUPRC 0.77 vs 0.61). Notably, the learned embeddings achieve 48% top-1 average accuracy in retrieving the corresponding recording clips of other modalities from 90,000 candidates.
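The leave-one-out idea can be sketched as follows: instead of contrasting modalities pairwise, each modality's embedding is contrasted against the average of the remaining modalities for the same clip. A minimal sketch, assuming a three-modality setup and an InfoNCE-style loss; the function names and data layout are illustrative, not the paper's implementation:

```python
import math

def cosine(u, v):
    # Cosine similarity between two plain-Python vectors.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def leave_one_out_loss(embs, held_out, i, tau=0.1):
    """InfoNCE-style loss: the held-out modality's embedding for sample i
    is pulled toward the mean embedding of the OTHER modalities for the
    same sample, and pushed away from the means of the other samples."""
    others = [m for m in embs if m != held_out]
    batch = len(embs[held_out])
    dim = len(embs[held_out][0])
    # Leave-one-out target for each sample: average of the other modalities.
    targets = [[sum(embs[m][j][k] for m in others) / len(others) for k in range(dim)]
               for j in range(batch)]
    logits = [cosine(embs[held_out][i], t) / tau for t in targets]
    mx = max(logits)
    log_z = mx + math.log(sum(math.exp(l - mx) for l in logits))
    return log_z - logits[i]  # negative log-softmax of the positive
```

With embeddings aligned across modalities the loss is near zero; a misaligned anchor is penalized heavily, which is the training signal.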
Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis
Publication date: 5 June 2024
Topic: Representation Learning
Paper: https://arxiv.org/pdf/2406.03430v1.pdf
GitHub: https://github.com/xmindflow/awesome_mamba
Description:
Capitalizing on advances in computer vision, medical imaging has entered a new era with Mamba models. To help researchers navigate the surge, this survey offers an encyclopedic review of Mamba models in medical imaging. Specifically, we start with a comprehensive theoretical review of the foundations of SSMs, including the Mamba architecture and its alternatives for sequence modeling in this context. Next, we offer a structured classification of Mamba models in the medical field, with a categorization scheme based on their applications, imaging modalities, and targeted organs.
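For readers new to SSMs, the recurrence the survey builds on can be reduced to a scalar toy. Real Mamba uses learned, input-dependent, multi-dimensional parameters; this is only a didactic sketch of the discrete linear state space model:

```python
def ssm_scan(u, A, B, C):
    """Discrete linear state space model with a 1-D state:
    h[t] = A*h[t-1] + B*u[t],  y[t] = C*h[t].
    The scan runs in O(T) time and O(1) state, which is the
    computation-efficiency argument behind Mamba-style models."""
    h, ys = 0.0, []
    for x in u:
        h = A * h + B * x
        ys.append(C * h)
    return ys
```

An impulse input decays geometrically through the state, showing how the model carries context forward without attention's quadratic cost.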
Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation
Publication date: 4 June 2024
Topic: Object detection
Paper: https://arxiv.org/pdf/2406.02548v1.pdf
GitHub: https://github.com/aminebdj/openyolo3d
Description:
We address open-vocabulary 3D instance segmentation by generating class-agnostic 3D masks for objects in the scene and associating them with text prompts. We observe that the projection of class-agnostic 3D point cloud instances already holds instance information; thus, using SAM might only result in redundancy that unnecessarily increases the inference time. We empirically find that better performance in matching text prompts to 3D masks can be achieved, and faster, with a 2D object detector. We validate Open-YOLO 3D on two benchmarks, ScanNet200 and Replica, under two scenarios: (i) with ground truth masks, where labels are required for given object proposals, and (ii) with class-agnostic 3D proposals generated from a 3D proposal network.
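The matching step can be sketched as: project each class-agnostic 3D instance's points into the image with a pinhole camera, then vote among the 2D detector's labeled boxes. The intrinsics, box format, and voting rule below are illustrative assumptions, not the paper's exact pipeline:

```python
def project_point(p, f, cx, cy):
    # Pinhole projection of a 3D camera-frame point to pixel coordinates.
    x, y, z = p
    return (f * x / z + cx, f * y / z + cy)

def label_3d_mask(points_3d, boxes_2d, f=500.0, cx=320.0, cy=240.0):
    """Assign a class to a 3D instance by projecting its points and
    voting for the 2D detector box that covers the most projections.
    boxes_2d maps label -> (x0, y0, x1, y1)."""
    votes = {}
    for p in points_3d:
        u, v = project_point(p, f, cx, cy)
        for label, (x0, y0, x1, y1) in boxes_2d.items():
            if x0 <= u <= x1 and y0 <= v <= y1:
                votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get) if votes else None
```

Because the 3D proposals already carry instance structure, a cheap box-level vote like this suffices, which is why the 2D detector beats a SAM-based pipeline on speed.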
Parameter-Inverted Image Pyramid Networks
Publication date: 6 June 2024
Topic: Image Classification
Paper: https://arxiv.org/pdf/2406.04330v1.pdf
GitHub: https://github.com/opengvlab/piip
Description:
We propose a feature interaction mechanism to allow features of different resolutions to complement each other and effectively integrate information from different spatial scales. Extensive experiments demonstrate that the PIIP achieves superior performance in tasks such as object detection, segmentation, and image classification, compared to traditional image pyramid methods and single-branch networks, while reducing computational cost. Notably, when applying our method on a large-scale vision foundation model InternViT-6B, we improve its performance by 1%-2% on detection and segmentation with only 40%-60% of the original computation. These results validate the effectiveness of the PIIP approach and provide a new technical direction for future vision computing tasks.
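The compute saving follows from pairing resolution and model size inversely: token count grows with the square of resolution, so the largest model sees the smallest input. A back-of-the-envelope sketch; the patch size of 16 and the parameter ratios are illustrative assumptions, not the paper's configurations:

```python
def flops_proxy(res, params, patch=16):
    """Rough cost of one ViT branch: number of tokens times parameter count."""
    return (res // patch) ** 2 * params

def pyramid_flops(branches):
    """Total cost of a multi-branch image pyramid.
    branches: list of (input resolution, relative parameter count)."""
    return sum(flops_proxy(r, p) for r, p in branches)
```

Pairing the 16x-larger model with the 112-pixel input makes every branch cost the same, and the whole inverted pyramid is far cheaper than running the largest model at the highest resolution, which is the intuition behind the reported 40%-60% compute figure.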
Matching Anything by Segmenting Anything
Publication date: 6 June 2024
Topic: Semantic Segmentation
Paper: https://arxiv.org/pdf/2406.04221v1.pdf
GitHub: https://github.com/siyuanliii/masa
Description:
We propose MASA, a novel method for robust instance association learning, capable of matching any objects within videos across diverse domains without tracking labels. Leveraging the rich object segmentation from the Segment Anything Model (SAM), MASA learns instance-level correspondence through exhaustive data transformations. We treat the SAM outputs as dense object region proposals and learn to match those regions from a vast image collection. We further design a universal MASA adapter which can work in tandem with foundational segmentation or detection models and enable them to track any detected objects. Those combinations present strong zero-shot tracking ability in complex domains.
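The label-free association objective can be sketched as mutual-nearest-neighbour matching between region embeddings of two augmented views of the same image; the embeddings and the cosine-similarity choice below are illustrative, not MASA's exact formulation:

```python
import math

def cos_sim(u, v):
    # Cosine similarity between two plain-Python vectors.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mutual_matches(view_a, view_b):
    """Pairs (i, j) where region i of view A and region j of view B are
    each other's nearest neighbour in cosine similarity. Regions that
    survive this symmetric test are treated as the same instance."""
    a_to_b = [max(range(len(view_b)), key=lambda j: cos_sim(a, view_b[j]))
              for a in view_a]
    b_to_a = [max(range(len(view_a)), key=lambda i: cos_sim(view_a[i], b))
              for b in view_b]
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]
```

Since the two views come from known transformations of one image, the matched pairs are free positive labels for instance-level correspondence, with no tracking annotations needed.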
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation
Publication date: 3 April 2024
Topic: Image Generation
Paper: https://arxiv.org/pdf/2404.02733v2.pdf
GitHub: https://github.com/instantstyle/instantstyle
Description:
In this paper, we commence by examining several compelling yet frequently overlooked observations. We then proceed to introduce InstantStyle, a framework designed to address these issues through the implementation of two key strategies: 1) A straightforward mechanism that decouples style and content from reference images within the feature space, predicated on the assumption that features within the same space can be either added to or subtracted from one another. 2) The injection of reference image features exclusively into style-specific blocks, thereby preventing style leaks and eschewing the need for cumbersome weight tuning, which often characterizes more parameter-heavy designs.
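Strategy 1 rests on vector arithmetic in a shared embedding space. A minimal sketch, assuming a CLIP-style space where image and text features are comparable; these helpers are not the authors' API:

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (zero vectors pass through).
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def style_direction(ref_image_feat, content_text_feat):
    """Subtract the content (text) embedding from the reference-image
    embedding; the normalized residual is taken as the style feature,
    relying on the assumption that features in the same space can be
    added to or subtracted from one another."""
    return l2_normalize([a - b for a, b in zip(ref_image_feat, content_text_feat)])
```

The residual is then injected only into style-specific blocks of the diffusion model, which is strategy 2; this sketch covers only the decoupling step.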
Language Guided Domain Generalized Medical Image Segmentation
Publication date: 1 April 2024
Topic: Contrastive Learning
Paper: https://arxiv.org/pdf/2404.01272v2.pdf
GitHub: https://github.com/shahinakk/lg_sdg
Description:
In this paper, we propose an approach that explicitly leverages textual information by incorporating a contrastive learning mechanism guided by the text encoder features to learn a more robust feature representation. We assess the effectiveness of our text-guided contrastive feature alignment technique in various scenarios, including cross-modality, cross-sequence, and cross-site settings for different segmentation tasks. Our approach achieves favorable performance against existing methods in literature.
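The text-guided alignment can be sketched as pulling pooled image features toward their class's text-encoder embedding and away from the other classes. The softmax-over-similarities form below is a common instantiation assumed for illustration, not the paper's exact loss:

```python
import math

def cos_sim(u, v):
    # Cosine similarity between two plain-Python vectors.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def text_guided_loss(img_feat, text_feats, label, tau=0.1):
    """Cross-entropy over temperature-scaled cosine similarities between
    an image feature and fixed per-class text-encoder embeddings. The
    text anchors are domain-invariant, which is what makes the aligned
    image features robust across modalities, sequences, and sites."""
    logits = [cos_sim(img_feat, t) / tau for t in text_feats]
    mx = max(logits)
    log_z = mx + math.log(sum(math.exp(l - mx) for l in logits))
    return log_z - logits[label]
```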
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Publication date: 26 March 2024
Topic: Object detection
Paper: https://arxiv.org/pdf/2403.17695v1.pdf
GitHub: https://github.com/chenhongyiyang/plainmamba
Description:
In this paper, we further adapt the selective scanning process of Mamba to the visual domain, enhancing its ability to learn features from two-dimensional images by (i) a continuous 2D scanning process that improves spatial continuity by ensuring adjacency of tokens in the scanning sequence, and (ii) direction-aware updating which enables the model to discern the spatial relations of tokens by encoding directional information. Our architecture is designed to be easy to use and easy to scale, formed by stacking identical PlainMamba blocks, resulting in a model with constant width throughout all layers.
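The continuous scanning idea can be illustrated with a snake (boustrophedon) ordering, which keeps every consecutive pair of tokens spatially adjacent, unlike a plain raster scan that jumps between row ends. This is a simplified illustration of property (i), not the paper's full direction-aware scheme:

```python
def continuous_2d_scan(h, w):
    """Visit order over an h x w token grid: even rows left-to-right,
    odd rows right-to-left, so the flattened 1D sequence never breaks
    2D spatial adjacency between neighbouring tokens."""
    order = []
    for r in range(h):
        cols = range(w) if r % 2 == 0 else range(w - 1, -1, -1)
        order.extend((r, c) for c in cols)
    return order

def is_spatially_continuous(order):
    """True if every consecutive pair of grid positions is 4-adjacent."""
    return all(abs(r1 - r2) + abs(c1 - c2) == 1
               for (r1, c1), (r2, c2) in zip(order, order[1:]))
```

A raster scan fails the adjacency test at each row boundary, which is exactly the discontinuity the continuous 2D scan removes.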
Targeted Visualization of the Backbone of Encoder LLMs
Publication date: 26 March 2024
Topic: Image Classification
Paper: https://arxiv.org/pdf/2403.18872v1.pdf
GitHub: https://github.com/LucaHermes/DeepView
Description:
We investigate the application of DeepView, a method for visualizing a part of the decision function together with a data set in two dimensions, to the NLP domain. While in previous work, DeepView has been used to inspect deep image classification models, we demonstrate how to apply it to BERT-based NLP classifiers and investigate its usability in this domain, including settings with adversarially perturbed input samples and pre-trained, fine-tuned, and multi-task models.
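The quantity DeepView visualizes can be sketched as a classifier evaluated over a 2D grid after dimensionality reduction. The projection step itself (e.g. a learned 2D embedding plus an inverse mapping) is omitted here; this toy evaluates a classifier that already lives in 2D:

```python
def decision_grid(classify, xs, ys):
    """Class label at every grid point. Rendering this grid as colours,
    with the data set scattered on top, yields the kind of 2D
    decision-region map DeepView produces for high-dimensional models."""
    return [[classify(x, y) for x in xs] for y in ys]
```

For a BERT-based classifier the same idea applies once inputs are projected to 2D and grid points are mapped back to the model's input space, which is the part DeepView contributes.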
Samba: Semantic Segmentation of Remotely Sensed Images with State Space Model
Publication date: 2 April 2024
Topic: Semantic Segmentation
Paper: https://arxiv.org/pdf/2404.01705v1.pdf
GitHub: https://github.com/zhuqinfeng1999/samba
Description:
Inspired by Mamba, which adopts a State Space Model (SSM) to efficiently capture global semantic information, we propose a semantic segmentation framework for high-resolution remotely sensed images, named Samba. Samba utilizes an encoder-decoder architecture, with Samba blocks serving as the encoder for efficient multi-level semantic information extraction, and UperNet functioning as the decoder. We evaluate Samba on the LoveDA dataset, comparing its performance against top-performing CNN and ViT methods. The results show that Samba achieves unparalleled performance on LoveDA.