Data science research papers

@data_science_research_papers

Malaysia: 29 028 · English: 139 562 · Education: 61 533

Advertising posts

1 510

Subscribers

+3 (24 hours)

+23 (7 days)

+95 (30 days)

368

Post views

~ 102 (24 hours)

~ 126 (48 hours)

24.39%

Engagement rate

6.8% (24 hours)

8.3% (48 hours)

Mentions

No data (7 days)

No data (30 days)

No data

Messages per day

~ 1

Reactions

~ 4

Comments

~ 1

Reposts



Seg-LSTM: Performance of xLSTM for Semantic Segmentation of Remotely Sensed Images
Publication date: 20 June 2024
Topic: Semantic Segmentation
Paper: https://arxiv.org/pdf/2406.14086v1
GitHub: https://github.com/zhuqinfeng1999/seg-lstm
Description: Our study represents the first attempt to evaluate the effectiveness of Vision-LSTM in the semantic segmentation of remotely sensed images. This evaluation is based on a specifically designed encoder-decoder architecture named Seg-LSTM, and comparisons with state-of-the-art segmentation networks. Our study found that Vision-LSTM's performance in semantic segmentation was limited and generally inferior to Vision-Transformers-based and Vision-Mamba-based models in most comparative tests.


SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals
Publication date: 28 May 2024
Topic: Contrastive Learning
Paper: https://arxiv.org/pdf/2405.17766v1.pdf
GitHub: https://github.com/rthapa84/sleepfm-codebase
Description: We show that a novel leave-one-out approach for contrastive learning significantly improves downstream task performance compared to representations from standard pairwise contrastive learning. A logistic regression model trained on SleepFM's learned embeddings outperforms an end-to-end trained convolutional neural network (CNN) on sleep stage classification (macro AUROC 0.88 vs 0.72 and macro AUPRC 0.72 vs 0.48) and sleep disordered breathing detection (AUROC 0.85 vs 0.69 and AUPRC 0.77 vs 0.61). Notably, the learned embeddings achieve 48% top-1 average accuracy in retrieving the corresponding recording clips of other modalities from 90,000 candidates.
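The description above does not spell out the leave-one-out objective. The sketch below is only an illustration under one assumption: each modality's embedding is contrasted against the average of the remaining modalities' embeddings for the same clip, with the other clips in the batch acting as negatives. The function names, the temperature value, and the toy data are hypothetical and are not taken from the SleepFM codebase.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_vectors(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(vals) / n for vals in zip(*vectors)]


def leave_one_out_loss(embeddings, temperature=0.1):
    """Average contrastive loss; embeddings[m][i] is sample i under modality m.

    Assumption (not confirmed by the post): the positive target for sample i
    of modality m is the mean of sample i's embeddings in all other modalities.
    """
    num_modalities = len(embeddings)
    batch_size = len(embeddings[0])
    total = 0.0
    for m in range(num_modalities):
        others = [embeddings[k] for k in range(num_modalities) if k != m]
        targets = [
            mean_vectors([modality[j] for modality in others])
            for j in range(batch_size)
        ]
        for i in range(batch_size):
            logits = [dot(embeddings[m][i], t) / temperature for t in targets]
            # Numerically stable log-sum-exp over all candidates in the batch.
            max_logit = max(logits)
            log_norm = max_logit + math.log(
                sum(math.exp(value - max_logit) for value in logits)
            )
            # Cross-entropy with the matching clip (index i) as the positive.
            total += log_norm - logits[i]
    return total / (num_modalities * batch_size)


# Toy data: three modalities, two clips, 2-D embeddings.
aligned = [[[1.0, 0.0], [0.0, 1.0]] for _ in range(3)]
misaligned = [[[0.0, 1.0], [1.0, 0.0]]] + [[[1.0, 0.0], [0.0, 1.0]] for _ in range(2)]

print(leave_one_out_loss(aligned) < leave_one_out_loss(misaligned))
```

In this toy setup the loss is lower when all modalities agree on which clip is which, which is the behaviour a contrastive objective of this kind is meant to reward.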


Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis
Publication date: 5 June 2024
Topic: Representation Learning
Paper: https://arxiv.org/pdf/2406.03430v1.pdf
GitHub: https://github.com/xmindflow/awesome_mamba
Description: Capitalizing on the advances in computer vision, medical imaging has heralded a new epoch with Mamba models. Intending to help researchers navigate the surge, this survey seeks to offer an encyclopedic review of Mamba models in medical imaging. Specifically, we start with a comprehensive theoretical review forming the basis of SSMs, including Mamba architecture and its alternatives for sequence modeling paradigms in this context. Next, we offer a structured classification of Mamba models in the medical field and introduce a diverse categorization scheme based on their application, imaging modalities, and targeted organs.


Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation
Publication date: 4 June 2024
Topic: Object detection
Paper: https://arxiv.org/pdf/2406.02548v1.pdf
GitHub: https://github.com/aminebdj/openyolo3d
Description: We address this task by generating class-agnostic 3D masks for objects in the scene and associating them with text prompts. We observe that the projection of class-agnostic 3D point cloud instances already holds instance information; thus, using SAM might only result in redundancy that unnecessarily increases the inference time. We empirically find that a better performance of matching text prompts to 3D masks can be achieved in a faster fashion with a 2D object detector. We validate our Open-YOLO 3D on two benchmarks, ScanNet200 and Replica, under two scenarios: (i) with ground truth masks, where labels are required for given object proposals, and (ii) with class-agnostic 3D proposals generated from a 3D proposal network.


👍 1


Parameter-Inverted Image Pyramid Networks
Publication date: 6 June 2024
Topic: Image Classification
Paper: https://arxiv.org/pdf/2406.04330v1.pdf
GitHub: https://github.com/opengvlab/piip
Description: We propose a feature interaction mechanism to allow features of different resolutions to complement each other and effectively integrate information from different spatial scales. Extensive experiments demonstrate that the PIIP achieves superior performance in tasks such as object detection, segmentation, and image classification, compared to traditional image pyramid methods and single-branch networks, while reducing computational cost. Notably, when applying our method on a large-scale vision foundation model InternViT-6B, we improve its performance by 1%-2% on detection and segmentation with only 40%-60% of the original computation. These results validate the effectiveness of the PIIP approach and provide a new technical direction for future vision computing tasks.


Matching Anything by Segmenting Anything
Publication date: 6 June 2024
Topic: Semantic Segmentation
Paper: https://arxiv.org/pdf/2406.04221v1.pdf
GitHub: https://github.com/siyuanliii/masa
Description: We propose MASA, a novel method for robust instance association learning, capable of matching any objects within videos across diverse domains without tracking labels. Leveraging the rich object segmentation from the Segment Anything Model (SAM), MASA learns instance-level correspondence through exhaustive data transformations. We treat the SAM outputs as dense object region proposals and learn to match those regions from a vast image collection. We further design a universal MASA adapter which can work in tandem with foundational segmentation or detection models and enable them to track any detected objects. Those combinations present strong zero-shot tracking ability in complex domains.


👍 1


InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation
Publication date: 3 April 2024
Topic: Image Generation
Paper: https://arxiv.org/pdf/2404.02733v2.pdf
GitHub: https://github.com/instantstyle/instantstyle
Description: In this paper, we commence by examining several compelling yet frequently overlooked observations. We then proceed to introduce InstantStyle, a framework designed to address these issues through the implementation of two key strategies:
1) A straightforward mechanism that decouples style and content from reference images within the feature space, predicated on the assumption that features within the same space can be either added to or subtracted from one another.
2) The injection of reference image features exclusively into style-specific blocks, thereby preventing style leaks and eschewing the need for cumbersome weight tuning, which often characterizes more parameter-heavy designs.
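The first strategy above rests on plain vector arithmetic in a shared feature space. As a minimal sketch of that idea only, the snippet below subtracts a content vector from a reference-image vector to leave a style residual. The toy vectors stand in for real encoder outputs; the post does not say which encoders produce them, and `subtract_features` is a hypothetical helper, not part of the InstantStyle repository.

```python
def subtract_features(image_features, content_features):
    """Return image_features minus content_features, element by element.

    Assumes both vectors live in the same feature space, which is the
    premise stated in the description above.
    """
    if len(image_features) != len(content_features):
        raise ValueError("features must share the same dimensionality")
    return [i - c for i, c in zip(image_features, content_features)]


# Toy stand-ins for encoder outputs (hypothetical values).
reference_image = [3.0, 1.0, 2.0]   # carries both style and content
content_only = [1.0, 1.0, 0.0]      # carries the content to be removed

style_residual = subtract_features(reference_image, content_only)
print(style_residual)  # [2.0, 0.0, 2.0]
```

Whether such a residual actually isolates style depends on how well the two encoders share a space, which this toy example does not model.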


👍 1


Language Guided Domain Generalized Medical Image Segmentation
Publication date: 1 April 2024
Topic: Contrastive Learning
Paper: https://arxiv.org/pdf/2404.01272v2.pdf
GitHub: https://github.com/shahinakk/lg_sdg
Description: In this paper, we propose an approach that explicitly leverages textual information by incorporating a contrastive learning mechanism guided by the text encoder features to learn a more robust feature representation. We assess the effectiveness of our text-guided contrastive feature alignment technique in various scenarios, including cross-modality, cross-sequence, and cross-site settings for different segmentation tasks. Our approach achieves favorable performance against existing methods in literature.


👍 1


PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Publication date: 26 March 2024
Topic: Object detection
Paper: https://arxiv.org/pdf/2403.17695v1.pdf
GitHub: https://github.com/chenhongyiyang/plainmamba
Description: In this paper, we further adapt the selective scanning process of Mamba to the visual domain, enhancing its ability to learn features from two-dimensional images by (i) a continuous 2D scanning process that improves spatial continuity by ensuring adjacency of tokens in the scanning sequence, and (ii) direction-aware updating which enables the model to discern the spatial relations of tokens by encoding directional information. Our architecture is designed to be easy to use and easy to scale, formed by stacking identical PlainMamba blocks, resulting in a model with constant width throughout all layers.
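To make the adjacency property in (i) concrete, here is a small sketch that compares a serpentine ordering of a token grid with a plain row-by-row raster ordering. The serpentine path is just one simple ordering in which every consecutive pair of tokens is a grid neighbour; it is an assumption for illustration and may differ from the scan paths PlainMamba actually uses. All function names are hypothetical.

```python
def continuous_scan_order(height, width):
    """Return (row, col) positions in a serpentine order.

    Even rows run left to right and odd rows run right to left, so the
    path never jumps across the grid when it moves to the next row.
    """
    order = []
    for row in range(height):
        cols = range(width) if row % 2 == 0 else range(width - 1, -1, -1)
        order.extend((row, col) for col in cols)
    return order


def raster_scan_order(height, width):
    """Return (row, col) positions row by row, always left to right."""
    return [(row, col) for row in range(height) for col in range(width)]


def all_steps_adjacent(order):
    """True when each consecutive pair of positions is one grid step apart."""
    return all(
        abs(r1 - r2) + abs(c1 - c2) == 1
        for (r1, c1), (r2, c2) in zip(order, order[1:])
    )


print(continuous_scan_order(2, 3))
# [(0, 0), (0, 1), (0, 2), (1, 2), (1, 1), (1, 0)]
print(all_steps_adjacent(continuous_scan_order(4, 5)))  # True
print(all_steps_adjacent(raster_scan_order(4, 5)))      # False
```

The raster ordering fails the check because the end of each row is followed by the start of the next one, several columns away.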


Targeted Visualization of the Backbone of Encoder LLMs
Publication date: 26 March 2024
Topic: Image Classification
Paper: https://arxiv.org/pdf/2403.18872v1.pdf
GitHub: https://github.com/LucaHermes/DeepView
Description: We investigate the application of DeepView, a method for visualizing a part of the decision function together with a data set in two dimensions, to the NLP domain. While in previous work, DeepView has been used to inspect deep image classification models, we demonstrate how to apply it to BERT-based NLP classifiers and investigate its usability in this domain, including settings with adversarially perturbed input samples and pre-trained, fine-tuned, and multi-task models.
