Data Scientology
Hot data science related posts every hour. Chat: https://telegram.me/r_channels Contacts: @lgyanf
Subscribers: 1,144
Last 24 hours: no data
Last 7 days: -5
Last 30 days: -9
Subscriber acquisition
| Month | New subscribers | Mentioned in channels |
| July '26 | 0 | 0 |
| June '26 | +6 | 0 |
| May '26 | +4 | 0 |
| April '26 | +30 | 0 |
| March '26 | +40 | 0 |
| February '26 | +36 | 0 |
| January '26 | +13 | 0 |
| December '25 | +10 | 0 |
| November '25 | +7 | 0 |
| October '25 | +4 | 0 |
| September '25 | +3 | 0 |
| August '25 | +8 | 0 |
| July '25 | +10 | 0 |
| June '25 | +10 | 0 |
| May '25 | +9 | 0 |
| April '25 | +3 | 0 |
| March '25 | +7 | 0 |
| February '25 | +3 | 0 |
| January '25 | +4 | 0 |
| December '24 | +6 | 0 |
| November '24 | +2 | 0 |
| October '24 | +8 | 0 |
| September '24 | +8 | 0 |
| August '24 | +11 | 0 |
| July '24 | +13 | 0 |
| June '24 | +11 | 0 |
| May '24 | +10 | 0 |
| April '24 | +12 | 0 |
| March '24 | +20 | 0 |
| February '24 | +19 | 0 |
| January '24 | +19 | 0 |
| December '23 | +17 | 6 |
| November '23 | +23 | 3 |
| October '23 | +27 | 3 |
| September '23 | +21 | 0 |
| August '23 | +9 | 0 |
| July '23 | +18 | 0 |
| June '23 | +20 | 0 |
| May '23 | +26 | 0 |
| April '23 | +53 | 0 |
| March '23 | +15 | 0 |
| February '23 | +12 | 0 |
| January '23 | +18 | 0 |
| December '22 | +31 | 0 |
| November '22 | +14 | 0 |
| October '22 | +47 | 0 |
| September '22 | +45 | 0 |
| August '22 | +18 | 0 |
| July '22 | +18 | 0 |
| June '22 | +18 | 0 |
| May '22 | +27 | 0 |
| April '22 | +43 | 0 |
| March '22 | +47 | 0 |
| February '22 | +89 | 0 |
| January '22 | +57 | 0 |
| December '21 | +36 | 0 |
| November '21 | +27 | 0 |
| October '21 | +43 | 0 |
| September '21 | +64 | 0 |
| August '21 | +55 | 0 |
| July '21 | +28 | 0 |
| June '21 | +26 | 0 |
| May '21 | +29 | 0 |
| April '21 | +26 | 0 |
| March '21 | +49 | 0 |
| February '21 | +21 | 0 |
| January '21 | +45 | 0 |
| December '20 | +882 | 0 |
| Date | Subscriber growth | Mentions | Channels |
| 03 July | 0 | | |
| 02 July | 0 | | |
| 01 July | 0 | | |
Channel posts
Books/Resources to improve mathematical foundations for ML research [D]
I am a mid to late stage PhD student in ML. I've known this before, but only recently I started feeling this urgently: my mathematical foundations are shaky, because I kept "learning-things-as-I-go" when working on various problems. I likely have only a year or two left until I graduate, and before I do so, I want to really dedicate some time and focus to brush up on the fundamentals.
Primarily, I want to improve my knowledge in Linear Algebra, Probability Theory, and Functional Analysis.
For Lin. alg., I am looking at "Linear Algebra Done Right", and I think this book is sufficient for the topic, unless anyone thinks otherwise.
I am not sure where to start for probability, as well as functional analysis. Rudin's books give me headaches. I instead started reading "A primer on RKHS" (https://arxiv.org/abs/1408.0952) to "dip my toe" into functional analysis.
Apart from the above, I might re-read PRML book (I've only read specific chapters before), and try to finish Pat Kidger's Just-Know-Stuff list (https://kidger.site/thoughts/just-know-stuff).
Thoughts? Anyone have any book/resource recommendations? Someone told me to look into "the bright side of mathematics" on YouTube, anyone ever go through the videos there?
I'm aware finding good, digestible resources is less than 10% of the challenge. The difficult part is sticking through and actually reading/working through these topics, while still juggling other academic responsibilities.
https://redd.it/1ulmy9g
@datascientology
| 2 | WIP: Currently building an app to teach (French) sign language using computer vision
https://redd.it/1uk1lk7
@datascientology | 40 |
| 3 | A physical, working LeNet-1 (1989) built from transparent PCBs, glass and aluminium.
https://redd.it/1uhr1g1
@datascientology | 59 |
| 4 | No text... | 93 |
| 5 | ShadeNet 28M — Dual-mode PBR material estimation from any RGB image
https://redd.it/1ufmhd4
@datascientology | 77 |
| 6 | DeepSWE: new benchmark looking at how well today's frontier models can actually write code [R]
DeepSWE delivers four advances over existing public benchmarks:
Contamination free: Tasks are written from scratch, not adapted from existing commits or PRs, so no model has seen the solution during pretraining.
High diversity: Tasks span a broad pool of 91 repositories across 5 languages.
Real-world complexity: Prompts are ~half the length of SWE-bench Pro's, yet solutions require 5.5x more code and ~2x more output tokens.
Reliable verification: Verifiers are hand-written to test software behavior rather than implementation details.
The result is a benchmark that reflects how today's frontier coding agents actually perform in software engineering work.
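
To make the "behavior rather than implementation details" point concrete, here is a minimal sketch of what a behavior-level verifier could look like. It is not taken from DeepSWE; the task (order-preserving de-duplication) and the names `dedupe_loop`, `dedupe_dict`, and `verify_behavior` are hypothetical and chosen only for illustration.

```python
def dedupe_loop(items):
    """Remove duplicates, keeping first occurrences, with an explicit loop."""
    seen, out = set(), []
    for x in items:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out


def dedupe_dict(items):
    """Same behavior via dict key ordering (insertion-ordered in Python 3.7+)."""
    return list(dict.fromkeys(items))


def verify_behavior(dedupe):
    """Hypothetical verifier: checks outputs only, not how they were produced."""
    cases = [
        ([], []),
        ([1, 1, 2], [1, 2]),
        (["b", "a", "b"], ["b", "a"]),
    ]
    return all(dedupe(list(inp)) == expected for inp, expected in cases)
```

Both implementations pass `verify_behavior` even though they share no code, while a function that only sorts its input fails. A verifier that instead asserted on internal details (for example, that a `set` is used) would reject one of the two correct solutions.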
https://preview.redd.it/lacvagyr159h1.png?width=1373&format=png&auto=webp&s=6514340a15d51d7f03da733f08fb3f6a302cac75
It's open-source: https://github.com/datacurve-ai/deep-swe
https://redd.it/1ue0hlp
@datascientology | 92 |
| 7 | I've also been looking for the plane!
https://redd.it/1ucd6rd
@datascientology | 88 |
| 8 | C++ tracker for small aerial targets
https://redd.it/1u9eder
@datascientology | 114 |
| 9 | Next-Latent Prediction Transformers [R]
Microsoft Research Preprint
Next-token prediction is myopic. What if transformers learn to predict their own next latent state?
Microsoft Research present Next-Latent Prediction (NextLat): a self-supervised learning method that teaches transformers to form compact world models for reasoning and planning. It also unlocks up to 3.3x faster inference via self-speculative decoding!
On top of next-token prediction, NextLat trains the transformer to predict its own next latent state given the current latent state and next token.
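
As a rough illustration of the objective described in the sentence above, the toy sketch below scores how well a predictor maps (current latent, next token) to the next latent. It is not the paper's architecture or loss: the linear predictor, the mean-squared-error criterion, and the names `next_latent_loss`, `W_h`, and `W_x` are all assumptions made for this example.

```python
import numpy as np


def next_latent_loss(latents, next_token_emb, W_h, W_x):
    """Mean squared error between predicted and actual next latent states.

    latents:        (T, d) hidden states h_1..h_T from some sequence model
    next_token_emb: (T-1, e) embeddings of the tokens x_2..x_T
    W_h, W_x:       weights of a hypothetical linear latent predictor
    """
    h_curr = latents[:-1]  # h_t
    h_next = latents[1:]   # h_{t+1}, used here as the regression target
    # Predict h_{t+1} from the current latent and the next token.
    h_pred = h_curr @ W_h + next_token_emb @ W_x
    return float(np.mean((h_pred - h_next) ** 2))


# Usage: 5 time steps, latent size 3, token embedding size 2.
rng = np.random.default_rng(0)
loss = next_latent_loss(
    rng.normal(size=(5, 3)),
    rng.normal(size=(4, 2)),
    rng.normal(size=(3, 3)),
    rng.normal(size=(2, 3)),
)
```

In training, a term like this would be added to the usual next-token loss; how the two are weighted and whether gradients flow through the target are design choices this sketch does not cover.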
NextLat has a few key benefits:
1. Representation Learning: NextLat encourages transformers to compress history into compact belief states.
2. Better Data Efficiency: predicting in latent space provides denser supervision than predicting one-hot tokens.
3. Faster Inference: via recursive multi-step lookahead.
I'm super excited about this work. Please do check it out below:
💬 Blog: https://jaydenteoh.github.io/blog/2026/nextlat
💻 Code: https://github.com/JaydenTeoh
📝 Paper: https://arxiv.org/abs/2511.05963
https://redd.it/1u84mio
@datascientology | 96 |
| 10 | How does the ML community view evolutionary algorithm research? Career implications of an EA PhD? [D]
How does the ML research community feel about evolutionary algorithms? Should I do a PhD in this area?
Quick remark: I know some people in the ML community dunk on evolutionary algorithms because there’s often a better optimizer, but they do have their place, which is what researchers in my community aim to quantify.
Background:
I just finished my first year as a mathematics master’s student working on the theory of evolutionary algorithms (EAs)/randomized search heuristics. I’m fortunate to be on a research assistantship and have already coauthored several papers in strong conferences in our area.
I’ve always been more interested in classical ML/deep learning theory but haven’t had anyone to work with. Researchers in my field, including my advisor, occasionally publish in mainstream ML venues such as AAAI and NeurIPS, but it’s primarily the EA venues.
For a while now, I’ve been independently studying deep learning and statistical learning theory, and I have found intersections with my current research that I plan to pursue for my thesis.
With my current CV, it’s looking like I could get into some of the best PhD programs in my area, but I’m wondering if I should try to go to a more ML-centric PhD, even if it means going to a less prestigious institution/group for the sake of my career.
I’m not sure yet what I want to do after my PhD and a possible postdoc, but I want to keep myself competitive for top-tier opportunities.
What implications might doing an EA PhD have for my career? With strong EA publications, could I get into a good ML PhD program if I pitch myself appropriately? Could staying somewhat outside mainstream ML actually be a good career move, given how competitive and crowded ML has become?
https://redd.it/1u66q3l
@datascientology | 95 |
| 11 | Which software or tools are used to make these kinds of diagrams or animations?
https://redd.it/1u3bh7r
@datascientology | 103 |
| 12 | Introducing Papers Without Code [P]
Hi, Niels here from the open-source team at Hugging Face.
I've recently relaunched paperswithcode.co as a source for finding the state of the art (SOTA) across various AI domains, from 3D generation to AI agents. This is done by automatically parsing research papers published on arXiv/Hugging Face, enabling leaderboards to be created. See BrowseComp below as an example (a scatter plot and a table are available for each benchmark).
- Scatter plot (you can hover over the dots to see the models):
https://preview.redd.it/9rz2r3ffcf6h1.png?width=2880&format=png&auto=webp&s=b3f8e7a870802f6ef8227ecc0619e9e1057554b0
- Table:
https://preview.redd.it/qoqriddw5f6h1.png?width=2862&format=png&auto=webp&s=a0034574f693847537037013672fb61daf27b16e
As you can see, I've added support for viewing evals for closed-source models, too, given that many benchmarks are nowadays dominated by them, like GPT-5.5 and Mythos 5. You can always disable viewing closed-source evals with a toggle or in your PwC settings:
https://preview.redd.it/p3k6jt6q6f6h1.png?width=1582&format=png&auto=webp&s=40149e51d6b326a77e53e33baf70d9850b3de365
When you turn them off, here's what the open model leaderboard looks like:
https://preview.redd.it/tg42sin36f6h1.png?width=2838&format=png&auto=webp&s=1330a117ae9b4e0ce6d459493ae9e8f64107310a
Closed-source papers are treated as regular "papers", although they can be any source, like a blog post (given that PwC supports submitting any source beyond arXiv). See the GPT-5.5 or Mythos 5 papers as examples, with their evals at the bottom. Notice the "closed" tag on their evals. Hence, you could jokingly call these "papers without code".
Let me know what you think of this, and whether anything needs to be changed or added!
Kind regards,
Niels
https://redd.it/1u1wq0a
@datascientology | 132 |
| 13 | Greater than 80% of researchers at CVPR are Chinese. This speaks volumes on the Chinese nexus in research, and something needs to be done about it. [D]
There are coordinated efforts where people have favoured and jeopardised the double blind review process.
No doubt there is great talent among these 80%, but we have to acknowledge that non-Chinese have been sabotaged, and this was also reflected in the recent leaks of the reviewer data from the top ML conferences (won’t name them but they start with i).
I have also personally faced such discrimination and had a discussion on the subreddit asking others if they have witnessed something similar. It was shocking to know that this is occurring on a large scale.
The question is how do we stop it, or highlight this? We have to preserve the sanctity of the research.
https://redd.it/1u00gdg
@datascientology | 127 |
| 14 | No text... | 139 |
| 15 | 3D Reconstruction from Video - Class Final Project
https://redd.it/1tx9oss
@datascientology | 147 |
| 16 | Built an open-source hub of CV notebooks for almost every real-world use case and model
https://redd.it/1tvgjg0
@datascientology | 141 |
| 17 | LibreYOLO v1.2.0 epic release: 16 model families now supported
https://redd.it/1tt6pl8
@datascientology | 145 |
| 18 | It's been a decade
https://redd.it/1tqv73m
@datascientology | 176 |
| 19 | Deep Neural Network that turns any Image into a Playable Game! All on consumer GPUs.
https://redd.it/1tom3oa
@datascientology | 156 |
| 20 | How do ML practitioners select hyperparameters, architectures, etc. for self-supervised representation learning when the loss is non-monotonic? [D]
Non-contrastive SSL methods like BYOL/JEPA/data2vec seem promising, but I have no idea what is being learned, or how well; it’s models all the way down. Maybe I’ve got supervised tasks for which I’d like to see transfer, and I can evaluate linear probe/KNN results during training, but that seems like a way to efficiently abuse researcher degrees of freedom.
I know RankMe is meant to help address this: embed some data and SVD the embedding matrix. A healthy learner should produce an embedding with a high effective rank.
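
For readers who have not seen it, the RankMe idea described above can be sketched in a few lines: take the singular values of the embedding matrix, normalize them into a distribution, and exponentiate its Shannon entropy. This is a minimal sketch using NumPy, with the function name `rankme` and the `eps` default chosen here as assumptions rather than taken from any reference implementation.

```python
import numpy as np


def rankme(embeddings, eps=1e-7):
    """Effective rank: exp of the entropy of the normalized singular values.

    embeddings: (n_samples, dim) matrix of embedded data points.
    """
    s = np.linalg.svd(embeddings, compute_uv=False)  # singular values only
    p = s / np.sum(s) + eps                          # normalize, avoid log(0)
    return float(np.exp(-np.sum(p * np.log(p))))


# Usage: equal singular values give the full rank, a collapsed embedding gives ~1.
healthy = rankme(np.vstack([np.eye(4), np.eye(4)]))
collapsed = rankme(np.outer(np.ones(8), np.array([1.0, 2.0, 3.0, 4.0])))
```

With four equal singular values the score is close to 4, and for the rank-one matrix it is close to 1, which is the sense in which a "healthy" learner should score high.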
But JEPA methods already require an entropy-collapse term like Barlow Twins/SIGREG, so the RankMe criterion just becomes part of training. It gets absorbed into a loss which wasn’t monotonic to begin with, and I ought to be able to inflate it by increasing the penalty weight. Surely it’s no longer an effective criterion, right? What else is there?
https://redd.it/1tmprdm
@datascientology | 146 |
