Henok | Neural Nets

Group: https://t.me/neural_netss_chat

Subscribers: 2 248
Change: -3 (24 hours), -8 (7 days), -31 (30 days)

Post views: 922 (24 hours: no data; 48 hours: no data)

Engagement rate: 41.01%

Posts per day: no data

Ads index (beta)

Posts archive

Repost from Birhan Nega

The Strait of Hormuz of football 🔥

Repost from Dagmawi Babi

Introducing Papers API • scholarxiv.com/developers We hosted 3,032,697+ (3M+) papers so you don't have to explore research at the rate of one query per 3 seconds (arXiv's API limit). Instead, you can explore research at 3,600 queries per hour. That's one query every single second, every day! With that kind of rate, you can imagine what kind of research agents and products you can build! Filter by title, author, category, abstract, or date. Get clean, structured metadata containing titles, authors, abstracts, categories, PDF links and more, without parsing XML or rate-limiting yourself. Alongside the Papers API, we're also launching our developers platform. This is where you can manage your API keys, track usage, try the playground and explore the documentation. There's a lot more we're building, and we can't wait to see what you're going to build with this. #Launch #PapersAPI #DevelopersPlatform @ScholarXIV
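To make the rate figures concrete: one query per 3 seconds is 1,200 queries per hour, while 3,600 per hour is one per second. Below is a minimal sketch of a client-side pacer that stays under an hourly limit like that. The `Pacer` class and `queries_per_hour` helper are hypothetical names for illustration, and nothing here assumes a real ScholarXIV endpoint or parameter.

```python
import time


class Pacer:
    """Space out calls so at most `per_hour` start per hour (hypothetical helper)."""

    def __init__(self, per_hour, clock=time.monotonic, sleep=time.sleep):
        self.interval = 3600.0 / per_hour  # minimum seconds between calls
        self._clock = clock
        self._sleep = sleep
        self._next_ok = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self._clock()
        if self._next_ok is not None and now < self._next_ok:
            self._sleep(self._next_ok - now)
            now = max(self._clock(), self._next_ok)
        self._next_ok = now + self.interval


def queries_per_hour(seconds_per_query):
    """Convert a 'one query every N seconds' limit into queries per hour."""
    return 3600.0 / seconds_per_query
```

In use, you would call `pacer.wait()` before each request, whatever HTTP client and endpoint you end up using.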

Today in Bahir Dar

Diogenes the Cynic's story is interesting. If I were a member of parliament, I'd have entered with a lit lantern 😂 https://penelope.uchicago.edu/encyclopaedia_romana/greece/hetairai/diogenes.html

DataSpires_JD_CTO_Technical_Lead-3.pdf (0.96 KB)

So for the World Cup, I got 2/2 predictions correct 😎. I mean, who is stopping me now? I dare you to ask me who the next PM is going to be.

It's actually very fast, but I'm not sure how big the throughput vs. quality trade-off is.

Diffusion Gemma is really nice. It's making me go back to my abandoned diffusion-based text generation projects. https://deepmind.google/models/gemma/diffusiongemma/

New 4B translation model from Hasab for a few Ethiopian languages. It's good to see that almost everyone working on AI/ML in Ethiopia is releasing something. This will help a lot in making progress. Now tag the Ethiopian AI Institute to release something too 😁 https://huggingface.co/hasab-ai/YehaTranslate

I'll share the results of my exploration this weekend. Hopefully a long report

My weekly GPU usage📊

~1267.2 kWh

According to Gemini, this could power an average household refrigerator for 14 to 16 months, or drive a standard electric vehicle (EV) for about 6,000 km. The good thing is that the cluster runs on renewable energy, so it has a close-to-zero carbon footprint, and it recycles the heat back.
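As a quick sanity check on those numbers, here is the arithmetic they imply. Only the 1267.2 kWh, 14 to 16 months, and 6,000 km figures come from the post; the derived rates are just what those figures work out to, not measured values.

```python
WEEKLY_KWH = 1267.2        # weekly GPU energy use, from the post
HOURS_PER_WEEK = 7 * 24

# Average power draw if the energy is spread evenly over the week.
avg_power_kw = WEEKLY_KWH / HOURS_PER_WEEK

# Consumption rates implied by the comparisons quoted in the post.
ev_kwh_per_km = WEEKLY_KWH / 6000
fridge_kwh_per_month = (WEEKLY_KWH / 16, WEEKLY_KWH / 14)

print(f"average draw: {avg_power_kw:.2f} kW")
print(f"implied EV consumption: {ev_kwh_per_km:.3f} kWh/km")
print(f"implied fridge use: {fridge_kwh_per_month[0]:.1f} to "
      f"{fridge_kwh_per_month[1]:.1f} kWh/month")
```

That works out to an average draw of roughly 7.5 kW around the clock, and the EV comparison implies about 0.21 kWh per km.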

Oh wow

Saw a post about ScholarXIV from Babi. You've heard about fake citations, right? Well, ScholarXIV can help with that 🔥 But an open research problem could be how to ground LLMs so they cite exactly from the paper, without altering results, introducing errors, or attaching a citation to the wrong sentence. Maybe methods like on-policy distillation could help here: let a verifier identify where the model's claim diverges from the source, then train the model to down-weight those unsupported claims, wrong source-sentence citations, etc.

So I got stuck designing an objective function with an anti-collapse loss, so that the embedding space doesn't collapse into a useless low-dimensional subspace 😞 Maybe venting here helps lol. Tips are also welcome.
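For anyone curious what an anti-collapse term can look like, one common family is the variance plus covariance regularizer popularized by VICReg. The sketch below is a swapped-in illustration of that idea in plain Python, not the objective from the post; the function name, `target_std`, and `eps` are assumptions chosen for the example.

```python
import math


def anti_collapse_loss(embeddings, target_std=1.0, eps=1e-4):
    """Variance and covariance penalties for a batch of embeddings.

    `embeddings` is a list of equal-length lists (batch of n vectors, n >= 2).
    Returns (variance_term, covariance_term).
    """
    n, d = len(embeddings), len(embeddings[0])
    means = [sum(row[j] for row in embeddings) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in embeddings]

    # Sample covariance matrix (d x d), using the unbiased n - 1 denominator.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    # Hinge on per-dimension std: penalize dimensions whose spread falls below target.
    variance_term = sum(
        max(0.0, target_std - math.sqrt(cov[j][j] + eps)) for j in range(d)
    ) / d

    # Squared off-diagonal covariance: penalize redundant (correlated) dimensions.
    covariance_term = sum(
        cov[i][j] ** 2 for i in range(d) for j in range(d) if i != j
    ) / d

    return variance_term, covariance_term
```

A fully collapsed batch (all rows identical) gets the maximum variance penalty, while a batch with well-spread, uncorrelated dimensions gets zero on both terms. In practice these terms would be weighted and added to the main objective in an autodiff framework.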

But we got a retweet from him 😁. Not as good as Arsenal's win, but a win is a win.

I've been a fan of Sasha for quite a while. I even tried emailing him and applied to be advised by him in 2022, when he was at Cornell Tech. I didn't get a reply to that email 😂. He also has great students, and he now works at Cursor. If you follow this channel, you know I also post some of the puzzles he made, like the GPU puzzles. This was a post from Dwarkesh yesterday. Don't be shy, just email any researcher. Start today, who cares. You either get better mentors, or you watch that person get famous and flex by saying "I've emailed that guy before" 😂

Recently met @srush_nlp and he started giving me an impromptu lecture on how targeted on-policy self-distillation works. I asked him if I could record it on my iPhone. The basic idea is this: if the model made a mistake at some point in the rollout (for example, calling a tool that doesn't exist), we want to discourage this specific error, but we don't want to just learn from the final reward, because it's a very noisy signal spread out over the whole trajectory. So we have another model read this trajectory and figure out where the error was made. It simply inserts some hint tokens into the part of the trajectory right above where the mistake was made. Now, with these injected hint tokens, have the model run a forward pass. You don't have to regenerate a new rollout, aka no new decode required. The hint causes the model to assign lower probabilities to the error tokens. You then train the original model to match these new probabilities, teaching it to downweight that specific mistake.

https://x.com/dwarkesh_sp/status/2062353335529935114?s=20
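The last step in that description (train the original model to match the hinted probabilities) can be sketched on a toy three-token vocabulary. This is a simplified illustration under stated assumptions: the logits are made-up numbers, the hint-inserting model is not shown, and the update is applied directly to logits, whereas real training would update model weights through backpropagation.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distill_step(student_logits, teacher_probs, lr=1.0):
    """One gradient-descent step on KL(teacher || softmax(student_logits)).

    The gradient with respect to the student logits is (student_probs - teacher_probs).
    """
    student_probs = softmax(student_logits)
    return [z - lr * (s - t)
            for z, s, t in zip(student_logits, student_probs, teacher_probs)]


# Toy vocabulary at the error position: index 0 is the bad token
# (for example, a call to a tool that doesn't exist).
no_hint_logits = [2.0, 1.0, 0.5]     # forward pass on the original trajectory
with_hint_logits = [-1.0, 1.5, 1.0]  # forward pass with hint tokens inserted (made up)

teacher = softmax(with_hint_logits)  # target distribution, treated as fixed
before = softmax(no_hint_logits)
after = softmax(distill_step(no_hint_logits, teacher))

print(f"P(error token) before: {before[0]:.3f}, after one step: {after[0]:.3f}")
```

After the step, the probability of the error token drops and the distribution moves closer to the hinted one, which is the "downweight that specific mistake" effect in miniature.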

Repost from Ethiopian Cursor Community

Gheero (formerly iCog) is opening applications for its Applied AI & Machine Learning Residency Program, an 8-week program for people who want to build real AI systems, not just study the concepts. Residents will work on practical AI and machine learning projects, learn how models are designed, trained, evaluated, and integrated into real products, and get mentorship from experienced builders. The program focuses on applied AI, engineering discipline, problem-solving, and technical growth. Strong participants may also be considered for internship or full-time roles after the residency. To apply, send your CV and relevant documents to recruitment@gheero.et with the subject line: Applied AI & Machine Learning Residency Program.

Let's talk about the Metro. Here they have the Montreal Metro, which opened in 1966. It's so big and complex, and it serves over 1 million people daily 🤯. Now the crazy thing is, the estimated cost to build it in today's money is around $1.5 billion. That is about 1/3 of the total cost of the Renaissance Dam (I saw it took about $5 billion). So they spent this amount of money on infrastructure for just one city. Now I started thinking about what a metro like this could solve in Addis. You could go from Alem Gena to Lege Tafo, or from Saris to Gulele, easily and fast. Moreover, you would have a predictable travel time, and it would solve the "Habesha appointment" (የሃበሻ ቀጠሮ) problem, at least in Addis hehe

Summer is here btw