

There will be no singularity

Smartface, technologies and decay @antonrevyako

2 084 subscribers
Subscriber change: no data for 24 hours, no data for 7 days, -8 over 30 days


Friends from Luna Park are looking for a Data Engineer: could be you or someone you know (could be a DS, Python dev or ML engineer, if you ask me).

Palabra.ai is the first ever real-time voice interpreter: starting as a tiny team of five engineers, they built a speech-to-text solution that works 50 times faster than OpenAI's Whisper! The interpreter prototype will be released in a month and will work in Zoom with a two-second delay. Right now it's one of a kind!

They are now looking for a senior engineer who will write and optimize complex multi-node data pipelines, work on large-scale data scraping, and manage datasets that include hundreds of thousands of hours of audio data.

Key requirements:
🟣 5+ years of industry experience as a software developer/data engineer;
🟡 Python skills (including modern backend frameworks, low-level asyncio, multithreading/multiprocessing);
🔵 A good grasp of neural networks and related tools (NumPy, PyTorch, audio-related libraries such as torchaudio, librosa, etc.);
🟢 Experience in deploying, orchestrating, and scaling multi-node pipelines in the cloud.

Nice-to-haves:
🔘 Compiled languages such as Go, C/C++, and Rust;
🔘 Experience with CUDA or other GPU-related frameworks;
🔘 Audio processing-related experience.

Salary is $70k-100k, plus equity up to 1%. It's a fully remote position.

To apply or learn more, reach out to my Luna Park buddy Fedya @owlkov
👍 8
👎 1🥰 1
Snowflake's hidden gem - CTE macros

We all like to use CTEs (Common Table Expressions). They make our code cleaner and sometimes help to speed up queries. But somehow, the Snowflake documentation hides from us one beautiful behavior of CTEs that will make your life even more convenient.

What does the documentation say?
A CTE (common table expression) is a named subquery defined in a WITH clause. You can think of the CTE as a temporary view for use in the statement that defines the CTE
In other words:
- we can only use SELECT in the CTE
- the result of the CTE is a view-like object

Surely many of you have used the CTE to define some constants that are used further in the query. For example:
 
WITH
  var_cte AS (
    SELECT 'Snowflake' AS vendor
  )
SELECT *
FROM t
WHERE
  vendor = (SELECT vendor FROM var_cte)
Tolerable, but not perfect. Especially scalar subqueries. Scalar subqueries are bad practice. Avoid using them! Can this be done in a more elegant way? Yes! Look at this:

WITH
  var_cte AS ('Snowflake')
SELECT *
FROM t
WHERE
  vendor = var_cte;
Wow! The code became easier to read, and we got rid of scalar subqueries at the same time! What if we want to use several values at once? We can make an object:

WITH
  var_cte AS ({'vendor': 'Snowflake'})
SELECT * FROM t WHERE
  vendor = var_cte:vendor;
Or here's the IN analogue:

WITH
  var_cte AS (['Snowflake', 'Bigquery'])
SELECT * FROM t WHERE
  ARRAY_CONTAINS(vendor::VARIANT, var_cte)
Although it's not so beautiful anymore…
It turns out that a CTE can be not only a view-like object, but also a scalar value! Very cool, but even this is not the end:

CREATE TABLE t AS SELECT 1 AS a, 2 AS b;

WITH
  var_cte AS (a+b)
SELECT
  var_cte
FROM t;
As a result, we will get a table with a var_cte column and a value of 3. That is, a CTE is not only a view-like object, and not only a scalar value, but also an alias for any expression! Here's another example:

WITH
  var_cte AS (SUM(a))
SELECT
  var_cte
FROM t;
Yes, you can use any function calls there, including aggregate function calls. And even this works:

WITH
  var_cte1 AS (a),
  var_cte2 AS (var_cte1+b)
SELECT
  var_cte2
FROM t;
It also works as a function argument:

WITH
  var_cte AS (a + b)
SELECT
  ROUND(var_cte/2, 0)
FROM t;
Are there any downsides? Unfortunately, yes… First of all, CTE macros refuse to work when you use them in UNION queries and in inline subqueries in FROM:

-- doesn't work!

WITH
  var_cte AS (a+b)
SELECT var_cte FROM t1
UNION
SELECT var_cte FROM t2
;

-- and here

WITH
  var_cte AS (a)
SELECT * FROM (
  SELECT var_cte FROM t
);
Maybe Snowflake engineers will finish this functionality, and it will become possible to use CTE macros everywhere.

And second, none of the data lineage tools will tell you about them. But the good news is that at dwh.dev we take CTE macros into account at compile time and display all relevant connections in lineage!

PS: I found out about this quite by accident from the last example in the documentation of the ENCRYPT_RAW function.
PPS: thumbs up on LinkedIn
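
Until then, one fallback for the UNION case is the classic SELECT-based CTE from the first example, joined into each branch instead of being read through a scalar subquery. Here is a minimal sketch of that pattern. It runs against SQLite through Python's built-in sqlite3 module, not against Snowflake, so it only shows that the plain-SQL form is valid; the tables t1 and t2 and their rows are hypothetical, and the same behavior on Snowflake is an assumption.

```python
import sqlite3

# Hypothetical tables and rows, for illustration only (SQLite, not Snowflake).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (vendor TEXT, a INTEGER, b INTEGER);
    CREATE TABLE t2 (vendor TEXT, a INTEGER, b INTEGER);
    INSERT INTO t1 VALUES ('Snowflake', 1, 2), ('Oracle', 10, 20);
    INSERT INTO t2 VALUES ('Bigquery', 3, 4), ('Snowflake', 1, 2);
""")

# The constants live in an ordinary CTE and are joined into each branch,
# so there is no scalar subquery and both UNION branches can see them.
query = """
WITH
  var_cte(vendor) AS (
    VALUES ('Snowflake'), ('Bigquery')
  )
SELECT t1.vendor AS vendor, t1.a + t1.b AS total
FROM t1 JOIN var_cte ON t1.vendor = var_cte.vendor
UNION
SELECT t2.vendor AS vendor, t2.a + t2.b AS total
FROM t2 JOIN var_cte ON t2.vendor = var_cte.vendor
ORDER BY 1
"""

rows = conn.execute(query).fetchall()
print(rows)  # [('Bigquery', 7), ('Snowflake', 3)]
```

The expression a + b has to be repeated in each branch, which is exactly the repetition the macro form would have removed; the join only replaces the constants part.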
👍 4 1
Lukas Eder (@lukaseder) on X

SQLite be like: "meh, good enough

It is the time to simplify Observability!

Instead of more data, we need actionable insights to identify the issue's root cause

Ryan Els (@RyanEls4) on X

Let me introduce you all to SQL 🤭

😁 4🌭 1
Tanmay Kulkarni | Data Engineer 🇮🇳 (@DataSuperNerd) on X

This is fantastic! 📊 @duckdb lets you run SQL queries directly on top of a pandas dataframe in memory. Pandas being so ubiquitous, this can be very handy in your transform pipelines.

👍 5🔥 2
Darren Baldwin — oss/acc (@DarrenBaldwin03) on X

I fixed it

😁 9👍 5🔥 1
2024 MAD (Machine Learning, AI & Data) Landscape https://mattturck.com/mad2024/ PDF: https://mattturck.com/landscape/mad2024.pdf
Full Steam Ahead: The 2024 MAD (Machine Learning, AI & Data) Landscape

This is our tenth annual landscape and “state of the union” of the data, analytics, machine learning and AI ecosystem. In 10+ years covering the space, things have never been as exciting and promising as they are today. All trends and subtrends we described over the years are coalescing: data ha

🔥 1🤔 1
42.parquet – A Zip Bomb for the Big Data Age

A 42 kB Parquet file can contain over 4 PB of data.

👍 3🔥 3