Uncategorized – Biebernomics

This won’t be a super organized post, but I wanted to put this out into the ether. While I’ve been on the job hunt, I’ve experimented with using ChatGPT to help me craft cover letters (example here).

I thought that using a relatively new technology would demonstrate creative thinking and maybe help me get my application to the top of the pile.

However, the ChatGPT output seems to be missing something, because I’m having better luck with cover letters I craft myself. I haven’t found it fruitful to collect data on this (frankly, my time is better spent writing cover letters and sending applications). It might be something I revisit down the road.

Still, I want to automate everything I can, and to that end I’ve created a Word template that uses fill-in fields to insert information into the letter automatically. This saves a bunch of time while still letting me individualize each letter for the job and company. You can read more about how to do this in MS Word here.
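For anyone who would rather script this than use Word, here is a minimal sketch of the same fill-in idea in Python. To be clear, this is not the Word template itself; the letter wording, the field names (hiring_manager, job_title, and so on), and the fill_letter helper are all made up for illustration.

```python
from string import Template

# Hypothetical letter skeleton; each $name is a fill-in field.
COVER_LETTER = Template(
    "Dear $hiring_manager,\n\n"
    "I am excited to apply for the $job_title role at $company. "
    "$custom_paragraph\n\n"
    "Sincerely,\n"
    "$applicant"
)


def fill_letter(fields):
    """Return the letter text with every fill-in field replaced.

    Template.substitute raises KeyError if a field is missing, so a
    half-filled letter never gets sent by accident.
    """
    return COVER_LETTER.substitute(fields)


# Example values are placeholders, not real application data.
letter = fill_letter(
    {
        "hiring_manager": "Hiring Manager",
        "job_title": "Data Engineer",
        "company": "Acme Corp",
        "custom_paragraph": "Your work on streaming pipelines caught my eye.",
        "applicant": "A. Applicant",
    }
)
print(letter)
```

The boilerplate stays fixed while the custom paragraph is still written by hand for each job, which is the part that seems to matter.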

It’s official: Databricks has released its English SDK for Apache Spark. In short, this tool looks like it makes writing Spark code as simple as writing plain English instructions.

I haven’t tested it yet, but I have been using ChatGPT for some time to help me write cover letters for job applications. I’m assuming that the fine folks at Databricks have released a product that will make a lot of data engineering tasks way too easy, and make it easier for folks without a coding background to create ETLs, or even whole data infrastructures, without the help of people experienced in writing code.

THIS IS A GOOD THING, and it highlights something important to be mindful of, whether you’re a data engineer, data scientist, or marketing executive.

Memorizing the use of a complex tool (like Spark or SQL) does not make you a good engineer. Knowing how to use a complex tool to get the results your boss or a stakeholder asks for doesn’t make you a good engineer either.

Understanding relationships, whether mathematical, cause/effect, logical, or interpersonal, makes you a good engineer.

What do I mean by this? Let’s use this new Spark tool as an example. When it comes to data engineering, the important part isn’t understanding how to write correct Spark code. It’s understanding how the data you’re manipulating is being collected, how it will be used, what it represents in the real world, and how it can be efficiently stored, retrieved, transformed, and related to other data in your infrastructure.
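To make that concrete, here is a small sketch in plain Python (not Spark, and with made-up tables) of code that runs without errors but gives the wrong answer because it ignores how the data relates. The orders and payments data and both function names are hypothetical; the assumption is that one order can have several payments.

```python
# Hypothetical data: one order can have several payments (one-to-many).
orders = [
    {"order_id": 1, "total": 100.0},
    {"order_id": 2, "total": 50.0},
]
payments = [
    {"order_id": 1, "amount": 60.0},
    {"order_id": 1, "amount": 40.0},
    {"order_id": 2, "amount": 50.0},
]


def naive_revenue(orders, payments):
    """Join orders to payments, then sum the order totals.

    This runs fine, but order 1 shows up once per payment,
    so its total is counted twice.
    """
    joined = [
        {**order, **payment}
        for order in orders
        for payment in payments
        if order["order_id"] == payment["order_id"]
    ]
    return sum(row["total"] for row in joined)


def revenue(orders):
    """Sum each order total exactly once, before any join."""
    return sum(order["total"] for order in orders)


print(naive_revenue(orders, payments))  # 250.0, inflated by the join
print(revenue(orders))                  # 150.0
```

Both functions are valid code. Only knowing that payments are one-to-many with orders tells you which number to trust, and that is the kind of knowledge no code generator hands you for free.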

This isn’t exactly a hot take. I’m sure something similar is being repeated on a thousand different blogs right now. But I wanted to weigh in because I also see a lot of doom-posting about the end of data engineering as a profession. On the contrary, tools like this (and Copilot, and ChatGPT, etc.) are just going to enable people to apply more computational power to their projects. Technical folks like myself will be necessary to help non-technical folks avoid mistakes, refine and optimize processes, and fine-tune plug-and-play LLMs for specific business uses. To me, this is the democratization of complex computation; folks with real expertise in analytics will still be necessary to act as subject matter experts. But we’ll have to spend less time memorizing syntax, and more time understanding what we’re trying to build.


Cover letters, chatGPT, and MS Word

GPT, English SDK, LLMs and the future