Should you be hiring AI Engineers?
Anvisha Pai
Co-founder @ Dover
October 17, 2023 • 3 min read
Before the LLM-fueled AI boom, there were clear and well-defined roles for AI/ML professionals.
Stanford Professor Chip Huyen summarized it elegantly in 2019:
In the Research realm, Research Scientists work on novel AI/ML research; and Research Engineers help them run experiments and validate results.
In the Applied realm, Data Scientists work on developing models; and ML Engineers deploy models into production, occasionally working with ML Infra/ML Platform Engineers.
When it came to shipping user-facing features that used these models, typically a Software Engineer, Full Stack Engineer or Product Engineer would hop in to help.
Note: For the purposes of this article I’ll use “Full Stack Engineer” as shorthand to refer to all 3.
In the old world, this worked well. Many AI/ML applications pre-2022 (like news feeds, auto-tagging your friends in photos, searching documents, and speech-to-text) had Data Scientists, ML Engineers and Full Stack Engineers working together to produce them.
The center of the diagram above is purposefully blank. There was no role at the intersection point of all these disciplines, and for good reason. Building a skillset in model development or deployment required significant effort and specialization, and was something that people would focus their entire career on.
When the GPT-3 API became generally available in Nov 2021, it drastically reduced the specialized skillsets needed to ship AI features to customers.
And then, in March 2023, something even more disruptive happened: for many use cases, GPT-4 eliminated the need for model development and model deployment entirely.
What had previously taken a Data Scientist to develop and an ML Engineer to deploy could now be completed by a Full Stack Engineer in a matter of weeks.
Here’s a quick example: At Dover, we had trained a classifier model in TensorFlow and deployed it to Google Cloud. When the GPT-3 API launched, we were able to quickly plug it into the classifier. OpenAI even had a fine-tuning API (which has now been re-released), so we were able to easily pass our data in to make the models better.
When GPT-4 came out, we didn’t need to fine-tune the model anymore and we could completely get rid of TensorFlow.
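To make that shift concrete, here's a minimal sketch of what swapping a trained classifier for a prompted LLM can look like. This is not Dover's actual code: the label set, the prompt wording, and the `complete` callable (a stand-in for whichever LLM API client you use) are all assumptions for illustration.

```python
from typing import Callable

# Hypothetical label set, for illustration only.
LABELS = ["engineering", "sales", "other"]


def build_prompt(text: str) -> str:
    """Build a classification prompt that asks for exactly one label."""
    options = ", ".join(LABELS)
    return (
        f"Classify the following text as one of: {options}.\n"
        "Reply with the label only.\n\n"
        f"Text: {text}"
    )


def classify(text: str, complete: Callable[[str], str]) -> str:
    """Classify text with an LLM; `complete` maps a prompt to a reply."""
    reply = complete(build_prompt(text)).strip().lower()
    # LLM output is free text, so fall back to a safe default when the
    # reply is not one of the expected labels.
    return reply if reply in LABELS else "other"


def fake_llm(prompt: str) -> str:
    """Stand-in for a real API call so the sketch runs offline."""
    text = prompt.rsplit("Text: ", 1)[-1].lower()
    return " Engineering\n" if "python" in text else "no idea"


print(classify("Senior Python developer", fake_llm))
print(classify("Account executive", fake_llm))
```

Notice that there's no training loop or model-serving code here. What's left is string handling and validating the model's reply, which is squarely Full Stack Engineer territory.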
We experienced the shift in required specialization firsthand – our generalist Full Stack Engineers were now shipping AI features on their own.
Let’s take a brief detour to consider the role of the Prompt Engineer. For a second, it seemed as if that was the new “it” role at companies building in, or on, AI.
Source: alexandr_wang and rowancheung on X
Anthropic’s job description in particular was telling:
They admitted: “Given that the field of prompt-engineering is arguably less than 2 years old, this position is a bit hard to hire for!”
Notably, the role didn’t require any Software Engineering experience.
For a second it seemed as if Prompt Engineers might replace the role of a Full Stack Engineer in shipping customer-facing AI applications.
However, the role doesn’t seem to have spread beyond a few orgs. There are <100 open roles with the title “Prompt Engineer”, vs 3500+ roles with the title “AI Engineer”.
More on this below, but the tl;dr is that today, companies still need to hire Full Stack Engineers to ship customer-facing applications, and we’re not (yet?) at the stage where Prompt Engineering is all you need.
And that’s where the “AI Engineer” comes in.
AI and Full Stack Engineering skills overlap heavily
With model development and model deployment becoming a “solved problem” for a broad swathe of businesses, that leaves us with the question — can a skilled Full Stack Engineer build AI-driven products? Do they need any further specialization to do this effectively?
Today's AI Engineering skills can be grouped into 3 buckets:
Prompt Engineering: see above, basically this involves prompting the LLM to get the inputs and outputs tuned just right and achieve your desired results.
RAG (Retrieval-augmented generation): Growing in popularity, this is when you retrieve facts from an external knowledge base and feed them into an LLM to do stuff like answer questions or search, often using embeddings.
Agents: An emerging area of focus, you can compose together LLMs and “tools” to handle multi-step workflows, such as a customer-support chatbot that can issue refunds for you.
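The RAG bucket above can be sketched in a few lines. This is a toy, not a production setup: the `DOCS` knowledge base, the helper names, and the word-count `embed` function are all assumptions for illustration. A real system would typically use a learned embedding model and a vector store instead.

```python
import math
import re
from collections import Counter
from typing import List

# Hypothetical knowledge base, for illustration only.
DOCS = [
    "Refunds are issued within 5 business days.",
    "Our office is closed on public holidays.",
    "Passwords can be reset from the login page.",
]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real system would use a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, docs: List[str], k: int = 1) -> List[str]:
    """Return the k documents most similar to the question."""
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_rag_prompt(question: str, docs: List[str], k: int = 1) -> str:
    """Put the retrieved facts in front of the question for the LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(question, docs, k))
    return f"Answer using only these facts:\n{context}\n\nQuestion: {question}"


print(build_rag_prompt("How long do refunds take?", DOCS))
```

The prompt that comes out is what you'd send to the LLM. Retrieval is the part that resembles Search, while the rest is plain string assembly.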
As you can see above, some of the most salient areas that make up “AI Engineering” actually just require Full Stack Engineering skills: basic system and API design, using third-party libraries, logic & problem solving etc.
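Agents are a good illustration: stripped down, the pattern is mostly a loop plus ordinary dispatch logic. Here's a minimal sketch under some loud assumptions. The JSON action format, the `issue_refund` tool, and the `scripted_llm` stand-in are all hypothetical, and a real agent would call an actual model and a real payments API.

```python
import json
from typing import Callable, Dict, List


def issue_refund(order_id: str) -> str:
    """Hypothetical tool; a real one would call a payments API."""
    return f"Refund issued for order {order_id}"


# Registry of tools the agent is allowed to call.
TOOLS: Dict[str, Callable[..., str]] = {"issue_refund": issue_refund}


def run_agent(
    user_message: str, llm: Callable[[List[str]], str], max_steps: int = 5
) -> str:
    """Ask the LLM for the next action, run any tool it requests, feed the
    result back, and stop when it returns a final answer."""
    transcript = [f"user: {user_message}"]
    for _ in range(max_steps):
        action = json.loads(llm(transcript))
        if action["type"] == "final":
            return action["text"]
        tool = TOOLS.get(action["tool"])
        if tool is None:
            result = f"error: unknown tool {action['tool']}"
        else:
            result = tool(**action["args"])
        transcript.append(f"tool: {result}")
    # Step limit reached without a final answer.
    return "Sorry, I could not complete that request."


def scripted_llm(transcript: List[str]) -> str:
    """Stand-in for a real model: requests a refund, then reports the result."""
    last = transcript[-1]
    if last.startswith("tool: "):
        return json.dumps({"type": "final", "text": last[len("tool: "):]})
    return json.dumps(
        {"type": "tool", "tool": "issue_refund", "args": {"order_id": "A123"}}
    )


print(run_agent("Please refund order A123", scripted_llm))
```

The step limit and the unknown-tool check are the kind of guardrails a Full Stack Engineer would add to any system that calls out to a third party.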
And this has been borne out: despite there being <1000 “AI Engineers” in existence, many companies with strong Full Stack Engineering talent were quickly able to ship engaging products using this new AI technology.
Take for example Vanta (a Dover customer) – they just launched Vanta AI to automate things like filling out compliance docs – without anyone working there with the explicit title of AI Engineer.
A possible exception here is RAG. Retrieval-augmented generation is very similar to Search as a domain. Today, Search Engineering tends to be a specialization within software, depending on the complexity of the use case. RAG could end up as a similar specialization, but it’s too early to tell.
So why hire AI Engineers?
Lots of companies have been able to ship compelling AI features without hiring specialized talent.
So, do you need to hire an AI Engineer before you can ship AI features? The answer to us is a clear no.
But, it might still make sense to hire an AI Engineer. Here’s why:
1. “Learning” is a skill and right now, AI Engineering is all about learning
I’ve talked to a ton of engineers who work in both big tech and at small startups, and it’s surprising how many are willfully ignorant about AI, or don’t use AI tools like GitHub Copilot or even ChatGPT.
The field is rapidly evolving. Last year, people weren’t talking about Agents or RAG. We don’t know what will come out in the next few months, let alone years, that will upend the state of the art and bring in new engineering challenges, frameworks and techniques.
Hiring engineers who are interested in, and capable of, staying on the bleeding edge is essential for any company trying to stay ahead.
Clearly outlining that in the role title will help you attract engineers who are truly interested in constantly learning about this new area.
As a small tip: in our phone screens (can be done by a recruiter) we ask this question:
How do you keep up with tech news and developments?
Then we select for candidates who actually keep abreast of advances in their field.
2. Branding helps to attract the right kind of Eng talent
At Dover, we’ve seen small tweaks to a role title (without any changes to the job description) lead to massive differences in candidate interest.
Good recruiting is like good marketing. An attractive role title and a compelling set of responsibilities are half the battle when it comes to hiring great talent.
Calling your open job req “AI Engineer” or “Software Engineer, AI” accomplishes that. It is an attractive title that catches someone’s eye. It also paints a compelling picture about the type of company and the type of role.
Things to keep in mind
If you’ve decided that you would indeed like to hire an AI Engineer, here are some things to keep in mind:
1. Be wary of candidates without engineering chops
Don’t hire someone who can’t code to ship product!
Even though you might be excited about that candidate with a PhD or a background in Data Science, make sure that you think critically about the actual role. How much model building do you need? If you’re relying on LLMs like GPT or Claude, you’ll be better served by a strong engineer eager to ship stuff.
2. Finally: Don’t be the person who requires “10 years of AI Engineering experience”
AI Engineering is an entirely new role and field that’s still emerging. Many of the techniques listed in this article didn’t exist until recently. Trying to source people with specific titles or skillsets will lead to you missing out on great talent.
Our recommendation is to optimize for gritty, self-motivated individuals with strong Full Stack fundamentals who are eager to take on the challenge of learning and defining a new field.
We’re excited to see where AI Engineering goes from here!