
Mastering AI prompt engineering with James Cranwell


At 5app, we believe that the power of AI lies not just in the technology itself, but in how you guide it. Over the past few months, I’ve been diving deep into the world of prompt engineering. That’s meant testing, refining and pushing the limits to unlock the full potential of Helix, our new AI skills intelligence platform.

From surfacing hidden skill trends to generating insights that drive action, the way we prompt Helix makes all the difference. I’ve been inundated with questions about how we’ve mastered prompt engineering to build Helix, so I wanted to share what I’ve learned so far, the challenges we’ve faced and how smart prompting is shaping the future of skills at work.

 

What is prompt engineering?

Put simply, prompt engineering is about designing and refining the ‘prompts’ you use to get what you want from your AI tools. A well-crafted prompt will help you get a better response or more useful output, and prompt engineering is one of the most useful AI skills for L&D professionals right now.

Prompt engineering scares a lot of L&D professionals – wrangling with technology isn’t always our favourite thing to do! But despite the techy leanings, prompt engineering is much more closely aligned with UX writing than coding. Prompts are written in natural language, combined with some logic, so you don’t need to be able to code to write a great prompt. 

 

How does Helix make use of prompt engineering?

5app's Helix AI skills intelligence platform skill feedback shown on a tablet

5app’s 3Q Engine is built on prompts that identify skills, how they’re used and how frequently they show up within a meeting transcript. It analyses each transcript for verbal indicators (phrases or specific words), which tell Helix how well you’re demonstrating a target skill.

Of course, I can’t give away all of our secrets, but rest assured that we’ve spent months researching, testing and refining our Helix prompts to get the right results. The success of Helix as a product is heavily dependent on the quality of the prompt engineering underpinning it, which is why we’re committed to getting the prompts exactly right.

We know that these prompts will evolve over time, whether that’s because we’re adding new soft skills, improving the feedback you receive or keeping up with evolving GPT models. And when those model architectures change, our prompts will need to be reworked – not everything is transferable! That’s the beauty of prompt engineering – you’re never tied to a single way of doing things!

 

The practicalities of prompt engineering

If you’ve ever played around with AI tools like ChatGPT or Claude, you’ll know that they don’t always give you the result you were expecting. Sometimes they hallucinate (make things up), confabulate (produce plausible but inaccurate answers) or interpret your request differently to how it was intended.

One thing I found incredibly useful from the very beginning of our Helix experiments was keeping a ‘prompt diary’. It lets me spot AI quirks and misunderstandings in batches rather than as one-offs, so I can solve problems faster. I rely on the log of previous prompts to identify and remove issues. It can sometimes feel like playing whack-a-mole, as fixing one issue can inadvertently create a whole new one, which is why keeping a prompt diary is so important.
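
A prompt diary can be as simple as an append-only log. As a minimal sketch (the file name and fields here are illustrative, not how Helix actually stores its logs), each entry records the prompt, the response and your observations, so you can review quirks in batches later:

```python
import json
from datetime import datetime, timezone

DIARY_PATH = "prompt_diary.jsonl"  # illustrative file name

def log_prompt(prompt: str, response: str, notes: str = "") -> dict:
    """Append one prompt/response pair, plus observations, to a JSONL diary."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "response": response,
        "notes": notes,  # quirks, misunderstandings, ideas for the next iteration
    }
    with open(DIARY_PATH, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```

Because each line is a self-contained JSON object, you can grep or filter the diary later to spot recurring misunderstandings across many prompt versions.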

I approach prompt engineering like design thinking, so my prompt diary helps me keep track of drafting, testing, observing and iterating throughout the process. Mistakes early in a prompt can have significant effects downstream (like the butterfly effect!), so having a single source of truth to refer back to is a must for any prompt engineering project.

 

Overcoming prompting challenges

The Helix prompting is extremely complex, involving multiple stages spanning skills identification, scoring and suggesting next steps. Perhaps unsurprisingly, we’ve run into our fair share of challenges along the way!

 

Managing AI token usage

A major challenge for the 5app team has been ensuring that Helix identifies all skills usage within transcripts without pruning responses or missing examples due to the inherent context limitations of the foundational AI models. Initially, internal token limits within AI tools meant that Helix was selecting which examples to include for analysis, giving an incomplete insight into the data.

We had to get even more creative with the prompting and AI technology to ensure that everything was captured – specifically, that meant including imperatives like ‘must’ and ‘never’ to ensure all instructions are processed in full.
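
To make that concrete, here’s a hedged sketch of what an imperative-heavy prompt might look like (the function name and wording are my illustration, not an actual Helix prompt):

```python
def build_skill_prompt(skill: str, transcript_chunk: str) -> str:
    """Illustrative prompt using hard imperatives ('MUST', 'NEVER') so the
    model doesn't prune or sample the matches to save tokens."""
    return (
        f"You MUST quote every phrase in the transcript below where the "
        f"speaker demonstrates the skill '{skill}'.\n"
        "You MUST include all matches, even if they seem repetitive.\n"
        "NEVER summarise, truncate, or select only a subset of the matches.\n\n"
        f"Transcript:\n{transcript_chunk}"
    )
```

The imperatives turn “please find examples” into a non-negotiable instruction, which in our experience reduces the chance of the model quietly dropping examples.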

 

The ambiguity tax

Another challenge I’ve encountered several times relates to the concept of the ‘ambiguity tax’ in AI. If a prompt can be interpreted in multiple ways, the model will often pick an interpretation you didn’t intend. For instance, if I ask the AI tool to ‘list three takeaways from this response’, it will often provide five bullet points with the top three in bold, as the model interpreted ‘three’ as a minimum rather than an exact number.

Over the months, I’ve learned the nuances of the AI model, and now have a much better idea of how to write a precise, ‘lawyer-like’ prompt that will get me the exact response I want (most of the time!).

 

Specification gaming

Specification gaming refers to an interesting quirk where AI models follow rules, but exploit loopholes in their own processes to save tokens. For instance, I found that our AI model would often quote generic assessment criteria rather than providing tailored responses for specific skill usage examples. 

AI models default to finding the most efficient way to do things, but a product like Helix requires specific, personalised responses, which aren’t the easiest responses for the AI model to generate. To beat specification gaming at its own… well, game… I discovered that you need to be extremely specific in your prompting to prevent the model from circumventing the guardrails. 
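
One way to close that loophole is to demand output the model can only produce by engaging with the specific input. As a rough sketch (again, the wording is illustrative rather than a real Helix prompt):

```python
def anti_gaming_prompt(skill: str) -> str:
    """Illustrative prompt that forces tailored responses, so the model
    can't fall back on quoting generic assessment criteria."""
    return (
        f"For each place the speaker demonstrates '{skill}':\n"
        "1. You MUST quote the exact phrase from the transcript.\n"
        "2. You MUST explain, in one sentence, why that specific phrase "
        "demonstrates the skill, referencing words from the quote.\n"
        "NEVER restate generic assessment criteria in place of a "
        "phrase-specific explanation."
    )
```

Requiring the explanation to reference words from the quoted phrase makes the ‘cheap’ generic answer a rule violation, so the efficient path and the correct path become the same path.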

 

Top tips for improving your AI prompt engineering

From the months I’ve spent working on Helix and improving my own prompt engineering skills, I’ve identified three tips that will help you improve the way you engineer AI prompts:

  • Use ‘pinning nouns’
    Using vague terms like ‘it’ or ‘they’ in an AI prompt is just an invitation for pure AI chaos. AI likes specific prompts, so remove any possibility of doubt by using ‘pinning nouns’ which refer to the specific target, rather than ambiguous terms which could refer to multiple variables.


    An example of a vague prompt might be:

“Quote where it shows accountability and explain how they demonstrate responsibility.”

This creates confusion: does ‘it’ mean the phrase, the turn, or the transcript? Does ‘they’ mean the speaker, the team, or multiple skills?

This ambiguity could cause Helix to mis-tag skill usage, or worse, miss valid evidence.

Compare this with an example of an actual Helix prompt with pinning nouns:

“Quote where the speaker demonstrates accountability by using explicit ownership phrases, and explain how the speaker’s wording reflects responsibility in that transcript turn.”

Now there’s no doubt that:

  • speaker = the target speaker (per the 'speakerEmail' filter)
  • accountability = the named skill under assessment
  • speaker’s wording = the phrase captured
  • transcript turn = the text field that holds the context

  • Choose positive over negative wording
    Negative language, like ‘not’, increases the risk of misinterpretation for AI models. Instead of saying ‘Don’t do this’, tell the model what it should do. The easier you make it for the AI model to understand you, the more likely you are to get the results you want.
  • Set up tests with expected outcomes
    Before you put a major change live, run tests with ‘golden inputs’ and ‘golden outputs’. In other words, set up a test scenario where you predict exactly what response you should get from a specific prompt. If your response matches up with what you were expecting, you can be more confident that your prompt is sound. It also helps you catch any regressions or issues more easily than simply eyeballing outputs and hoping for the best.

 

Prompt engineering is a continuous process

5app's Helix AI skills intelligence platform shown on three devices

Just because the first version of Helix is now live (and in early access for a short while longer), that doesn’t mean we’re resting on our laurels! 

We’ll never stop working to further optimise the prompt engineering that sits under Helix – especially as new, more efficient, more advanced AI models are released. Helix’s prompts will always be a work in progress, and we’re committed to constantly adapting to keep up with the latest developments in AI. This includes keeping an eye out for ‘alignment drift’, where new AI models can potentially break previous solutions, requiring re-prompting and the ability to pivot at speed.

Want to see the results of our prompt engineering to date? Helix is still in early access, so sign up now to get your free month and find out more about your own soft skills.

 

Get your free month of Helix

We'd love to help you discover your soft skills! Sign up to our early access programme to get started with your free month – absolutely no credit card details required.


 

 
