Kyle

ai, business, finance, genAI, tax, technology

The Most Advanced o1 Model Still Needs a Professional Behind It.

On December 5th, OpenAI released its most advanced o1 model to the public, complete with benchmarks comparing its performance to the preview version and the pro-level model. The benchmarks cover competition-level math, advanced coding challenges, and PhD-level science questions, and they use a strict “4/4” standard: a question counts as solved only if the model answers it correctly in four out of four attempts. The published results show a significant leap in accuracy over its predecessor.

This is another big step forward in the reasoning capabilities of Large Language Models, and I’m hopeful it can tackle both the routine and complex challenges that tax professionals face regularly.

So, let’s put it to the test:

I revisited an example I previously wrote about when testing the o1-preview model: a straightforward cash movement exercise involving routine treasury operations.

The Prompt:

Before I lose you by using tax-specific jargon, here’s the issue in plain English:

A Canadian parent company has two subsidiaries: one in the U.S. and one in Belgium. The U.S. subsidiary collects money from U.S.-based customers but operates under strict rules where it can only keep a small profit; the rest of the cash belongs to the Canadian parent. The Belgian subsidiary doesn’t earn its own income and relies entirely on the Canadian parent to fund its operations.

The question is: What is the best way to move the cash collected by the U.S. subsidiary to the Belgian subsidiary, considering their roles and the parent company’s responsibility for funding?

My input:

“I need to move cash from a US sub to a Belgian Sub.

___

Context: Canadian Parent company named Canco is profitable.

US Sub is an LRD and collects cash from Canco’s US-based customers.

BE Sub is a cost-plus R&D entity that requires monthly funding for operations.

___

What options do I have? Please make a recommendation.”

The Outcome:

The AI’s response? Let’s just say it was less than ideal.

The AI had pieces of the answer scattered across its suggestions. It was almost like a treasure hunt, pulling bits of truth from different options while discarding irrelevant or problematic ideas.

The core issue was that the AI didn’t grasp the functional profiles of the entities before it reasoned through potential options.

Solution to the problem as posed:

The U.S. subsidiary collects cash from customers but, as a Limited Risk Distributor (LRD), has no rights to retain that cash. It operates under a fixed-margin model, where excess cash belongs to the Canadian parent. The Belgian subsidiary, on the other hand, has no direct relationship with the U.S. entity and relies entirely on funding from the parent company to cover its operations.

In this setup, the answer isn’t complex: cash should flow from the U.S. LRD to the parent company via transfer pricing mechanisms. The parent then funds the Belgian subsidiary through cost-plus payments for R&D services. Simple, right?

Simple if you’ve got years of experience operating in a multinational environment, but nearly impossible to solve if you don’t.
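To make that flow concrete, here is a minimal sketch of one month of cash movement in Python. Every number in it (the collections, the LRD's fixed margin, the Belgian cost base, and the cost-plus markup) is a hypothetical placeholder rather than a benchmarked transfer pricing result, and the sketch deliberately ignores the LRD's own operating costs, taxes, withholding, and FX.

```python
from dataclasses import dataclass


@dataclass
class MonthlyFlows:
    lrd_retained: float        # fixed return the US LRD keeps
    remitted_to_parent: float  # cash passed up to Canco
    be_funding: float          # cost-plus payment from Canco to BE Sub


def monthly_flows(us_collections, lrd_margin, be_costs, be_markup):
    """Route cash US Sub -> Canco -> BE Sub for a single month.

    lrd_margin and be_markup are hypothetical rates chosen for
    illustration; real rates come from a transfer pricing study.
    """
    # The LRD keeps only its fixed margin on what it collects.
    lrd_retained = us_collections * lrd_margin
    # Everything else belongs to the parent and is remitted upward.
    remitted_to_parent = us_collections - lrd_retained
    # The parent pays the Belgian entity its costs plus a markup.
    be_funding = be_costs * (1 + be_markup)
    return MonthlyFlows(lrd_retained, remitted_to_parent, be_funding)


# Hypothetical inputs, for illustration only.
flows = monthly_flows(
    us_collections=1_000_000,
    lrd_margin=0.03,
    be_costs=200_000,
    be_markup=0.08,
)
print(f"US Sub retains:   {flows.lrd_retained:,.0f}")
print(f"Remitted to Canco: {flows.remitted_to_parent:,.0f}")
print(f"Canco funds BE Sub: {flows.be_funding:,.0f}")
```

Notice that there is no direct US Sub to BE Sub payment anywhere in the function. Both legs run through Canco, which is the point the model missed.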

How can we improve?

Improvements to the Prompt

💡Greater Precision and Context: Providing additional context—such as clarifying that the U.S. subsidiary (US Sub) cannot retain cash beyond its fixed return—could have guided the model to the correct solution more efficiently. What seems obvious to a tax professional isn’t necessarily clear to the LLM.
💡Explicit Constraints: Including critical financial constraints upfront, such as restrictions on the BE subsidiary’s ability to take on debt, might have reduced unnecessary back-and-forth iterations.
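The two prompt improvements above can be sketched as a small Python helper that states each entity's functional profile and any constraints before it asks the question. The helper name `build_prompt`, the wording of the profiles, and the debt constraint are all illustrative assumptions, not the prompt I actually ran; the helper only builds text and does not call any model.

```python
def build_prompt(question, entities, constraints):
    """Assemble a prompt that puts functional profiles and constraints
    ahead of the question, so the model reads them first."""
    lines = ["Context (functional profile of each entity):"]
    for name, profile in entities.items():
        lines.append(f"- {name}: {profile}")
    lines.append("")
    lines.append("Constraints:")
    for constraint in constraints:
        lines.append(f"- {constraint}")
    lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Hypothetical wording, expanding the shorthand in the original prompt.
entities = {
    "Canco": "Canadian parent company; profitable; funds its subsidiaries.",
    "US Sub": (
        "Limited Risk Distributor (LRD); collects cash from Canco's "
        "US-based customers; cannot retain cash beyond its fixed return."
    ),
    "BE Sub": (
        "Cost-plus R&D entity; earns no income of its own; "
        "requires monthly funding for operations."
    ),
}
# Example constraint only; replace with the facts of your own structure.
constraints = ["BE Sub is restricted from taking on debt."]

prompt = build_prompt(
    "What options do I have to move cash collected by US Sub to BE Sub? "
    "Please make a recommendation.",
    entities,
    constraints,
)
print(prompt)
```

Keeping the profiles in a dictionary makes it easy to reuse the same context across follow-up questions instead of retyping it in each iteration.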

Improvements for the Model

🔍Better Contextualization: The model could improve by recognizing and applying relevant business rules more accurately. For instance, in this scenario, identifying the role of the U.S. subsidiary as a limited-function entity (LRD) and its restricted cash rights is critical. (You could also direct the model’s attention to this through your prompt.)
🧠Enhanced Recognition of Functional Roles: The model’s reasoning needs a stronger ability to recognize and correctly apply the roles of entities within a group structure as foundational knowledge before brainstorming options. (Strong prompts or iterations can also help here.)

While the o1 model offers valuable insights and accelerates workflows, it still requires professional oversight to fully address real-world scenarios. As the model evolves, integrating deeper contextual awareness will enhance its utility for tax and finance professionals navigating intricate multinational arrangements.

The final takeaway

Sure, the latest o1 model from OpenAI offers remarkable advancements in reasoning and efficiency; however, it is not a substitute for the expertise of tax and finance professionals. In moderately complicated scenarios like this, the model can provide very helpful starting points, but you must always apply your own oversight and experience to arrive at the answer best suited to your situation.

The key takeaway is clear: AI enhances workflows, but professionals remain indispensable for applying the nuanced judgment, industry knowledge, and practical oversight needed to navigate complex financial challenges. You are important, you are critical in your role, and you are not being replaced. But don’t let the comfort of job security keep you from actively engaging with these tools; they can make a world of difference in your professional life.

If you made it this far…thank you so much for your time and attention, much gratitude. 🙏
