Ramprakash Ramamoorthy at Zoho predicts the rise of proprietary intelligence in the next generation of enterprise AI
As generative AI continues to be embedded in organisations, the conversation is shifting from experimentation with public models to a more urgent consideration: proprietary data fed into large language models (LLMs) may not remain proprietary. According to McKinsey's 'The State of AI' report, 78 per cent of organisations report using AI in at least one business function, but as LLMs become more deeply integrated into enterprise workflows, the question is no longer only how to use them, but whether any potentially sensitive data powering them is secure.
The technical capability of a model (its parameter count, training technique or architectural novelty) is only one part of the equation. In enterprise environments, the real differentiator is not the model itself but the relevance and utility of the data it is paired with. Without a privacy-first design, the value of proprietary data can quickly be overshadowed by the risk of losing it.
Language models function by predicting text based on learned patterns. In consumer or open-domain use cases, these patterns are often derived from a vast and generalised body of text. In business settings, success depends on interpreting and acting on precise, internally generated information without risk of data breaches. Ultimately, unless the LLM is built with a focus on privacy, once proprietary data enters the system, it may no longer be exclusively yours.
Embedding organisational knowledge in LLMs
Proprietary data carries with it an understanding of how an organisation thinks and operates. Embedding this data into model pipelines, whether through training, fine-tuning or retraining, can enable systems to reflect institutional workflows and decision-making that are often undocumented but critical to business execution.
For example, a language model integrated with CRM and sales data can create context-aware recommendations that are tailored to historical deal flows or customer behaviour. A model tuned on internal support tickets and knowledge base content can provide precise, pre-validated responses to common customer issues. Legal and finance teams can use models grounded in contract data or regulatory records to accelerate document review, summarisation, or risk assessments.
Without a privacy-first LLM, embedding this data risks unintentional exposure. If the model is hosted where data is pooled or outputs are not isolated, insights derived from one business's information could influence results for another. Processing data within a secure, private environment, without contributing it to any shared training corpus, is therefore essential.
Integrating proprietary data
There are multiple technical strategies for pairing proprietary data with LLMs. One of the most widely adopted is retrieval-augmented generation, where the model uses embeddings to pull relevant documents or data points at inference time. This allows the system to maintain a relatively lean base model while injecting real-time context from approved sources. For example, an AI assistant supporting customer service could retrieve relevant sections from internal troubleshooting guides or account histories before generating a response.
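The retrieval step described above can be sketched in a few lines. This is a minimal illustration only: it substitutes a bag-of-words similarity for learned embeddings, and the documents, function names and prompt format are all hypothetical.

```python
# Minimal RAG-style sketch: rank documents against a query, then inject
# the top matches into the prompt as context. A real system would use
# learned embeddings and a vector store; here a toy term-frequency
# vector stands in for the embedding.
import re
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy 'embedding': a term-frequency vector over lowercased words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Inject retrieved context into the prompt sent to the model."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = [
    "To reset a password, open Settings and choose Security.",
    "Invoices are generated on the first day of each month.",
    "Password resets require admin approval for shared accounts.",
]
print(build_prompt("How do I reset my password?", docs))
```

The key property, regardless of the similarity function used, is that only approved, access-controlled documents ever enter the context window; the base model itself is never retrained on them.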
Another approach involves fine-tuning a model on curated, domain-specific datasets. This technique creates a more deeply embedded understanding of organisational logic, vocabulary, and task framing.
Fine-tuning can be applied to small or medium-sized models, which are often more manageable in enterprise contexts, using parameter-efficient techniques such as LoRA or adapters. This not only improves performance on specific tasks but can also reduce latency and compute costs by tailoring the model more closely to its intended operational environment.
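The arithmetic behind LoRA's efficiency is easy to show. The sketch below, with illustrative dimensions, demonstrates the core idea: the frozen base weight matrix W is augmented by a low-rank product B·A, so only the two small matrices are trained.

```python
# LoRA-style low-rank update: effective weight = W + (alpha / r) * B @ A.
# W stays frozen; only A and B (far fewer parameters) are trained.
# All dimensions here are illustrative, not tied to any specific model.
import random

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

d_out, d_in, r = 64, 64, 4          # layer size vs. low-rank bottleneck
alpha = 8                            # LoRA scaling factor

W = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]  # frozen
B = [[0.0] * r for _ in range(d_out)]                                     # trained, zero-init
A = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(r)]      # trained

delta = matmul(B, A)                 # rank-r update
W_eff = [[W[i][j] + (alpha / r) * delta[i][j] for j in range(d_in)]
         for i in range(d_out)]      # what the layer actually applies

full_params = d_out * d_in           # parameters if W were tuned directly
lora_params = d_out * r + r * d_in   # parameters actually trained
print(f"full: {full_params}, LoRA: {lora_params}")
```

Because B is initialised to zero, the effective weights start identical to the base model, and training only ever touches the small matrices, which is what keeps latency and compute costs down.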
While effective, both methods can introduce risks. RAG requires airtight access controls to prevent unauthorised retrieval. Fine-tuning must take place in environments where proprietary datasets are never commingled with others. Once embedded in shared model weights, data is nearly impossible to remove, creating compliance challenges under regulations such as GDPR.
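One way to make the access-control requirement concrete is to enforce tenant isolation before retrieval ever ranks a document. The sketch below is an assumption-laden illustration: the `Document` shape, tenant identifiers and matching logic are all hypothetical, and the point is only where the isolation check sits.

```python
# Tenant-scoped retrieval sketch: documents are tagged with an owner,
# and the retrieval step only ever sees documents belonging to the
# requesting tenant, so one tenant's data can never influence another's
# results. Names and the Document shape are illustrative.
from dataclasses import dataclass

@dataclass(frozen=True)
class Document:
    tenant_id: str
    text: str

def retrieve_for_tenant(tenant_id: str, query: str,
                        corpus: list[Document]) -> list[Document]:
    # Enforce isolation *before* ranking, not after.
    allowed = [d for d in corpus if d.tenant_id == tenant_id]
    return [d for d in allowed if query.lower() in d.text.lower()]

corpus = [
    Document("acme", "Acme refund policy: refunds within 30 days."),
    Document("globex", "Globex refund policy: store credit only."),
]
hits = retrieve_for_tenant("acme", "refund", corpus)
print([d.text for d in hits])
```

Filtering at retrieval time is possible because RAG keeps data outside the model; once data is baked into shared fine-tuned weights, no equivalent per-request filter exists, which is why the GDPR erasure problem above is so hard.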
Contextual intelligence at scale
Proprietary data doesn’t just improve accuracy; it enables a new class of AI use cases that depend on high-context reasoning, including automated report generation, personalised employee assistance and operational forecasting. In each case, the model must not only understand language but also reflect the intent, constraints and logic specific to the organisation.
The right privacy-first LLM turns passive information into an active intelligence layer. It interprets, applies, and evolves proprietary data without leaking it, and can ensure that updates to business rules or compliance requirements are reflected immediately without benefiting outside parties.
For organisations with complex internal ecosystems and with multiple departments, legacy systems and evolving compliance requirements, proprietary data becomes the glue that allows models to reason coherently across silos. It supports the creation of task-specific agents, domain-specialised copilots, and even full digital workflows that are aware of historical context and able to make contextually valid decisions without compromising data.
The long-term impact of this approach is strategic. LLMs powered by privacy-protected data become part of the organisation’s infrastructure, extensions of core knowledge systems that evolve in parallel with the business itself. Rather than asking users to adapt to a generic AI assistant, proprietary data allows AI to adapt to the users, the environment, and the intent behind each task.
In 2024, global enterprise spending on generative AI reached $13.8 billion, more than six times the $2.3 billion invested in 2023, as businesses continue to integrate the technology into their operations.
As enterprise AI matures, success will hinge on both technical capability and trust. Proprietary data is not a side input; it is the foundation of meaningful, usable, and secure enterprise AI. Organisations that understand how to harness their unique datasets, structure them appropriately, and align them with model behaviour will be best positioned to realise the next generation of intelligent automation.
LLMs should be designed with this in mind.
Ramprakash Ramamoorthy is Head of AI Research at Zoho
Main image courtesy of iStockPhoto.com and BrianAJackson
© 2025, Lyonsdown Limited. Business Reporter® is a registered trademark of Lyonsdown Ltd. VAT registration number: 830519543