Business Reporter

AI, Retrieval Augmented Generation and unstructured data

Phillip Miller at Progress Software explains how organisations can activate their unstructured data to unlock AI potential

 

Organisations today struggle to manage and store their spiralling amounts of data. It is estimated that as much as 80% of this data is unstructured, which includes many of their documentary assets, from PDFs and PowerPoints to emails, images and sensor outputs. The challenge in using this data is that structured databases cannot capture the value of its rich contextual detail.

 

For decades, enterprises have focused their analytics strategies on structured data, which can be stored neatly in tables and analysed in charts. Yet many of the best context-specific details that can produce meaningful business insights for Agentic AI are found in unstructured formats. This content is typically spread across many siloed servers, collaboration tools and archives. If enterprises are to draw meaningful business outcomes from this data, their approach to exploring it needs to change.

 

As AI is implemented at pace across all industries, the question of how to handle unstructured data has been brought into sharp focus. According to Grand View Research, the enterprise Agentic AI market is growing rapidly, with a projected compound annual growth rate (CAGR) of 46.2% from 2025 to 2030. Since AI tools are only as good as the data behind them, CIOs and data strategists are now scrambling to organise and activate these diverse data sources so that AI models can reach their full potential. Harnessing unstructured data is a necessity for unlocking next-level possibilities.

 

However, many tech teams don’t know where to start in making sense of this data, or how to trust it for AI projects. Their companies have amassed this content across countless file shares, email servers, collaboration tools and archives. To make drawing insights even harder, this siloed data remains unclassified, untagged and unconnected. These teams are aware that they must learn how to act on this unstructured data if they are to keep up with competitors in the era of AI.

 

One key method is Retrieval Augmented Generation (RAG), which can turn this data into knowledge that AI can draw on to enhance decision-making and drive innovation.

 

 

Why unstructured data matters

Understanding the context of raw information can inform actionable insights. For example, an email thread might reveal why a client chose to leave for a competitor; a PDF whitepaper might contain research findings that could trigger a new product line; a phone conversation transcript could highlight emerging customer needs. AI systems that can ingest data from these sources go beyond basic statistical analysis to deliver context-aware predictions and recommendations, often unveiling trends and insights not apparent in numbers alone.

 

In fact, the shift from ignoring unstructured data to leveraging it for AI can completely redefine an organisation’s competitiveness. As human attention spans shrink and data volumes explode, intelligent systems that parse language, detect intent and uncover hidden patterns can significantly improve decision-making and drive innovation. Whether it is uncovering new revenue opportunities, identifying operational inefficiencies or predicting market shifts, context-rich data is fast becoming a necessity rather than a luxury.

 

 

Challenges of using unstructured data for AI

Unstructured data requires more than just AI ingestion. Large language models need context, such as metadata and relationships that indicate how documents, images and conversations relate to an organisation’s broader knowledge framework. This could mean classifying a PDF based on the project it belongs to, labelling a conversation transcript with relevant keywords or connecting these assets to structured data, such as customer profiles or transaction logs.
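The kind of enrichment described above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the `Asset` record, the `projects/<name>/...` folder convention, the keyword list and the customer table are all assumptions invented for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical keyword list and customer table, invented for illustration.
KEYWORDS = {"pricing", "renewal", "outage", "onboarding"}
CUSTOMERS = {"acme": {"customer_id": "C-001", "segment": "enterprise"}}


@dataclass
class Asset:
    """An unstructured asset plus the context added around it."""
    path: str
    text: str
    project: Optional[str] = None
    keywords: List[str] = field(default_factory=list)
    customer_id: Optional[str] = None


def enrich(asset: Asset) -> Asset:
    # Classify by project, assuming a "projects/<name>/..." folder convention.
    parts = asset.path.split("/")
    if len(parts) > 2 and parts[0] == "projects":
        asset.project = parts[1]
    # Label the asset with any known keywords that appear in its text.
    words = {w.strip(".,;:!?").lower() for w in asset.text.split()}
    asset.keywords = sorted(KEYWORDS & words)
    # Link to structured data when a known customer name is mentioned.
    for name, profile in CUSTOMERS.items():
        if name in words:
            asset.customer_id = profile["customer_id"]
            break
    return asset
```

Calling `enrich(Asset(path="projects/atlas/call-notes.txt", text="Acme asked about renewal pricing."))` fills in the project, the matching keywords and the linked customer record. In practice the classification and linking rules would come from domain experts rather than hard-coded sets.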

 

Adding this context and labelling to unstructured data requires collaboration between engineers and AI specialists. Data governance is another issue entirely, as unstructured content can carry sensitive or proprietary information that must be handled with care.

 

Another challenge is organisational culture. Many teams are accustomed to dealing exclusively with structured data. When faced with unstructured formats, they often lack clear processes or tools. It requires thoughtful collaboration between domain experts, data engineers and AI specialists to define what matters and how each piece of content should be interpreted.

 

 

Turning unstructured data into knowledge

Transforming unstructured data into knowledge involves both technology and process. One emerging strategy is RAG, which pulls relevant content from unstructured sources and delivers it to generative AI models. While traditional monolithic systems need massive pre-labelled datasets, RAG architectures retrieve relevant documents and text snippets at query time, based on what the user asks, keeping AI responses current and factually grounded. This approach can significantly reduce the risk of hallucinations, where an AI model invents facts not backed by real data.
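The retrieve-then-generate flow can be shown with a small Python sketch. It rests on stated assumptions: the three snippets are invented, and simple word-overlap cosine similarity stands in for the embedding models and vector indexes a real deployment would typically use. The final call to a generative model is left out; the sketch stops at the grounded prompt.

```python
import math
import re
from collections import Counter

# Hypothetical snippets standing in for content pulled from documents.
DOCUMENTS = [
    "Client Acme cancelled its contract after repeated billing errors.",
    "The research whitepaper proposes a low-power sensor design.",
    "Call transcript: customers keep asking for offline mode.",
]


def tokens(text):
    """Lowercase word counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    """Return the k snippets most similar to the query."""
    q = tokens(query)
    ranked = sorted(docs, key=lambda d: cosine(q, tokens(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, docs, k=2):
    """Assemble a prompt that grounds the model in retrieved snippets."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )
```

For the question "Why did Acme cancel its contract?", the billing-errors snippet ranks first, so the model is asked to answer from that text rather than from memory alone.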

 

RAG combines generative AI with detailed, relevant data to deliver accurate, reliable and useful insights. It connects business data with generative AI models, adding specific context and meaning while identifying and reducing hallucinations in the AI’s response. This context often comes from taxonomies or ontologies, which help the AI understand the data.
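As a rough illustration of how a taxonomy can supply that context, the Python sketch below tags text with business concepts and widens a query with related terms before retrieval. The two-concept taxonomy and the function names are hypothetical, chosen only to make the idea concrete; a real taxonomy or ontology would be curated by domain experts and be far richer.

```python
# Hypothetical taxonomy: each concept maps to terms that signal it.
TAXONOMY = {
    "customer churn": {"cancelled", "left", "switched", "competitor"},
    "product feedback": {"feature", "request", "offline", "bug"},
}


def tag_concepts(text, taxonomy=TAXONOMY):
    """Return the taxonomy concepts whose terms appear in the text."""
    words = {w.strip(".,;:!?").lower() for w in text.split()}
    return sorted(c for c, terms in taxonomy.items() if terms & words)


def expand_query(query, taxonomy=TAXONOMY):
    """Append related terms for any concept named in the query."""
    lowered = query.lower()
    extra = set()
    for concept, terms in taxonomy.items():
        if concept in lowered:
            extra |= terms
    if not extra:
        return query
    return query + " " + " ".join(sorted(extra))
```

With this in place, an email saying a client "switched to a competitor" is tagged as customer churn, and a question about customer churn also searches for the words such emails actually use.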

 

 

Building trust in AI-generated insight

As organisations stand on the brink of the AI-driven future, the ability to unlock value from unstructured data is no longer a technical aspiration; it is a strategic imperative. RAG is not just another tool in the AI toolbox; it is a paradigm shift in how businesses transform dormant data assets into dynamic sources of insight and competitive advantage.

 

By bridging the gap between siloed, context-rich information and generative AI, RAG empowers enterprises to move beyond surface-level analytics toward truly context-aware, trustworthy intelligence.

 

In a world where knowledge evolves at breakneck speed and business environments are in constant flux, RAG enables AI to remain relevant, accurate and grounded in reality, retrieving the right information at the right time for the right decision-maker. This approach doesn’t just reduce hallucinations or improve accuracy; it fundamentally redefines how organisations can adapt, innovate and lead.

 

The future belongs to those who can turn their unstructured data from a liability into a living, learning asset, fuelling AI that is not only smarter, but also more responsible, resilient and aligned with real-world needs.

 


 

 Phillip Miller is an AI Strategist at Progress Software

 

Main image courtesy of iStockPhoto.com and agsandrew

© 2025, Lyonsdown Limited. Business Reporter® is a registered trademark of Lyonsdown Ltd. VAT registration number: 830519543