Uniphore Customer Portal

3. How are the documents enriched in X-Stream?

In X-Stream, document enrichment proceeds through five steps that make the ingested data more usable and more relevant to search:

  1. Canonicalization: Once documents are ingested from various sources, they are converted into a standardized JSON structure. This step ensures that the data is in a consistent format, making it easier to process and analyze.

  2. Metadata Enrichment: X-Stream automatically adds contextual, descriptive, and relational metadata to the documents. This involves tagging the documents with additional attributes that provide more context, such as the document's topic, key concepts, or the relationship between different pieces of information.

  3. Annotation by AI/ML Models: Advanced AI and machine learning models analyze the documents to identify and highlight significant entities, relationships, and concepts within the text. These annotations make the content easier to interpret and improve the accuracy of search and retrieval operations.

  4. Enrichment of Content: The unsupervised Large Language Model (LLM) further enriches the content by providing deeper insights, summaries, and connections within the data. This enrichment makes specific information easier to search for and helps keep the retrieved information relevant and accurate.

  5. Storage in Vector Form: The enriched content is then broken down into smaller chunks, which are stored in vector form. This vectorization enables advanced search capabilities, such as semantic similarity searches, where the system can retrieve relevant content based on the meaning of the query rather than just keyword matching.
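To make steps 1 and 5 more concrete, here is a minimal Python sketch of canonicalizing documents into a uniform structure, splitting them into chunks, storing each chunk as a vector, and retrieving by similarity. It is an illustration only, not X-Stream's implementation or API: the record layout and the function names (canonicalize, chunk, embed, build_index, search) are all hypothetical. The embed function is a toy word-count vector that only captures term overlap; a real system would presumably use a learned embedding model to match on meaning.

```python
import json
import math
import re
from collections import Counter

def canonicalize(raw_text, source):
    """Wrap raw text in a uniform record (hypothetical schema, assumed for illustration)."""
    return {"source": source, "content": " ".join(raw_text.split()), "metadata": {}}

def chunk(text, max_words=12):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text):
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(docs):
    """Chunk every canonical record and store each chunk with its vector."""
    index = []
    for doc in docs:
        for i, piece in enumerate(chunk(doc["content"])):
            index.append({"source": doc["source"], "chunk_id": i,
                          "text": piece, "vector": embed(piece)})
    return index

def search(index, query, top_k=1):
    """Return the top_k chunks ranked by similarity to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda e: cosine(q, e["vector"]), reverse=True)
    return ranked[:top_k]

# Example usage with two made-up documents from different sources.
docs = [
    canonicalize("Refunds are   processed within five business days.", "faq.txt"),
    canonicalize("Password resets require a verified email address.", "help.html"),
]
index = build_index(docs)
best = search(index, "how long do refunds take")[0]
print(json.dumps({"source": best["source"], "text": best["text"]}))
```

Keeping the source and chunk position alongside each vector lets a search result be traced back to the document it came from, which is one plausible reason to canonicalize before chunking.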

This entire enrichment process is designed to transform raw, unstructured data into a highly organized and context-rich knowledge base, making it easier for users to access and leverage the information they need.