Langchain Neural Memory Retriever (langchain-nmret)

This project implements a custom LangChain retriever, NeuralMemoryRetriever, designed to integrate stateful neural memory capabilities with standard vector store retrieval and LLM-based reasoning compression. It leverages the titans-pytorch library for its core neural memory component.

Overview

The NeuralMemoryRetriever combines several components to provide a more sophisticated retrieval mechanism:

  1. Titans Neural Memory Wrapper (TitansNeuralMemoryWrapper): Manages an instance of titans-pytorch.NeuralMemory, handling its state and providing methods to update the memory with new sequences and retrieve abstract guidance vectors based on query embeddings.
  2. Vector Store Contextual Memory (VectorStoreContextualMemory): Uses a standard LangChain VectorStore (e.g., Chroma) to store and retrieve recent, concrete contextual information based on semantic similarity.
  3. LightThinker Compressor (LightThinkerCompressor): An optional component inspired by the LightThinker concept. It uses a provided LangChain BaseLanguageModel to summarize intermediate LLM thoughts or outputs, creating a compressed textual representation and its corresponding embedding.
  4. Multi-Step Retrieval Process: Executes a configurable number of reasoning steps (sketched after this list). Each step can involve:
    • Retrieving abstract guidance from the Titans Neural Memory.
    • Retrieving relevant recent context from the Vector Store Contextual Memory.
    • Generating an intermediate thought or refined query using a BaseLanguageModel.
    • Optionally compressing the LLM output using the LightThinkerCompressor.
    • Updating both the neural memory and contextual memory based on the step's activities.
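As an illustration only, the sketch below shows roughly what one such reasoning step looks like. The helper names used on the memory components (retrieve_guidance, retrieve_relevant, compress, update, add_texts) are hypothetical placeholders for this outline, not the package's documented API:

# Hypothetical outline of a single reasoning step; the method names on the
# memory components are placeholders, not the actual langchain-nmret API.
def reasoning_step(query, query_embedding, neural_memory, contextual_memory, llm, compressor=None):
    # 1. Abstract guidance from the Titans neural memory (a vector, not text).
    guidance_vector = neural_memory.retrieve_guidance(query_embedding)

    # 2. Recent, concrete context from the vector-store contextual memory.
    #    (Here guidance simply biases the lookup embedding; the real retriever
    #    may combine them differently.)
    search_embedding = query_embedding + guidance_vector
    context_docs = contextual_memory.retrieve_relevant(search_embedding, k=2)
    context_text = "\n".join(doc.page_content for doc in context_docs)

    # 3. Intermediate thought / refined query from the LLM.
    prompt = f"Context:\n{context_text}\n\nQuestion: {query}\nThought:"
    thought = llm.invoke(prompt)

    # 4. Optionally compress the thought (LightThinker-style summarisation);
    #    the real compressor also produces an embedding of the summary.
    if compressor is not None:
        thought = compressor.compress(thought)

    # 5. Update both memories with this step's output (the real wrapper works
    #    on embeddings of the text rather than raw strings).
    neural_memory.update([thought])
    contextual_memory.add_texts([thought])
    return thought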

Features

  • Integrates stateful, abstract neural memory (titans-pytorch) with standard vector retrieval.
  • Maintains recent context using a VectorStore.
  • Optional LLM-based compression of intermediate reasoning steps.
  • Configurable multi-step reasoning loop.
  • Flexible memory update strategies (e.g., update neural memory per step or only at the end).
  • Built on LangChain core interfaces (BaseRetriever, BaseLanguageModel, Embeddings, VectorStore).
  • Includes metadata sanitization for compatibility with vector stores like ChromaDB.
  • Robust handling of potential non-Document objects returned by vector stores.

Installation

This project is available on PyPI.

pip install langchain-nmret

Development Installation

  1. Clone the repository:

    git clone <your-repo-url>
    cd langchain-nmret
  2. Install dependencies: This project uses uv for package management. Ensure uv is installed (pip install uv).

    uv pip install -r requirements.txt # Or uv sync if using pyproject.toml dependencies directly

    Key dependencies include:

    • langchain-core
    • titans-pytorch (may require separate installation or a pinned version if it is not available on PyPI)
    • torch
    • numpy
    • A VectorStore implementation (e.g., langchain-chroma)
    • An Embeddings implementation (e.g., langchain-community, sentence-transformers)
    • A BaseLanguageModel implementation (e.g., langchain-openai, langchain-huggingface)

Usage Example

import time
import uuid

import numpy as np
import torch
from langchain_core.documents import Document
from langchain_core.embeddings import Embeddings
from langchain_core.language_models import BaseLanguageModel
from langchain_core.vectorstores import VectorStore

# --- Assume necessary imports from the package ---
from langchain_nmret import (
    NeuralMemoryRetriever,
    TitansNeuralMemoryWrapper,
    VectorStoreContextualMemory,
    LightThinkerCompressor,
)

# --- Mock/Dummy Components (Replace with actual implementations) ---
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.language_models.base import BaseLanguageModel  # For Dummy LLM type hint
from langchain_core.outputs import Generation, LLMResult
from langchain_core.callbacks import CallbackManagerForLLMRun, AsyncCallbackManagerForLLMRun, Callbacks
from langchain_core.prompt_values import PromptValue, StringPromptValue
from langchain_core.runnables import RunnableConfig


# Dummy LLM (Replace with OpenAI, HuggingFace, etc.)
class DummyRunnable(BaseLanguageModel):
    def _generate(
        self,
        prompts: list[str],
        stop: list[str] | None = None,
        run_managers: list[CallbackManagerForLLMRun] | None = None,
        **kwargs,
    ) -> LLMResult:
        generations = []
        for i, prompt in enumerate(prompts):
            text = f"Dummy response to: {prompt.split()[-1]}..."
            gen = [Generation(text=text)]
            generations.append(gen)
            if run_managers and run_managers[i]:
                run_managers[i].on_llm_end(LLMResult(generations=[gen]))
        return LLMResult(generations=generations)

    async def _agenerate(
        self,
        prompts: list[str],
        stop: list[str] | None = None,
        run_managers: list[AsyncCallbackManagerForLLMRun] | None = None,
        **kwargs,
    ) -> LLMResult:
        # Simplified async version for example
        return self._generate(prompts, stop, None, **kwargs)  # Non-async managers for simplicity here

    def generate_prompt(
        self,
        prompts: list[PromptValue],
        stop: list[str] | None = None,
        callbacks: Callbacks = None,
        **kwargs,
    ) -> LLMResult:
        prompt_strings = [str(p) for p in prompts]
        # Simplified manager handling for dummy
        return self._generate(prompt_strings, stop=stop, **kwargs)

    async def agenerate_prompt(
        self,
        prompts: list[PromptValue],
        stop: list[str] | None = None,
        callbacks: Callbacks = None,
        **kwargs,
    ) -> LLMResult:
        prompt_strings = [str(p) for p in prompts]
        # Simplified manager handling for dummy
        return await self._agenerate(prompt_strings, stop=stop, **kwargs)

    @property
    def _llm_type(self) -> str:
        return "dummy"


# --- Configuration ---
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
EMBEDDING_DIM = 384  # Example dimension for MiniLM

# 1. Embedding Model
embedding_model = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={"device": DEVICE},
)

# 2. Vector Store
vectorstore = Chroma(
    collection_name="nmret_readme_example",
    embedding_function=embedding_model,
    persist_directory="./chroma_db_readme_example",
)
# Add some initial data if needed
vectorstore.add_texts(
    ["Initial context document 1.", "Another piece of information."],
    metadatas=[{"source": "readme", "memory_id": f"readme_{i}"} for i in range(2)],
)

# 3. Titans Neural Memory Wrapper
titans_wrapper = TitansNeuralMemoryWrapper(
    embedding_dim=EMBEDDING_DIM,
    device=DEVICE,
    momentum=True,  # Control momentum usage
    # Pass other NeuralMemory args directly, e.g.:
    layers=1,
    heads=2,  # Ensure embedding_dim (384) is divisible by heads (2)
    chunk_size=128,
    use_accelerated_scan=True,  # Set to False if assoc-scan is not installed
)

# 4. Contextual Memory Wrapper
contextual_memory = VectorStoreContextualMemory(
    vectorstore=vectorstore,
    embedding_model=embedding_model,
)

# 5. LLM
llm = DummyRunnable()  # Replace with your actual LLM instance

# 6. LightThinker Compressor
compressor = LightThinkerCompressor(llm=llm, embedding_model=embedding_model)

# 7. Create the Retriever
retriever = NeuralMemoryRetriever(
    vectorstore=vectorstore,
    neural_memory=titans_wrapper,
    contextual_memory=contextual_memory,
    compressor=compressor,
    llm=llm,
    embedding_model=embedding_model,
    device=DEVICE,
    # Configuration
    reasoning_steps=2,             # Number of reasoning loops
    top_k_initial=3,               # K for initial vector store search
    top_k_contextual=2,            # K for contextual memory search per step
    compress_intermediate=True,    # Enable LLM thought compression
    update_memory_on_final=False,  # Update Titans memory after each step
    update_titans_with="docs_and_llm",  # Data source for Titans update
)

# --- Run a Query ---
query = "What is the main topic discussed?"
print(f"Invoking retriever with query: '{query}'")

results = retriever.invoke(query)

print("--- Retrieval Complete ---")
print(f"Final Documents Returned: {len(results)}")
for i, doc in enumerate(results):
    doc_id = doc.metadata.get("memory_id", "N/A")
    content_preview = doc.page_content[:100]
    print(f" - Doc {i}: ID: {doc_id}, Content: {content_preview}...")

# Example of saving/loading state (if needed)
# state = titans_wrapper.get_state()
# # ... save state ...
# # ... load state ...
# titans_wrapper.load_state(loaded_state)
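The commented get_state() / load_state() hooks at the end of the example suggest the Titans memory can be checkpointed between sessions. A minimal sketch of doing so, assuming the returned state is a torch-serializable object (e.g. a dict of tensors); adapt to the actual return type:

# Persist the neural memory state between runs (assumes get_state() returns a
# torch-serializable object; this is an assumption, not documented behaviour).
state = titans_wrapper.get_state()
torch.save(state, "titans_memory_state.pt")

# Later, in a fresh process, rebuild titans_wrapper with the same arguments and:
loaded_state = torch.load("titans_memory_state.pt", map_location=DEVICE)
titans_wrapper.load_state(loaded_state)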

Important Notes and Limitations

  • Metadata Sanitization: When adding documents via VectorStoreContextualMemory, metadata values are sanitized for compatibility with ChromaDB. Use simple data types (string, integer, float, boolean) for metadata values; lists and other complex types are converted to their string representations (see the sketch after this list).
  • Titans NeuralMemory Arguments: The TitansNeuralMemoryWrapper now accepts NeuralMemory arguments (like layers, heads, chunk_size, etc.) directly as keyword arguments (**kwargs) instead of through a separate dictionary.
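Continuing the usage example above, a minimal illustration of keeping metadata ChromaDB-friendly by flattening complex values yourself before they reach the store (otherwise the built-in sanitization will stringify them for you):

# ChromaDB metadata values must be str, int, float or bool, so flatten anything
# more complex (lists, dicts) explicitly rather than relying on stringification.
raw_metadata = {"source": "report.pdf", "page": 3, "tags": ["finance", "q3"]}

safe_metadata = {
    "source": raw_metadata["source"],         # str: kept as-is
    "page": raw_metadata["page"],             # int: kept as-is
    "tags": ", ".join(raw_metadata["tags"]),  # list flattened to a single string
}

vectorstore.add_texts(["Quarterly revenue grew 12%."], metadatas=[safe_metadata])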

Dependencies

Key Python packages:

  • langchain-core: For core LangChain abstractions.
  • titans-pytorch: The neural memory engine. (Ensure it's installed correctly)
  • torch: PyTorch library.
  • numpy: Numerical operations.
  • langchain-chroma / chromadb: Example vector store. (Or your chosen VectorStore package)
  • langchain-community / sentence-transformers: Example embeddings. (Or your chosen Embeddings package)
  • langchain-openai / langchain-anthropic / etc.: For the BaseLanguageModel used in the compressor and reasoning steps.
  • uv: For package management (optional, but used in pyproject.toml).
  • assoc-scan: Optional, for accelerated titans-pytorch operations.

See pyproject.toml and uv.lock for specific versions.
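For the usage example above, an unpinned install of the optional pieces might look like the line below; the exact package choices (Chroma for the vector store, sentence-transformers via langchain-community for embeddings) are examples, not requirements:

pip install langchain-nmret langchain-community langchain-chroma chromadb sentence-transformers torch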

License

Licensed under the Apache 2.0 License, as per the LICENSE file.
