llm-components is a Python library designed to feed large language models with data from different sources.
The library converts that content into structured markdown.
Code base formatting:
- Traverse a directory tree while respecting `.gitignore` rules.
- Format the directory structure and file contents into a readable markdown format.
- Clone a git repository and format its contents.

Web page formatting:
- Convert web pages to markdown format.
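
The code base formatting steps listed above can be sketched with the standard library alone. This is an illustrative sketch, not the implementation used by llm-components: the function names (`load_patterns`, `is_ignored`, `codebase_to_markdown`) and the output layout are hypothetical, and the pattern matching uses `fnmatch`, which covers only a simplified subset of `.gitignore` semantics (no negation, anchoring, or `**` handling).

```python
import fnmatch
import os
from pathlib import Path

FENCE = "`" * 3  # built at runtime so this example nests cleanly in markdown


def load_patterns(root: Path) -> list[str]:
    """Read ignore patterns from root/.gitignore, always ignoring .git itself."""
    patterns = [".git"]
    gitignore = root / ".gitignore"
    if gitignore.is_file():
        for line in gitignore.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            if line and not line.startswith("#"):
                # Drop the trailing slash of directory patterns such as "build/".
                patterns.append(line.rstrip("/"))
    return patterns


def is_ignored(rel_path: str, patterns: list[str]) -> bool:
    """Simplified check: match the whole relative path or its final component."""
    name = rel_path.rsplit("/", 1)[-1]
    return any(
        fnmatch.fnmatch(rel_path, pat) or fnmatch.fnmatch(name, pat)
        for pat in patterns
    )


def codebase_to_markdown(root: Path) -> str:
    """Walk root, skip ignored entries, and render a tree plus file contents."""
    patterns = load_patterns(root)
    tree, sections = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = Path(dirpath).relative_to(root).as_posix()
        prefix = "" if rel_dir == "." else rel_dir + "/"
        # Prune ignored directories in place so os.walk does not descend into them.
        dirnames[:] = sorted(
            d for d in dirnames if not is_ignored(prefix + d, patterns)
        )
        for name in sorted(filenames):
            rel = prefix + name
            if is_ignored(rel, patterns):
                continue
            tree.append(f"- {rel}")
            text = (root / rel).read_text(encoding="utf-8", errors="replace")
            sections.append(f"## {rel}\n\n{FENCE}\n{text}\n{FENCE}")
    header = "# Directory structure\n\n" + "\n".join(tree)
    return header + "\n\n" + "\n\n".join(sections)
```

Calling `codebase_to_markdown(Path("/path/to/your/code/base"))` returns a single string with a file listing followed by one section per file. Pruning `dirnames` in place keeps the walk from entering ignored directories, which matters for large folders such as virtual environments.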
To install the llm-components library, you can use pip:

```shell
pip install llm-components
```

You can use the command-line interface to map a code base to markdown. The CLI takes the root directory of the code base or a git repository URL as an argument.

```shell
llm-components format-codebase <root_dir_or_repo>
```

You can also convert a web page to markdown using the CLI:

```shell
llm-components web-to-markdown <url>
```

For example:

```shell
# For a local directory
llm-components format-codebase /path/to/your/code/base

# For a git repository
llm-components format-codebase https://github.com/your/repo.git

# For a web page
llm-components web-to-markdown https://advanced-stack.com
```

This will output the directory structure and file contents in a structured markdown format.
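
If the markdown needs to end up in a file as part of a script, the CLI can be driven from Python. The sketch below is a hedged example: it assumes the CLI writes its result to standard output, and the helper names `build_command` and `save_markdown` are hypothetical, not part of llm-components. Only the two subcommands shown above are used.

```python
import subprocess
from pathlib import Path


def build_command(target: str, web: bool = False) -> list[str]:
    """Build the CLI invocation for a code base (default) or a web page."""
    subcommand = "web-to-markdown" if web else "format-codebase"
    return ["llm-components", subcommand, target]


def save_markdown(target: str, out_file: Path, web: bool = False) -> None:
    """Run the CLI and write whatever it prints to out_file.

    Assumption: the generated markdown is printed to stdout.
    check=True raises CalledProcessError if the command exits with an error.
    """
    completed = subprocess.run(
        build_command(target, web),
        capture_output=True,
        text=True,
        check=True,
    )
    out_file.write_text(completed.stdout, encoding="utf-8")
```

For instance, `save_markdown("/path/to/your/code/base", Path("codebase.md"))` would store the formatted code base in `codebase.md`. Passing the command as a list avoids shell quoting problems with paths that contain spaces.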
You can also use the library programmatically by importing the necessary functions.

```python
import tempfile
from pathlib import Path

from llm_components.loaders.code_base import map_codebase_to_text
from llm_components.loaders.git_utils import clone_repository
from llm_components.loaders.web_to_markdown import retrieve_and_convert

# For a local directory
root_dir = Path("/path/to/your/code/base")
result = map_codebase_to_text(root_dir)
print(result)

# For a git repository
repo_url = "https://github.com/your/repo.git"
with tempfile.TemporaryDirectory() as temp_dir:
    clone_dir = Path(temp_dir) / "repo"
    clone_repository(repo_url, clone_dir)
    result = map_codebase_to_text(clone_dir)
    print(result)

# For a web page
url = "https://example.com"
markdown_content = retrieve_and_convert(url)
print(markdown_content)
```

This project is licensed under the MIT License.