A Dify plugin that converts HTML content to clean, formatted Markdown, with six selectable conversion methods and fallbacks when a conversion fails.
This plugin has been successfully tested and is ready for deployment to Dify.
- Multiple Conversion Methods: Choose from six HTML-to-Markdown conversion methods
- High-Quality Output: Produces clean, well-formatted Markdown
- Robust Error Handling: Graceful fallbacks when conversion fails
- Configurable: Select the best conversion method for your use case
- Production Ready: Comprehensive testing and validation
- trafilatura (default) - Excellent for web content and articles
- markdownify - Clean, semantic conversion
- html2text - Simple and reliable
- pypandoc - Academic-grade conversion using Pandoc
- beautifulsoup - Custom conversion with BeautifulSoup
- simple - Basic fallback method
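To give a sense of what a basic fallback method can look like, here is a minimal sketch built only on the Python standard library. It is an illustration, not the plugin's actual `simple` implementation: the class name `SimpleMarkdownParser` and the function `simple_html_to_markdown` are hypothetical, and only headings, paragraphs, bold, italics, and list items are handled.

```python
from html.parser import HTMLParser

HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class SimpleMarkdownParser(HTMLParser):
    """Hypothetical stand-in for a basic fallback converter (stdlib only)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            # The heading level is the digit in the tag name (h2 -> "## ").
            self.parts.append("#" * int(tag[1]) + " ")
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag in ("em", "i"):
            self.parts.append("*")
        elif tag == "li":
            self.parts.append("- ")

    def handle_endtag(self, tag):
        if tag in ("strong", "b"):
            self.parts.append("**")
        elif tag in ("em", "i"):
            self.parts.append("*")
        elif tag in HEADINGS or tag == "p":
            # Block-level elements end with a blank line.
            self.parts.append("\n\n")
        elif tag == "li":
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def simple_html_to_markdown(html: str) -> str:
    """Convert a small subset of HTML to Markdown; unknown tags are dropped."""
    parser = SimpleMarkdownParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

A converter like this loses links, tables, and nested structure, which is why it only makes sense as a last resort behind the library-based methods.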
Install Dify CLI:
brew tap langgenius/dify
brew install dify
Install Dependencies:
make install
Build the plugin package:
make dify-package
Test locally with Dify CLI:
./dify-cli-latest plugin run ./dist/html_to_markdown.difypkg --enable-logs
Run comprehensive tests:
python3 test_simple.py
✅ Plugin Loading: Successfully loads and responds to commands
✅ Tool Registration: html_markdown_converter tool properly registered
✅ Dependencies: All required libraries available
✅ Error Handling: Robust error handling and fallbacks
Sample test output:
🧪 Testing Plugin Loading
==================================================
Plugin responses:
info:{'info': 'loading plugin'}
plugin_ready:{'info': 'plugin loaded'}
✅ Plugin loaded and responded

Build the plugin (if not already done):
make dify-package
Upload to Dify:
- Open your Dify instance
- Go to Plugins section
- Upload ./dist/html_to_markdown.difypkg
- Enable the plugin
Use in Workflows:
- Add the "HTML to Markdown" tool to your workflow
- Configure the conversion method
- Pass HTML content as input
Tool: HTML to Markdown
Provider: html_markdown_converter
Parameters:
  html_content: "<h1>Hello World</h1><p>This is <strong>bold</strong> text.</p>"
  conversion_method: "trafilatura"

# Hello World

This is **bold** text.

- html_content (required): The HTML content to convert
- conversion_method (optional): Choose from:
  - trafilatura (default)
  - markdownify
  - html2text
  - pypandoc
  - beautifulsoup
  - simple
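The parameter rules above can be sketched as a small validation helper. This is an assumption about how such checks might be written, not the plugin's actual code: the function name `normalize_parameters` and the error messages are hypothetical, while the parameter names, the default, and the allowed values come from the list above.

```python
ALLOWED_METHODS = (
    "trafilatura",
    "markdownify",
    "html2text",
    "pypandoc",
    "beautifulsoup",
    "simple",
)
DEFAULT_METHOD = "trafilatura"


def normalize_parameters(tool_parameters: dict) -> dict:
    """Validate the two tool parameters and apply the documented default.

    Hypothetical helper: the real tool may validate differently.
    """
    html_content = tool_parameters.get("html_content")
    if not isinstance(html_content, str) or not html_content.strip():
        raise ValueError("html_content is required and must be a non-empty string")

    # A missing or empty conversion_method falls back to the default.
    method = tool_parameters.get("conversion_method") or DEFAULT_METHOD
    if method not in ALLOWED_METHODS:
        raise ValueError(
            f"unknown conversion_method {method!r}; expected one of {ALLOWED_METHODS}"
        )

    return {"html_content": html_content, "conversion_method": method}
```

Rejecting an unknown method early gives the workflow author a clear message instead of a failure deep inside a conversion library.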
- Plugin won't load: Check that all dependencies are installed
- Conversion fails: Try a different conversion method
- Poor output quality: Use trafilatura or pypandoc for better results
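When the plugin won't load, one quick check is to see which conversion libraries can be imported in the current environment. The following sketch uses only the standard library; the function names are hypothetical, and the mapping assumes the usual import names of the packages (notably, `beautifulsoup4` is imported as `bs4`). Because pypandoc wraps the `pandoc` program, the sketch also looks for that binary on the PATH.

```python
import shutil
from importlib.util import find_spec

# Conversion method -> importable module name (beautifulsoup4 installs as "bs4").
BACKEND_MODULES = {
    "trafilatura": "trafilatura",
    "markdownify": "markdownify",
    "html2text": "html2text",
    "pypandoc": "pypandoc",
    "beautifulsoup": "bs4",
}


def check_backends(modules=BACKEND_MODULES) -> dict:
    """Report which backend modules are importable, without importing them."""
    return {method: find_spec(module) is not None for method, module in modules.items()}


def pandoc_available() -> bool:
    """Return True if a pandoc executable is found on the PATH."""
    return shutil.which("pandoc") is not None


if __name__ == "__main__":
    for method, installed in check_backends().items():
        print(f"{method}: {'ok' if installed else 'MISSING'}")
    print(f"pandoc binary: {'ok' if pandoc_available() else 'MISSING'}")
```

Any method reported as missing can be installed manually before rebuilding the package.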
If you encounter dependency issues, install them manually:
pip install trafilatura markdownify html2text pypandoc beautifulsoup4

For macOS with pandoc:
brew install pandoc

dify-html-to-markdown-plugin/
├── manifest.yaml                  # Plugin manifest
├── main.py                        # Plugin entry point
├── provider/                      # Provider configuration
│   └── html_markdown_converter.yaml
├── tools/                         # Tool implementation
│   └── html_to_markdown.py
├── requirements.txt               # Python dependencies
├── Makefile                       # Build automation
└── dist/                          # Built packages
    └── html_to_markdown.difypkg

The plugin uses a comprehensive Makefile for development:
make help          # Show all available commands
make install       # Install dependencies
make validate      # Validate configuration
make dify-package  # Build .difypkg file
make clean         # Clean build artifacts

Multiple test approaches are available:
- Direct tool testing: python3 test_simple.py
- Plugin daemon testing: Using Dify CLI
- Integration testing: With actual Dify instance
The plugin uses a streamlined architecture that follows Dify's plugin framework:
- main.py: Minimal entry point that lets the framework handle initialization
- Tool Implementation: Clean, focused tool class with comprehensive error handling
- Multiple Backends: Six different conversion methods for maximum reliability
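The idea of several backends with graceful fallback can be sketched as a small dispatch loop. This is a hedged illustration rather than the plugin's real logic: the function `convert_with_fallback`, the shape of the `converters` mapping, and the fallback order are all assumptions, since the order in which the plugin retries methods is not documented here.

```python
from typing import Callable, Dict, Iterable, Tuple

Converter = Callable[[str], str]


def convert_with_fallback(
    html: str,
    preferred: str,
    converters: Dict[str, Converter],
    fallback_order: Iterable[str],
) -> Tuple[str, str]:
    """Try the preferred method first, then each fallback in order.

    Returns (method_used, markdown). A method counts as failed if it is
    not registered, raises an exception, or returns empty output.
    """
    order = [preferred] + [m for m in fallback_order if m != preferred]
    errors = []
    for method in order:
        converter = converters.get(method)
        if converter is None:
            errors.append(f"{method}: not available")
            continue
        try:
            result = converter(html)
        except Exception as exc:  # any backend failure moves on to the next one
            errors.append(f"{method}: {exc}")
            continue
        if result and result.strip():
            return method, result
        errors.append(f"{method}: empty output")
    raise RuntimeError("all conversion methods failed: " + "; ".join(errors))
```

Returning the name of the method that actually produced the output makes it easier to log when a fallback was used.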
- ✅ Simplified main.py without complex Plugin/DifyPluginEnv creation
- ✅ Robust error handling with fallback methods
- ✅ Comprehensive logging for debugging
- ✅ Multiple conversion options for different use cases
- ✅ Production-ready configuration
- v0.1.0: Initial working version with multiple conversion methods
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly with make test
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
For issues or questions:
- Check the troubleshooting section
- Run tests to verify the issue
- Open a GitHub issue with test results
🎉 Your HTML to Markdown plugin is ready for production use!