A Python slugify application that handles unicode.
Best attempt to create slugs from unicode strings while keeping it DRY.
By default, this module installs and uses text-unidecode (GPL & Perl Artistic) for its decoding needs.
However, there is an alternative decoding package called Unidecode (GPL). It can be installed as `python-slugify[unidecode]` for those who prefer it. Unidecode is believed to be more advanced.
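To see which of the two decoding packages is importable in a given environment, a small check like the following may help. This is only a sketch: `pick_decoder` is a hypothetical helper, and the preference order (Unidecode first, then text-unidecode) is an assumption made for illustration, not a statement about how python-slugify itself selects a decoder.

```python
import importlib


def pick_decoder(candidates=("unidecode", "text_unidecode")):
    """Return (module_name, unidecode_function) for the first importable candidate.

    Hypothetical helper; the candidate order is an assumption.
    """
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # not installed, try the next candidate
        return name, module.unidecode
    return None, None


name, decode = pick_decoder()
if decode is None:
    print("no decoding package installed")
else:
    print(name, decode("déjà"))
```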
| Python | Slugify |
|---|---|
| `>= 2.7 < 3.6` | `< 5.0.0` |
| `>= 3.6 < 3.7` | `>= 5.0.0 < 7.0.0` |
| `>= 3.7` | `>= 7.0.0` |
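The compatibility table can also be expressed as a lookup, for instance when generating a requirements file per interpreter. The sketch below simply encodes the three rows above; `slugify_requirement` is a hypothetical name, and the returned strings use pip's comma-separated specifier syntax.

```python
import sys


def slugify_requirement(python_version):
    """Map a (major, minor) tuple to a python-slugify requirement string.

    Hypothetical helper that mirrors the compatibility table row by row.
    """
    if (2, 7) <= python_version < (3, 6):
        return "python-slugify<5.0.0"
    if (3, 6) <= python_version < (3, 7):
        return "python-slugify>=5.0.0,<7.0.0"
    if python_version >= (3, 7):
        return "python-slugify>=7.0.0"
    raise ValueError("no python-slugify release listed for %r" % (python_version,))


# Requirement for the interpreter running this script.
print(slugify_requirement(tuple(sys.version_info[:2])))
```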
```bash
pip install python-slugify

# OR

pip install python-slugify[unidecode]
```

```python
def slugify(
    text: str,
    entities: bool = True,
    decimal: bool = True,
    hexadecimal: bool = True,
    max_length: int = 0,
    word_boundary: bool = False,
    separator: str = DEFAULT_SEPARATOR,
    save_order: bool = False,
    stopwords: Iterable[str] = (),
    regex_pattern: str | None = None,
    lowercase: bool = True,
    replacements: Iterable[Iterable[str]] = (),
    allow_unicode: bool = False,
) -> str:
    """
    Make a slug from the given text.
    :param text (str): initial text
    :param entities (bool): converts html entities to unicode (foo &amp; bar -> foo-bar)
    :param decimal (bool): converts html decimal to unicode (&#381; -> Ž -> z)
    :param hexadecimal (bool): converts html hexadecimal to unicode (&#x17D; -> Ž -> z)
    :param max_length (int): output string length
    :param word_boundary (bool): truncates to end of full words (length may be shorter than max_length)
    :param save_order (bool): when set, does not include shorter subsequent words even if they fit
    :param separator (str): separator between words
    :param stopwords (iterable): words to discount
    :param regex_pattern (str): regex pattern for disallowed characters
    :param lowercase (bool): activate case sensitivity by setting it to False
    :param replacements (iterable): list of replacement rules e.g. [['|', 'or'], ['%', 'percent']]
    :param allow_unicode (bool): allow unicode characters
    :return (str): slugify text
    """
```

```python
from slugify import slugify

txt = "This is a test ---"
r = slugify(txt)
assert r == "this-is-a-test"

txt = '影師嗎'
r = slugify(txt)
assert r == "ying-shi-ma"

txt = '影師嗎'
r = slugify(txt, allow_unicode=True)
assert r == "影師嗎"

txt = "C'est déjà l'été."
r = slugify(txt)
assert r == "c-est-deja-l-ete"

txt = 'Nín hǎo. Wǒ shì zhōng guó rén'
r = slugify(txt)
assert r == "nin-hao-wo-shi-zhong-guo-ren"

txt = 'Компьютер'
r = slugify(txt)
assert r == "kompiuter"

# max_length truncates the slug
txt = 'jaja---lol-méméméoo--a'
r = slugify(txt, max_length=9)
assert r == "jaja-lol"

# word_boundary keeps only whole words within max_length
txt = 'jaja---lol-méméméoo--a'
r = slugify(txt, max_length=15, word_boundary=True)
assert r == "jaja-lol-a"

txt = 'jaja---lol-méméméoo--a'
r = slugify(txt, max_length=20, word_boundary=True, separator=".")
assert r == "jaja.lol.mememeoo.a"

# save_order=False may skip a long word and still include a later, shorter one
txt = 'one two three four'
r = slugify(txt, max_length=12, word_boundary=True, save_order=False)
assert r == "one-two-four"

txt = 'one two three four'
r = slugify(txt, max_length=12, word_boundary=True, save_order=True)
assert r == "one-two"

txt = 'the quick brown fox jumps over the lazy dog'
r = slugify(txt, stopwords=['the'])
assert r == 'quick-brown-fox-jumps-over-lazy-dog'

txt = 'the quick brown fox jumps over the lazy dog in a hurry'
r = slugify(txt, stopwords=['the', 'in', 'a', 'hurry'])
assert r == 'quick-brown-fox-jumps-over-lazy-dog'

# with lowercase=False, stopwords are matched case-sensitively
txt = 'thIs Has a stopword Stopword'
r = slugify(txt, stopwords=['Stopword'], lowercase=False)
assert r == 'thIs-Has-a-stopword'

txt = "___This is a test___"
regex_pattern = r'[^-a-z0-9_]+'
r = slugify(txt, regex_pattern=regex_pattern)
assert r == "___this-is-a-test___"

txt = "___This is a test___"
regex_pattern = r'[^-a-z0-9_]+'
r = slugify(txt, separator='_', regex_pattern=regex_pattern)
assert r != "_this_is_a_test_"

txt = '10 | 20 %'
r = slugify(txt, replacements=[['|', 'or'], ['%', 'percent']])
assert r == "10-or-20-percent"

txt = 'ÜBER Über German Umlaut'
r = slugify(txt, replacements=[['Ü', 'UE'], ['ü', 'ue']])
assert r == "ueber-ueber-german-umlaut"

txt = 'i love 🦄'
r = slugify(txt, allow_unicode=True)
assert r == "i-love"

txt = 'i love 🦄'
r = slugify(txt, allow_unicode=True, regex_pattern=r'[^🦄]+')
assert r == "🦄"
```

For more examples, have a look at the test.py file.
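To make the roles of `lowercase`, `stopwords`, `max_length` and `separator` more concrete, here is a deliberately simplified sketch that uses only the standard library. It is not python-slugify's implementation: `naive_slugify` is a hypothetical helper, it relies on NFKD normalisation instead of a decoding package (so it strips accents but cannot transliterate CJK or Cyrillic text), and its `max_length` handling is a plain cut with no `word_boundary` logic.

```python
import re
import unicodedata


def naive_slugify(text, separator="-", lowercase=True, stopwords=(), max_length=0):
    """Simplified, stdlib-only illustration of slug generation (hypothetical helper)."""
    # Decompose accented characters, then drop everything that is not ASCII.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    if lowercase:
        text = text.lower()
    # Collapse each run of disallowed characters into a single hyphen.
    text = re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-")
    # Remove stopwords; compare in lowercase only when the text was lowercased.
    stop = {w.lower() for w in stopwords} if lowercase else set(stopwords)
    words = [w for w in text.split("-") if w and w not in stop]
    slug = "-".join(words)
    # A plain cut; trailing hyphens left by the cut are trimmed.
    if max_length > 0:
        slug = slug[:max_length].strip("-")
    return slug.replace("-", separator)


print(naive_slugify("C'est déjà l'été."))
print(naive_slugify("jaja---lol-méméméoo--a", max_length=9))
```

For accented Latin text the sketch matches the corresponding examples above; anything beyond that (entities, replacements, `allow_unicode`, word-boundary truncation) needs the real `slugify` function.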
With the package, a command line tool called slugify is also installed.
It allows convenient command line access to all the features the slugify function supports. Call it with -h for help.
The command can take its input directly on the command line or from STDIN (when the --stdin flag is passed):
```bash
$ echo "Taking input from STDIN" | slugify --stdin
taking-input-from-stdin
$ slugify taking input from the command line
taking-input-from-the-command-line
```

Please note that when a multi-valued option such as `--stopwords` or `--replacements` is passed, you need to use `--` as separator before you start with the input:

```bash
$ slugify --stopwords the in a hurry -- the quick brown fox jumps over the lazy dog in a hurry
quick-brown-fox-jumps-over-lazy-dog
```

To run the tests against the current environment:

```bash
python test.py
```

Please read the wiki page prior to raising any PRs.
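The need for `--` after a multi-valued option on the command line can be reproduced with a few lines of `argparse`. The parser below is a hypothetical stand-in written for this illustration, not the slugify tool's actual argument parser; it only assumes that the input is a positional argument and that `--stopwords` accepts one or more values.

```python
import argparse


def build_parser():
    # Hypothetical parser for illustration; not the slugify tool's real one.
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("input_string", nargs="*", help="text to slugify")
    parser.add_argument("--stopwords", nargs="+", default=[], help="words to drop")
    return parser


parser = build_parser()

# Without "--", the multi-valued option also swallows the input words.
greedy = parser.parse_args(["--stopwords", "the", "in", "quick", "fox"])
print(greedy.stopwords)  # ['the', 'in', 'quick', 'fox']

# With "--", option values end there and the rest is treated as input.
split = parser.parse_args(["--stopwords", "the", "in", "--", "quick", "fox"])
print(split.stopwords, split.input_string)  # ['the', 'in'] ['quick', 'fox']
```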
Released under the MIT license.
Though the dependencies may be GPL licensed, python-slugify itself is not considered a derivative work and will remain under the MIT license.
If you wish to avoid installation of any GPL licensed packages, please note that the default dependency text-unidecode explicitly lets you choose to use the Artistic License instead. Use without concern.
X.Y.Z Version
`MAJOR` version -- when you make incompatible API changes, `MINOR` version -- when you add functionality in a backwards-compatible manner, and `PATCH` version -- when you make backwards-compatible bug fixes.
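As a small illustration of the X.Y.Z scheme, the sketch below bumps one component of a version string and resets the components to its right, as semantic versioning prescribes. `bump` is a hypothetical helper for illustration and is not part of python-slugify.

```python
def bump(version, part):
    """Return `version` with the given part ("major", "minor" or "patch") incremented."""
    major, minor, patch = (int(n) for n in version.split("."))
    if part == "major":
        # Incompatible API change: reset MINOR and PATCH.
        return "%d.0.0" % (major + 1)
    if part == "minor":
        # Backwards-compatible functionality: reset PATCH.
        return "%d.%d.0" % (major, minor + 1)
    if part == "patch":
        # Backwards-compatible bug fix.
        return "%d.%d.%d" % (major, minor, patch + 1)
    raise ValueError("part must be 'major', 'minor' or 'patch'")


print(bump("6.1.2", "major"))  # 7.0.0
```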