� L�g�.� ���dZddlmZddlZddlZddlZddlZddlmZgd�Z ej d��Z ej d��Z dMd �Z e ��ZdNd �Zej d ��Ze��ZdOd�ZdPd�Ze��Zej d��Zej d��Zej d��Zied��d�ed��d�ed��d�ed��d�ed ��d!�ed"��d#�ed$��d%�ed&��d'�ed(��d)�ed*��d+�ed,��d-�ed.��d/�ed0��d1�ed2��d3�ed4��d5�ed6��d7�ed8��d9�ed:��d;ed<��d=ed>��d?ed@��dAedB��dCi�ZdQdE�Ze��ZdFdGdHdIdJdK�Zej dLjdRie��ej��ZdS)Szi This gives other modules access to the gritty details about characters and the encodings that use them. �)� annotationsN)�Dict) zlatin-1zsloppy-windows-1252zsloppy-windows-1251zsloppy-windows-1250zsloppy-windows-1253zsloppy-windows-1254z iso-8859-2�macroman�cp437u [ʼ‘-‛]u [“-‟]�return�Dict[str, re.Pattern[str]]c �$�dtjd��i}tD]q}tt t dd����dgz��}|�|��}d�|��}tj|��||<�r|S)a ENCODING_REGEXES contain reasonably fast ways to detect if we could represent a given string in a given encoding. The simplest one is the 'ascii' detector, which of course just determines if all characters are between U+0000 and U+007F. �asciiz^[-]*$���z^[--{0}]*$)�re�compile�CHARMAP_ENCODINGS�bytes�list�range�decode�format)�encoding_regexes�encoding� byte_range�charlist�regexs �]/home/asafur/pinokio/api/open-webui.git/app/env/lib/python3.11/site-packages/ftfy/chardata.py�_build_regexesr s��� ���,<�!=�!=�>��%� 7� 7���4��d�E� 2� 2�3�3�t�f�<�=�=� ��$�$�X�.�.��-�3�3�H�=�=��%'�Z��%6�%6���"�"� ���Dict[str, str]c�V�i}tjj���D]�\}}|�d��rh||d|z<||���krH|���}d|z}tj|��|kr|���||<��|S)N�;�&)�html�entities�html5�items�endswith�lower�upper�unescape)r#�name�char� name_upper� entity_uppers r�_build_html_entitiesr.>s����H��m�)�/�/�1�1� :� :� ��d� �=�=�� � � :�#'�H�S�4�Z� � �t�z�z�|�|�#�#�!�Z�Z�\�\� �"�Z�/� ��=��.�.�,�>�>�-1�Z�Z�\�\�H�\�*�� �Orz&#?[0-9A-Za-z]{1,24};�text�strr�boolc�\�tt|�|����S)z� Given text and a single-byte encoding, check whether that text could have been decoded from that single-byte encoding. In other words, check whether it can be encoded in that encoding, possibly sloppily. 
)r1�ENCODING_REGEXES�match)r/rs r�possible_encodingr5Vs&�� � ��*�0�0��6�6� 7� 7�7r�Dict[int, None]c ���i}tjtdd��dgtdd��dgtdd��d gtd d ����D]}d ||<�|S) z� Build a translate mapping that strips likely-unintended control characters. See :func:`ftfy.fixes.remove_control_chars` for a description of these codepoint ranges and why they should be removed. r� � �� �ij ip i��i��i��N)� itertools�chainr)� control_chars�is r�_build_control_char_mappingrAas|�� &(�M� �_� �d�D��� �� �d�D��� �� �f�f��� �� �f�f�����  �  �� � �a��� �rse[������][ ]|[��][ ][�-��-��-�]|[�-�][�-��-��-�][ ]|[�][ ][�-�][�-�]|[�][�-�][ ][�-�]|[�][�-�][�-�][ ]s�[�-�][]|[�-�][?]|�[�-�][?]�[�-�][?�-�]|�[�-�][?�-�]�[�-�][?]|[�-�][?][�-�]|[�-�][�-�][?]|[�-�][?][�-�][�-�]|[�-�][�-�][?][�-�]|[�-�][�-�][�-�][?]|z [\x80-\x9f]uIJ�IJuij�ijuʼnuʼnuDZ�DZuDz�Dzudz�dzuDŽuDŽuDžuDžudžudžuLJ�LJuLj�Ljulj�ljuNJ�NJuNj�Njunj�njuff�ffufi�fiufl�fluffi�ffiuffl�ffluſtuſtust�st�Dict[int, str]c��ddi}tdd��D]1}t|��}tjd|��}||kr|||<�2|S)zt Build a translate mapping that replaces halfwidth and fullwidth forms with their standard-width forms. i0� i�i���NFKC)r�chr� unicodedata� normalize)� width_mapr@r+� alternates r�_build_width_mapr\�s^���� �I� �6�6� "� "�%�%���1�v�v���)�&�$�7�7� � �� � �$�I�a�L�� �ru�ÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖרÙÚÛÜÝÞßĂĆČĎĐĘĚĞİĹŃŇŐŘŞŢŮŰΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩΪΫάέήίВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯupàáâãäåæçèéêëìíîïăćčďęěĺŕΰαβγδεζηθικλμνξοабвгдежзийклмнопuðóđğπσруu�€-¿ĄąĽľŁłŒœŚśŞşŠšŤťŸŹźŻżŽžƒˆˇ˘˛˜˝΄΅ΆΈΉΊΌΎΏЁЂЃЄЅІЇЈЉЊЋЌЎЏёђѓєѕіїјљњћќўџҐґ–—―‘’‚“”„†‡•…‰‹›€№™ u�€-¿ĄąĽľŁłŒœŚśŞşŠšŤťŸŹźŻżŽžƒˆˇ˘˛˜˝΄΅ΆΈΉΊΌΎΏЁЂЃЄЅІЇЈЉЊЋЌЎЏёђѓєѕіїјљњћќўџҐґ†‡•‰‹›€№™)�utf8_first_of_2�utf8_first_of_3�utf8_first_of_4�utf8_continuation�utf8_continuation_strictz� (?<! 
[{utf8_continuation_strict}]) ( [{utf8_first_of_2}] [{utf8_continuation}] | [{utf8_first_of_3}] [{utf8_continuation}]{{2}} | [{utf8_first_of_4}] [{utf8_continuation}]{{3}} )+ )rr)rr)r/r0rr0rr1)rr6)rrS�) �__doc__� __future__rr"r=rrX�typingrrr�SINGLE_QUOTE_RE�DOUBLE_QUOTE_RErr3r.�HTML_ENTITY_RE� HTML_ENTITIESr5rA� CONTROL_CHARS�ALTERED_UTF8_RE� LOSSY_UTF8_RE� C1_CONTROL_RE�ord� LIGATURESr\� WIDTH_MAP� UTF8_CLUESr�VERBOSE�UTF8_DETECTOR_RErbrr�<module>rts����� #�"�"�"�"�"� � � � ����� � � � ����������� � � ���"�*�4�5�5���"�*�.�/�/������6"�>�#�#������(���4�5�5��$�$�&�&� �8�8�8�8�����,,�+�-�-� �P�"�*�(����$�� �  � � � � �� �>�*�*� � ��C��I�I�t� ��C��I�I�t� ��C��I�I�u� ��C��I�I�t�  � �C��I�I�t�  � �C��I�I�t�  ��C��I�I�u� ��C��I�I�u� ��C��I�I�u� ��C��I�I�t� ��C��I�I�t� ��C��I�I�t� ��C��I�I�t� ��C��I�I�t� ��C��I�I�t� � �C��J�J��! �"�C��J�J��# �$�C��J�J���C��J�J���C��J�J���C��J�J���C��J�J��- � � �4����" � � � � � A�K�+� � &�1�� �`�2�:�  � � � � � � � ��J�����r