"""
Hashing function for dataset keys using `hashlib.md5`

Requirements for the hash function:

- Provides a uniformly distributed hash from random space
- Adequately fast speed
- Working with multiple input types (in this case, `str`, `int` or `bytes`)
- Should be platform independent (generates same hash on different OS and systems)

The hashing function provides a unique 128-bit integer hash of the key provided.

The split name is being used here as the hash salt to avoid having same hashes
in different splits due to same keys
"""

from typing import Union

from huggingface_hub.utils import insecure_hashlib


def _as_bytes(hash_data: Union[str, int, bytes]) -> bytes:
    """
    Returns the input hash_data in its bytes form

    Args:
    hash_data: the hash salt/key to be converted to bytes
    """
    if isinstance(hash_data, bytes):
        # Already bytes: return unchanged
        return hash_data
    elif isinstance(hash_data, str):
        # Normalize backslashes to forward slashes before encoding
        hash_data = hash_data.replace("\\", "/")
    elif isinstance(hash_data, int):
        hash_data = str(hash_data)
    else:
        # Any other type is not a valid key
        raise InvalidKeyError(hash_data)

    return hash_data.encode("utf-8")


class InvalidKeyError(Exception):
    """Raises an error when given key is of invalid datatype."""

    def __init__(self, hash_data):
        self.prefix = "\nFAILURE TO GENERATE DATASET: Invalid key type detected"
        self.err_msg = f"\nFound Key {hash_data} of type {type(hash_data)}"
        self.suffix = "\nKeys should be either str, int or bytes type"
        super().__init__(f"{self.prefix}{self.err_msg}{self.suffix}")


class DuplicatedKeysError(Exception):
    """Raise an error when duplicate key found."""

    def __init__(self, key, duplicate_key_indices, fix_msg=""):
        self.key = key
        self.duplicate_key_indices = duplicate_key_indices
        self.fix_msg = fix_msg
        self.prefix = "Found multiple examples generated with the same key"
        if len(duplicate_key_indices) <= 20:
            self.err_msg = f"\nThe examples at index {', '.join(duplicate_key_indices)} have the key {key}"
        else:
            self.err_msg = f"\nThe examples at index {', '.join(duplicate_key_indices[:20])}... ({len(duplicate_key_indices) - 20} more) have the key {key}"
        self.suffix = "\n" + fix_msg if fix_msg else ""
        super().__init__(f"{self.prefix}{self.err_msg}{self.suffix}")


class KeyHasher:
    """KeyHasher class for providing hash using md5"""

    def __init__(self, hash_salt: str):
        self._split_md5 = insecure_hashlib.md5(_as_bytes(hash_salt))

    def hash(self, key: Union[str, int, bytes]) -> int:
        """Returns 128-bits unique hash of input key

        Args:
        key: the input key to be hashed (should be str, int or bytes)

        Returns: 128-bit int hash key"""
        # Copy the salted state so each key is hashed independently
        md5 = self._split_md5.copy()
        byte_key = _as_bytes(key)
        md5.update(byte_key)
        # Interpret the hexadecimal digest as an integer
        return int(md5.hexdigest(), 16)