� ���g�$� �&�ddlZddlZddlmZddlmZddlmZddlm Z m Z m Z ddl Z ddl mZddlmZdd lmZdd lmZmZdd lmZdd lmZee��ZGd �de j��Zdedee eeffd�Z Gd�de!ee!ee ff��Z"idg�dg�dg�dg�dg�dg�dg�dg�dg�dg�dg�dg�d g�d!g�d"g�d#g�d$g�id%g�d&g�d'g�d(g�d)g�d*g�d+g�d,g�d-g�d.g�d/g�d0g�d1g�d2g�d3g�d4g�d5g��gggd6��Z#dS)7�N)�Counter)�groupby)� itemgetter)�Any�ClassVar�Optional)�DatasetCardData�)�METADATA_CONFIGS_FIELD)�Features)� DatasetInfo�DatasetInfosDict)� _split_re)� get_loggerc�&��eZdZd�Zd�fd� Z�xZS)�_NoDuplicateSafeLoaderc�����fd�|jD��}d�|D��}t|����fd��D��}|rtd|�����dS)Nc�0��g|]\}}�j|��S�)�constructed_objects)�.0�key_node�_�selfs ��g/home/asafur/pinokio/api/open-webui.git/app/env/lib/python3.11/site-packages/datasets/utils/metadata.py� <listcomp>zS_NoDuplicateSafeLoader._check_no_duplicates_on_constructed_node.<locals>.<listcomp>s%���Q�Q�Q�{�x���(��2�Q�Q�Q�c�Z�g|](}t|t��rt|��n|��)Sr)� isinstance�list�tuple)r�keys rrzS_NoDuplicateSafeLoader._check_no_duplicates_on_constructed_node.<locals>.<listcomp>s1��M�M�M��j��d�3�3�<��c� � � ��M�M�Mrc�,��g|]}�|dk�|��S)�r)rr"�counters �rrzS_NoDuplicateSafeLoader._check_no_duplicates_on_constructed_node.<locals>.<listcomp>s'���E�E�E�#�G�C�L�1�4D�4D�#�4D�4D�4DrzGot duplicate yaml keys: )�valuer� TypeError)r�node�keys�duplicate_keysr%s` @r�(_check_no_duplicates_on_constructed_nodez?_NoDuplicateSafeLoader._check_no_duplicates_on_constructed_nodes�����Q�Q�Q�Q�d�j�Q�Q�Q��M�M��M�M�M���$�-�-��E�E�E�E��E�E�E�� � J��H��H�H�I�I� I� J� JrFc�x��t���||���}|�|��|S)N)�deep)�super�construct_mappingr+)rr(r-�mapping� __class__s �rr/z(_NoDuplicateSafeLoader.construct_mappings8����'�'�+�+�D�t�+�<�<�� �5�5�d�;�;�;��r)F)�__name__� __module__� __qualname__r+r/� __classcell__)r1s@rrrsO�������J�J�J����������rr�readme_content�returnc�d�t|�����}|rw|ddkrkd|dd�vr_|dd��d��dz}d�|d|���}|d�||dzd���fSdd�|��fS)Nrz---r$� )r � splitlines�index�join)r6� full_content�sep_idx� yamlblocks r�_split_yaml_from_readmer@$s�����1�1�3�3�4�4�L��A� 
�Q��5�0�0�U�l�1�2�2�>N�5N�5N��q�r�r�"�(�(��/�/�!�3���I�I�l�1�W�9�5�6�6� ��$�)�)�L��1����$?�@�@�@�@� ����<�(�(� (�(rc ���eZdZUdZeZeeed<e de fd���Z e dede e eefdeddfd ���Ze d eddfd ���Zd edd fd �Zdeefd�Zd S)�MetadataConfigsz5Should be in format {config_name: {**config_params}}.� FIELD_NAME�metadata_configc�D�|�d��}|��tjd|�d���}t|tt f��st |���t|t��r�|D]�}t|t tf��r{t|t��rut|��dkrSd|vrOtj t|d��r/t|�d��t tf��st |�����dSdSdS)N� data_filesz� Expected data_files in YAML to be either a string or a list of strings or a list of dicts with two keys: 'split' and 'path', but got a� Examples of data_files in YAML: data_files: data.csv data_files: data/*.png data_files: - part0/* - part1/* data_files: - split: train path: train/* - split: test path: test/* data_files: - split: train path: - train/part1/* - train/part2/* - split: test path: test/* PS: some symbols like dashes '-' are not allowed in split names r �split�path) �get�textwrap�dedentrr �str� ValueError�dict�len�re�matchr)rD�yaml_data_files�yaml_error_message�yaml_data_files_items r�$_raise_if_data_files_field_not_validz4MetadataConfigs._raise_if_data_files_field_not_valid3sK��)�-�-�l�;�;�� � &�!)���O^����"�"� �>�o��c�{�;�;� 5� �!3�4�4�4��/�4�0�0� =�,;� =� =�(�&�';�c�4�[�I�I� =�%�&:�D�A�A� =� � 4�5�5��:�:� '�+?� ?� ?� "���4H��4Q� R� R�!@� *�+?�+C�+C�F�+K�+K�c�SW�[� Y� Y�!@� )�);�<�<�<��] '� &�D =� =� =� =r�parquet_commit_hash�exported_parquet_files� dataset_infosr7c�������fd�t|td����D����r �fd�����D���|���S)Nc ����i|]f\}}|�fd�t|td����D��t��|t ����jpd��d���gS)c�8��g|]\}}|�fd�|D��d���S)c�H��g|]}|d�d�����S)�urlzrefs%2Fconvert%2Fparquet)�replace)r� parquet_filerVs �rrzhMetadataConfigs._from_exported_parquet_files_and_dataset_infos.<locals>.<dictcomp>.<listcomp>.<listcomp>rs@���!�!�!� ,�)��/�7�7�8R�Tg�h�h�!�!�!r)rGrHr)r� split_name�parquet_files_for_splitrVs �rrz]MetadataConfigs._from_exported_parquet_files_and_dataset_infos.<locals>.<dictcomp>.<listcomp>os`��� � � �<� �$;� ",�!�!�!�!�0G�!�!�!��� � � 
rrGz0.0.0�rF�version)rrrLrIr rc)r� config_name�parquet_files_for_configrXrVs ��r� <dictcomp>zRMetadataConfigs._from_exported_parquet_files_and_dataset_infos.<locals>.<dictcomp>ms���� � � �6� �5� � � � � �@G�G_�ak�ls�at�at�?u�?u� � � ��}�0�0��k�m�m�L�L�T�_�X_�`�`� � � � � r�configc�`���i|])\�}���fd�|jD����dd���*S)c�N��g|]!}��dD]}|d|k�|���"S)rFrGr)rr`� data_filerd�metadata_configss ��rrz]MetadataConfigs._from_exported_parquet_files_and_dataset_infos.<locals>.<dictcomp>.<listcomp>�sT���#�#�#�&�)9�+�)F�|�)T�#�#�&�$�W�-��;�;�"�<�;�;�;rrcrb)�splits)r� dataset_infordrks @�rrfzRMetadataConfigs._from_exported_parquet_files_and_dataset_infos.<locals>.<dictcomp>sw���� � � �.�K���#�#�#�#�#�*6�*=�#�#�#� 0� �<�Y�G��� � � r)rr�items)�clsrVrWrXrks ` `@r�._from_exported_parquet_files_and_dataset_infosz>MetadataConfigs._from_exported_parquet_files_and_dataset_infosfs������ � � � � �:A�AW�Yc�dl�Ym�Ym�9n�9n� � � �� � � � � � �2?�1D�1D�1F�1F� � � � ��s�#�$�$�$r�dataset_card_datac�V��|�|j��r�||j}t|t��st d|j�d|�d����|D].}d|vrt d|�d����|�|���/|�fd�|D����S|��S)Nz Expected z to be a list, but got '�'rdzUEach config must include `config_name` field with a string name of a config, but got z. 
c���i|]J}|���x����d��d�����D����KS)rdc�N�i|]"\}}||dkr|ntj|����#S)�features)r �_from_yaml_list)r�paramr&s rrfzEMetadataConfigs.from_dataset_card_data.<locals>.<dictcomp>.<dictcomp>�sH��0�0�0�(�E�5����(;�(;�u�u��AY�Z_�A`�A`�0�0�0r)�copy�poprn)rrDrgs �rrfz:MetadataConfigs.from_dataset_card_data.<locals>.<dictcomp>�sr������ (�"1�"6�"6�"8�"8�8�� ��J�J�}�-�-�0�0�,2�L�L�N�N�0�0�0���r)rIrCrr rMrU)rorqrkrDrgs @r�from_dataset_card_dataz&MetadataConfigs.from_dataset_card_data�s��� � � ��� 0� 0� �0���@� ��.��5�5� j� �!h�S�^�!h�!h�Ue�!h�!h�!h�i�i�i�#3� J� J�� ��7�7�$�7�#2�7�7�7�����8�8��I�I�I�I��3����� ,<� ��� � � ��s�u�u� rNc��|r�|���D]}|�|���|�|��}tt i|�|��������}|���D]\}}|�dd���d�|���D��||j<dSdS)Nrdc� �g|] \}}d|i|��� S)rdr)rrd�config_metadatas rrz8MetadataConfigs.to_dataset_card_data.<locals>.<listcomp>�s6��2�2�2�0�K��� �?��?�2�2�2r)�valuesrUr{rN�sortedrnrzrC)rrqrD�current_metadata_configs�total_metadata_configsrdr~s r�to_dataset_card_dataz$MetadataConfigs.to_dataset_card_data�s��� � �#'�;�;�=�=� K� K���9�9�/�J�J�J�J�'+�'B�'B�CT�'U�'U� $�%)�&�1U�4L�1U�PT�1U�1[�1[�1]�1]�*^�*^�%_�%_� "�0F�0L�0L�0N�0N� 9� 9�,� �_��#�#�M�4�8�8�8�8�2�2�4J�4P�4P�4R�4R�2�2�2� �d�o� .� .� .� � rc���d}|���D]N\}}t|��dks|dks|�d��r|�|}�8td|�d|�d�����O|S)Nr$�defaultz&Dataset has several default configs: 'z' and 'z'.)rnrOrIrM)r�default_config_namerdrDs r�get_default_config_namez'MetadataConfigs.get_default_config_name�s���"��,0�J�J�L�L� � � (�K���4�y�y�A�~�~�� �!9�!9�_�=P�=P�QZ�=[�=[�!9�&�.�*5�'�'�$�l�AT�l�l�]h�l�l�l���� ":�#�"r)r2r3r4�__doc__r rCrrL�__annotations__� staticmethodrNrU� classmethodr rrrpr r{r�rr�rrrrBrB.s�������?�?� 6�J��� �6�6�6��0=�d�0=�0=�0=��\�0=�d�$%� �$%�!%�T�#�s�(�^� 4�$%�(� $%� � $%�$%�$%��[�$%�L����K\�����[��0 �o� �$� � � � � #��#�� #� #� #� #� #� #rrBzimage-classification� translationzimage-segmentationz fill-maskzautomatic-speech-recognitionztoken-classificationzsentence-similarityzaudio-classificationzquestion-answering� summarizationzzero-shot-classificationz 
table-to-textzfeature-extraction�otherzmultiple-choiceztext-classificationz text-to-imageztext2text-generationzzero-shot-image-classificationztabular-classificationztabular-regressionzimage-to-imageztabular-to-textzunconditional-image-generationztext-retrievalztext-to-speechzobject-detectionzaudio-to-audioztext-generation�conversationalztable-question-answeringzvisual-question-answeringz image-to-textzreinforcement-learning)zvoice-activity-detectionztime-series-forecastingzdocument-question-answering)$rPrJ� collectionsr� itertoolsr�operatorr�typingrrr�yaml�huggingface_hubr rgr rvr �infor r�namingr� utils.loggingrr2�logger� SafeLoaderrrLr!r@rNrB�known_task_idsrrr�<module>r�sc�� � � � �����������������������*�*�*�*�*�*�*�*�*�*� � � � �+�+�+�+�+�+�+�+�+�+�+�+�������0�0�0�0�0�0�0�0�������&�&�&�&�&�&� ��H� � �� � � � � �T�_� � � �)�C�)�E�(�3�-��:L�4M�)�)�)�)�O#�O#�O#�O#�O#�d�3��S�#�X��.�/�O#�O#�O#�j&��B�&��2�&��"�&��� &� #�B� &� �B� &��2�&��B�&��"�&��R�&���&��R�&��"�&� �R�&��r�&� �2�!&�"�R�#&�&�$�B�%&�&%�b�'&�(�b�)&�*�"�+&�,�b�-&�.�r�/&�0%�b�1&�2�b�3&�4�b�5&�6��7&�8�b�9&�:�r�;&�<�b�=&�>��?&�@ ��A&�B�R�C&�D�b�E&�&�F!#�!�#%�K&�&�&���r