�
���g�� � + �� � d Z ddlZddlZddlZddlZddlZddlZddlZddlZddl Z ddl
Z
ddlZddlm
Z
ddlmZmZ ddlmZ ddlmZmZ ddlmZ ddlmZmZmZ ddlZddlZddlZdd lm Z dd
l!m"Z"m#Z#m$Z$ ddl%m&Z&m'Z'm(Z(m)Z)m*Z*m+Z+m,Z, dd
l-m.Z.m/Z/ ddl0m1Z1 ddl2m3Z3m4Z4 ddl5m6Z6m7Z7m8Z8m9Z9m:Z:m;Z; ddl<m=Z=m>Z> ddl?m@Z@ ddlAmBZB ddlCmDZDmEZEmFZFmGZG ddlHmIZImJZJ ddlKmLZL ddlMmNZN ddlOmPZPmQZQ ddlRmSZS ddlTmUZUmVZV ddlWmXZXmYZYmZZZm[Z[m\Z\ ddl]m^Z^ ddl_m`Z` ddlambZb ddlcmdZdmeZemfZfmgZgmhZhmiZimjZj dd lkmlZl dd!lmmnZnmoZo dd"lpmqZq dd#lrmsZs dd$ltmuZumvZv dd%lwmxZx dd&lymzZz eqe{� � Z| e} eXj~ � � � � d'gz Zd(� Z�d)ee� d*e�d+e�fd,�Z�e/j� dfd-e�d.eeee�f fd/�Z�d+ee�e4 fd0�Z� G d1� d2� � Z�d3e�e4 d4e}e3 d5ee� d6e�d+e�e4 f
d7�Z� d�d8d9d6ee� d+e�e4 fd:�Z�d;e}e� d+e�fd<�Z�d-e�fd=�Z�d-e�d>e�d?e�e�e�e�e�f d@ee@ d+e�e}e�e�e�f e}e�e�e�f f f
dA�Z�d-e�dBe}e�e�e�f d+dfdC�Z�d-e�dDe�dEe�dFe�dGe}e�e�e�f dHe}e�e�e�f dIeeeBe�f d+e�fdJ�Z�dKe�dLe�dEe�d-e�d+e�f
dM�Z�dNe�dGe}e�e�e�f dHe}e�e�e�f dKe�dLe�dEe�d-e�dIeBd+dfdO�Z�dKe�dLe�dEe�d-e�d+e�e�e�f f
dP�Z� d�dQe7d@ee@ d+e�ee� e�f fdR�Z� d�dQe7d@ee@ d+e�ee� e�f fdS�Z� d�dTe6dUee� d@ee@ d+e�ee� e�e�ef f fdV�Z� d�dWe�dXesd>ee� dYe�e�ef d@ee@ d+e�e}e3 e�f fdZ�Z�e G d[� d\� � � � Z�e G d]� d9� � � � Z� G d^� d_� � Z� G d`� dae�� � Z� G db� dce�� � Z� G dd� dee�� � Z� G df� dge�� � Z� G dh� die�� � Z� G dj� dke�� � Z� G dl� dme�� � Z� d�dUe�dpeee�ezf d@ee@ dIeeeBe�f dKee� dqee� dTeee�e}e�e6f dree� d)ee� d+e�fds�Z� d�dUe�d-ee� dqee� dTeee�ee� ee�ee�ee� f f f dree� dteeL d@ee@ dIeeeBe�f dpeee�ezf dueee�e�f dvee� d)ee� d+e4fdw�Z� d�dUe�d-ee� dqee� dTeee�ee� ee�ee�ee� f f f dxeee�e`f dree� dteeL d@ee@ dIeeeBe�f dyeeene�f dzee� d{e�dpeee�ezf dueee�e�f d|e�d}ee� dvee� d)ee� d+ee=e1e>eSf f&d~�Z� d�dexdzee� dvee� d+ee1e=f fd��Z�dS )�zAccess datasets.� N)�Counter)�Mapping�Sequence)�nullcontext)� dataclass�field)�Path)�Any�Optional�Union)� url_to_fs)�DatasetCard�DatasetCardData�HfApi)�EntryNotFoundError�GatedRepoError�LocalEntryNotFoundError�OfflineModeIsEnabled�RepositoryNotFoundError�RevisionNotFoundError�get_session� )�__version__�config)�Dataset)�
BuilderConfig�DatasetBuilder)�
DataFilesDict�
DataFilesList�DataFilesPatternsDict�EmptyDatasetError�get_data_patterns�sanitize_patterns)�DatasetDict�IterableDatasetDict)�DownloadConfig)�DownloadMode)�StreamingDownloadManager� xbasename�xglob�xjoin)�DataFilesNotFoundError�DatasetNotFoundError)�Features)�Hasher)�DatasetInfo�DatasetInfosDict)�IterableDataset)�camelcase_to_snakecase�snakecase_to_camelcase)�_EXTENSION_TO_MODULE�_MODULE_TO_EXTENSIONS�_MODULE_TO_METADATA_FILE_NAMES�_PACKAGED_DATASETS_MODULES�_hash_python_lines)�FolderBasedBuilder)�Split)�_dataset_viewer)�!_raise_if_offline_mode_is_enabled�cached_path�get_datasets_user_agent�init_hf_modules�is_relative_path�relative_to_absolute_path�url_or_path_join)�hf_dataset_url)�VerificationMode�is_small_dataset)�
get_logger)�MetadataConfigs)�get_imports�lock_importable_file)�PathLike)�Version�.zipc � � t d� � �)Nz�Loading this dataset requires you to execute custom code contained in the dataset repository on your local machine. Please set the option `trust_remote_code=True` to permit loading of this dataset.)�
ValueError)�signum�frames �]/home/asafur/pinokio/api/open-webui.git/app/env/lib/python3.11/site-packages/datasets/load.py�_raise_timeout_errorrS i s � �
� e�� � � �trust_remote_code�repo_id�returnc �� � | �| nt j } | ��t j dk r� t j t j t
� � t j t j � � | �It d|� d|� d�� � }|� � � dv rd} n|� � � dv rd } | �It j d� � n4# t $ r t d|� d|� d
�� � �w xY wt dd� � | S )z�
Copied and adapted from Transformers
https://github.com/huggingface/transformers/blob/2098d343cc4b4b9d2aea84b3cf1eb5a1e610deff/src/transformers/dynamic_module_utils.py#L589
Nr �The repository for �� contains custom code which must be executed to correctly load the dataset. You can inspect the repository content at https://hf.co/datasets/z�.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.
Do you wish to run the custom code? [y/N] )�yes�y�1T)�no�n�0� FzS.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.)r �HF_DATASETS_TRUST_REMOTE_CODE�TIME_OUT_REMOTE_CODE�signal�SIGALRMrS �alarm�input�lower� ExceptionrO )rU rV �answers rR �resolve_trust_remote_coderk p sX � �
.?�-J�)�)�PV�Pt��� ��&��*�*�
��
�f�n�.B�C�C�C���V�8�9�9�9�'�/�"�F�g� F� F�nu�F� F� F�� �F� �|�|�~�~�):�:�:�,0�)�)������+?�?�?�,1�)� (�/� ��Q�������
�
�
� �i�'� i� i�jq�i� i� i�� � �
����
!��t�,�,�,��s �B!C �!C'�name�hf_modules_cachec � � t |� � }t j � || � � }t j |d�� � t j � t j � |d� � � � sGt
t j � |d� � d� � 5 ddd� � n# 1 swxY w Y |S )aS
Create a module with name `name` in which you can add dynamic modules
such as datasets. The module can be imported using its name.
The module is created in the HF_MODULE_CACHE directory by default (~/.cache/huggingface/modules) but it can
be overridden by specifying a path to another directory in `hf_modules_cache`.
T��exist_ok�__init__.py�wN)r@ �os�path�join�makedirs�exists�open)rl rm �dynamic_modules_paths rR �init_dynamic_modulesrz � s� � � '�'7�8�8���7�<�<�(8�$�?�?���K�$�t�4�4�4�4�
�7�>�>�"�'�,�,�';�]�K�K�L�L� �
�"�'�,�,�3�]�C�C�S�
I�
I� � �� � � � � � � � � � � ���� � � � ��s �1B?�?C�Cc �, � t j | � � }d}|j � � � D ]c\ }}t j |� � rJt
|t � � r5t j |� � r�C|}t j |� � }|�||k r n�d|S )zJImport a module at module_path and return its main class: a DatasetBuilderN)
� importlib�
import_module�__dict__�items�inspect�isclass�
issubclassr �
isabstract� getmodule)�module_path�module�module_main_clsrl �obj�
obj_modules rR �import_main_classr� � s� � �
�
$�[�
1�
1�F��O��_�*�*�,�,� � � ��c��?�3��� �J�s�N�$C�$C� ��!�#�&�&�
��!�O� �*�3�/�/�J��%�&�J�*>�*>�����rT c � � e Zd ZdZd� ZdS )�#_InitializeConfiguredDatasetBuilderaL
From https://stackoverflow.com/questions/4647566/pickle-a-dynamically-parameterized-sub-class
See also ConfiguredDatasetBuilder.__reduce__
When called with the param value as the only argument, returns an
un-initialized instance of the parameterized class. Subsequent __setstate__
will be called by pickle.
c �R � t � � }t ||||�� � |_ |S )N)�default_config_name�dataset_name)r� �configure_builder_class� __class__)�self�builder_cls�metadata_configsr� rl r� s rR �__call__z,_InitializeConfiguredDatasetBuilder.__call__� s6 � �1�3�3��/��)�?R�ae�
�
�
��
� �
rT N)�__name__�
__module__�__qualname__�__doc__r� � rT rR r� r� � s- � � � � � �� �� � � � rT r� r� �builder_configsr� r� c �4 � ��� G � ��fd�d� � � }� j � � � � � � � t |� � � �|_ � j � � � � � � � t |� � � �|_ |S )z�
Dynamically create a builder class with custom builder configs parsed from README.md file,
i.e. set BUILDER_CONFIGS class variable of a builder class to custom configs list.
c �, �� e Zd Z�Z�Z� j Zd� ZdS )�9configure_builder_class.<locals>.ConfiguredDatasetBuilderc � � | j j d }t � � || j | j | j f| j � � � fS )Nr )r� �__mro__r� �BUILDER_CONFIGS�DEFAULT_CONFIG_NAMEr� r~ �copy)r� �parent_builder_clss rR �
__reduce__zDconfigure_builder_class.<locals>.ConfiguredDatasetBuilder.__reduce__� sR � �!%��!7��!:��3�5�5�&��(��,��%� � �
�"�"�$�$� �
rT N)r� r� r� r� r� r� )r� r� r� s ���rR �ConfiguredDatasetBuilderr� � s7 �� � � � � �)��1�� �+�
� � � � � rT r� )r� rh �
capitalizer4 r� )r� r� r� r� r� s ``` rR r� r� � s� ���� �� � � � � � � � �;� � � �( ��%�%�'�'�2�2�4�4�\�6L�\�6Z�6Z�\�\� �%� ��%�%�'�'�2�2�4�4�\�6L�\�6Z�6Z�\�\� �)� $�#rT �dataset_module�
DatasetModulec �p � | j rt | j � � n
t � � 5 t | j � � }d d d � � n# 1 swxY w Y | j j rT|p| j � d� � }|�t d� � �t || j j | j j |�� � }|S )Nr� z-dataset_name should be specified but got None)r� r� r� )�importable_file_pathrJ r r� r� �builder_configs_parametersr� �builder_kwargs�getrO r� r� )r� r� r� s rR �get_dataset_builder_classr� � s � �
�.� ��^�@�A�A�A�
�]�]�D� D�
(��(B�C�C��D� D� D� D� D� D� D� D� D� D� D���� D� D� D� D� �0�@�
�#�X�~�'D�'H�'H��'X�'X�����L�M�M�M�-��*�E�U� .� I� ]�%�
�
�
�� �s �A�A�A�
file_pathsc �� � g }| D ]y}t j � |� � rC|� t t |� � � d� � � � � � �d|� |� � �zg }|D ]R}t |d�� � 5 }|� |� � � � � ddd� � n# 1 swxY w Y �St |� � S )zt
Convert a list of scripts or text files provided in file_paths into a hashed filename in a repeatable way.
z
*.[pP][yY]�utf-8��encodingN)rs rt �isdir�extend�listr �rglob�appendrx � readlinesr9 )r� �to_use_files� file_path�lines�fs rR �
files_to_hashr� s � �
,.�L�� +� +� �
�7�=�=��#�#� +�����T�)�_�_�%:�%:�<�%H�%H� I� I�J�J�J�J���� �*�*�*�*�
�E�!� (� (� �
�)�g�
.�
.�
.� (�!��L�L������'�'�'� (� (� (� (� (� (� (� (� (� (� (���� (� (� (� (���e�$�$�$s �(C
�
C �C c � � t j sut j rk t � � � d� t j | | dz f� � dt � � id�� � dS # t $ r Y dS w xY wdS dS )z'Update the download count of a dataset.�/�.pyz
User-Agent� )�headers�timeoutN) r �HF_HUB_OFFLINE�HF_UPDATE_DOWNLOAD_COUNTSr �headru �S3_DATASETS_BUCKET_PREFIXr? ri )rl s rR �increase_load_countr� s� � �� � �V�%E� � ��M�M������&�:�D�$��,�O�P�P�%�'>�'@�'@�A��
�
�
�
�
�
��
� � � ��D�D� ����� � � s �AA1 �1
A?�>A?� base_path�imports�download_configc �� � g }g }|� � � }|j �d|_ |D ]�\ }}}} |dk r|� ||f� � �%|| k rt d| � d|� d|� d|� d� � � �|d k rt ||d
z � � }
n|dk r|}
nt d� � �t |
|�
� � }| � t j � || � � }|� ||f� � ��||fS )a�
Download additional module for a module <name>.py at URL (or local path) <base_path>/<name>.py
The imports must have been parsed first using ``get_imports``.
If some modules need to be installed with pip, an error is raised showing how to install them.
This function return the list of downloaded modules as tuples (import_name, module_file_path).
The downloaded modules can then be moved into an importable directory with ``_copy_script_and_other_resources_in_importable_dir``.
NzDownloading extra modules�libraryz
Error in the z script, importing relative z module but z: is the name of the script. Please change relative import zl to another name and add a '# From: URL_OR_PATH' comment pointing to the original relative import file path.�internalr� �externalzWrong import_type�r� ) r� �
download_descr� rO rC r>