� ���g�[��D�dZddlZddlZddlZddlZddlZddlmZddlmZm Z ddl m Z m Z ddl mZddlmZmZeGd �d ����ZeGd �d ����ZGd �dej���ZGd�de��ZGd�de���ZeZGd�de��ZGd�de��ZGd�de��ZGd�de��ZGd�d��Zejddd g��Z Gd!�d"��Z!Gd#�d$e"��Z#eGd%�d&����Z$dS)'zSplits related API.�N)� dataclass)�Optional�Union�)�FileInstructions�make_file_instructions)� _split_re)�NonMutableDict�asdictc� �eZdZUejdddi���Zeed<ejdddi���Ze ed<ejdddi���Z e ed<d Z e e e ed <ejd ddi���Ze eed <ed ���Zd S) � SplitInfo��$include_in_asdict_even_if_is_defaultT)�default�metadata�namer� num_bytes� num_examplesN� shard_lengths� dataset_namec�d�t|j|gt|j�����}|jS)�/Returns the list of dict(filename, take, skip).�r� split_infos� instruction)rr�strr�file_instructions)�self� instructionss �_/home/asafur/pinokio/api/open-webui.git/app/env/lib/python3.11/site-packages/datasets/splits.pyrzSplitInfo.file_instructions.s9��.��"����D�I��� � � � � �-�-�)�__name__� __module__� __qualname__� dataclasses�fieldrr�__annotations__r�intrrr�listr�propertyr�r!r r r s�������!� �!�"�8^�`d�7e�f�f�f�D�#�f�f�f�&�[�&�q�<b�dh�;i�j�j�j�I�s�j�j�j�)� �)�!�?e�gk�>l�m�m�m�L�#�m�m�m�)-�M�8�D��I�&�-�-�-� #4�+�"3�� F��M�#�#�#�L�(�3�-�����.�.��X�.�.�.r!r c�J�eZdZUdZeed<ed���Zed���ZdS)� SubSplitInfoz�Wrapper around a sub split info. This class expose info on the subsplit: ``` ds, info = datasets.load_dataset(..., split='train[75%:]', with_info=True) info.splits['train[75%:]'].num_examples ``` rc��|jjS)z.Returns the number of example in the subsplit.)rr�rs r rzSubSplitInfo.num_examplesFs��� �-�-r!c��|jjS)r)rrr/s r rzSubSplitInfo.file_instructionsKs��� �2�2r!N) r"r#r$�__doc__rr'r*rrr+r!r r-r-:sb���������#�"�"�"� �.�.��X�.��3�3��X�3�3�3r!r-c�L�eZdZdZejd���Zd�Zd�Zd�Z dd�Z dS) � SplitBasea�Abstract base class for Split compositionality. See the [guide on splits](../loading#slice-splits) for more information. There are three parts to the composition: 1) The splits are composed (defined, merged, split,...) together before calling the `.as_dataset()` function. 
This is done with the `__add__`, `__getitem__`, which return a tree of `SplitBase` (whose leaf are the `NamedSplit` objects) ``` split = datasets.Split.TRAIN + datasets.Split.TEST.subsplit(datasets.percent[:50]) ``` 2) The `SplitBase` is forwarded to the `.as_dataset()` function to be resolved into actual read instruction. This is done by the `.get_read_instruction()` method which takes the real dataset splits (name, number of shards,...) and parse the tree to return a `SplitReadInstruction()` object ``` read_instruction = split.get_read_instruction(self.info.splits) ``` 3) The `SplitReadInstruction` is then used in the `tf.data.Dataset` pipeline to define which files to read and how to skip examples within file. c� �td���)z�Parse the descriptor tree and compile all read instructions together. Args: split_dict: `dict`, The `dict[split_name, SplitInfo]` of the dataset Returns: split_read_instruction: `SplitReadInstruction` zAbstract method)�NotImplementedError�r� split_dicts r �get_read_instructionzSplitBase.get_read_instructionts��"�"3�4�4�4r!c�\�t|ttf��rdStd���)�*Equality: datasets.Split.TRAIN == 'train'.Fz6Equality is not implemented between merged/sub splits.)� isinstance� NamedSplitrr5�r�others r �__eq__zSplitBase.__eq__�s-�� �e�j�#�.� /� /� ��5�!�"Z�[�[�[r!c�.�|�|�� S)z+InEquality: datasets.Split.TRAIN != 'test'.)r?r=s r �__ne__zSplitBase.__ne__�s���;�;�u�%�%�%�%r!c�"�t||��S)z4Merging: datasets.Split.TRAIN + datasets.Split.TEST.)� _SplitMergedr=s r �__add__zSplitBase.__add__�s���D�%�(�(�(r!Nc��� � �td�||||fD����dkrtd���t|t��r|}n/t|t��r|}nt|t ��r|}|s|s|std|�d����d�}|r�d|cxkrdksntd |�����d|z� � fd �t |��D��}t |d jd��|d <||��t�fd �|D����S|rt�|��S|r�t|��� � fd �|D��}d}d}g}|D],} || z }|� t ||����|}�-t |d jd��|d <||��t�fd�|D����Std���)a6Divides this split into subsplits. 
There are 3 ways to define subsplits, which correspond to the 3 arguments `k` (get `k` even subsplits), `percent` (get a slice of the dataset with `datasets.percent`), and `weighted` (get subsplits with proportions specified by `weighted`). Example:: ``` # 50% train, 50% test train, test = split.subsplit(k=2) # 50% train, 25% test, 25% validation train, test, validation = split.subsplit(weighted=[2, 1, 1]) # Extract last 20% subsplit = split.subsplit(datasets.percent[-20:]) ``` Warning: k and weighted will be converted into percent which mean that values below the percent will be rounded up or down. The final split may be bigger to deal with remainders. For instance: ``` train, test, valid = split.subsplit(k=3) # 33%, 33%, 34% s1, s2, s3, s4 = split.subsplit(weighted=[2, 2, 1, 1]) # 33%, 33%, 16%, 18% ``` Args: arg: If no kwargs are given, `arg` will be interpreted as one of `k`, `percent`, or `weighted` depending on the type. For example: ``` split.subsplit(10) # Equivalent to split.subsplit(k=10) split.subsplit(datasets.percent[:-20]) # percent=datasets.percent[:-20] split.subsplit([1, 1, 2]) # weighted=[1, 1, 2] ``` k: `int` If set, subdivide the split into `k` equal parts. percent: `datasets.percent slice`, return a single subsplit corresponding to a slice of the original split. For example: `split.subsplit(datasets.percent[-20:]) # Last 20% of the dataset`. weighted: `list[int]`, return a list of subsplits whose proportions match the normalized sum of the list. For example: `split.subsplit(weighted=[1, 1, 2]) # 25%, 25%, 50%`. Returns: A subsplit or list of subsplits extracted from this split object. c3�4K�|]}t|��V��dS�N)�bool)�.0�xs r � <genexpr>z%SplitBase.subsplit.<locals>.<genexpr>�s(����<�<�1�t�A�w�w�<�<�<�<�<�<r!rz,Only one argument of subsplit should be set.zInvalid split argument zg. Only list, slice and int supported. 
One of k, weighted or percent should be set to a non empty value.c�z�td�|D��g��ttd����ksJ�dS)Nc3�hK�|]-}tt|�d�����V��.dS)�dN)r)�range�indices�rI�ss r rKzESplitBase.subsplit.<locals>.assert_slices_coverage.<locals>.<genexpr>�s9����E�E���U�A�I�I�c�N�N�3�4�4�E�E�E�E�E�Er!rN)�sumr)rO)�slicess r �assert_slices_coveragez2SplitBase.subsplit.<locals>.assert_slices_coverage�sD���E�E�f�E�E�E�r�J�J�d�SX�Y\�S]�S]�N^�N^�^�^�^�^�^�^r!rrNz,Subsplit k should be between 0 and 100, got c�B��g|]}t|�z|dz�z����S)r)�slice)rI�i�shifts �r � <listcomp>z&SplitBase.subsplit.<locals>.<listcomp>�s/���J�J�J�A�e�A��I��A����7�7�J�J�Jr!�����c3�8�K�|]}t�|��V��dSrG�� _SubSplit�rIrRrs �r rKz%SplitBase.subsplit.<locals>.<genexpr>��-�����<�<���4��+�+�<�<�<�<�<�<r!c� ��g|] }d|z�z�� S)rNr+)rIrJ�totals �r rZz&SplitBase.subsplit.<locals>.<listcomp>�s"���;�;�;�Q��a��5�(�;�;�;r!c3�8�K�|]}t�|��V��dSrGr]r_s �r rKz%SplitBase.subsplit.<locals>.<genexpr>�r`r!zCould not determine the split) rS� ValueErrorr;r(rWr)rO�start�tupler^�append) r�arg�k�percent�weightedrUrTre�stop�vrYrbs ` @@r �subsplitzSplitBase.subsplit�s}�����d �<�<��a��(� ;�<�<�<� <� <�� A� A��K�L�L� L� �c�3� � � ��A�A� ��U� #� #� ��G�G� ��T� "� "� ��H�� �W� �� ��T�#�T�T�T��� �  _� _� _� � >��q�<�<�<�<�C�<�<�<�<� �!S�PQ�!S�!S�T�T�T��1�H�E�J�J�J�J��q���J�J�J�F��v�b�z�/��5�5�F�2�J� "� "�6� *� *� *��<�<�<�<�V�<�<�<�<�<� <� � >��T�7�+�+� +� � >���M�M�E�;�;�;�;�(�;�;�;�H��E��D��F�� � ���� ��� � �e�E�4�0�0�1�1�1�����v�b�z�/��5�5�F�2�J� "� "�6� *� *� *��<�<�<�<�V�<�<�<�<�<� <��<�=�=� =r!)NNNN) r"r#r$r1�abc�abstractmethodr8r?rArDrnr+r!r r3r3Qs���������B �� 5� 5��� 5�\�\�\� &�&�&�)�)�)�f>�f>�f>�f>�f>�f>r!r3)� metaclassc��eZdZd�ZdS)�PercentSliceMetac�T�t|t��std|�����|S)Nz7datasets.percent should only be called with slice, not )r;rWrd)�cls� slice_values r � __getitem__zPercentSliceMeta.__getitem__�s3���+�u�-�-� f��d�Wb�d�d�e�e� e��r!N)r"r#r$rwr+r!r rsrs�s#����������r!rsc��eZdZdZdS)� PercentSlicez�Syntactic sugar for defining slice subsplits: 
`datasets.percent[75:-5]`. See the [guide on splits](../loading#slice-splits) for more information. N)r"r#r$r1r+r!r ryrys�������� �Dr!ryc�$�eZdZdZd�Zd�Zd�ZdS)rCz0Represent two split descriptors merged together.c�"�||_||_dSrG)�_split1�_split2)r�split1�split2s r �__init__z_SplitMerged.__init__s���� ��� � � r!c�t�|j�|��}|j�|��}||zSrG)r|r8r})rr7�read_instruction1�read_instruction2s r r8z!_SplitMerged.get_read_instructions:�� �L�=�=�j�I�I�� �L�=�=�j�I�I�� �#4�4�4r!c�\�dt|j���dt|j���d�S)N�(z + �))�reprr|r}r/s r �__repr__z_SplitMerged.__repr__ s/��?�4�� �%�%�?�?�$�t�|�*<�*<�?�?�?�?r!N�r"r#r$r1r�r8r�r+r!r rCrCsL������:�:����5�5�5� @�@�@�@�@r!rCc�$�eZdZdZd�Zd�Zd�ZdS)r^z,Represent a sub split of a split descriptor.c�"�||_||_dSrG)�_split� _slice_value)r�splitrvs r r�z_SubSplit.__init__'s���� �'����r!c�L�|j�|��|jSrG)r�r8r�r6s r r8z_SubSplit.get_read_instruction+s ���{�/�/� �;�;�D�<M�N�Nr!c��d}|jj�|dz }|�|jj�dn |jj|jj�dn |jj|jj���}t |j���d|�d�S)Nz{start}:{stop}z:{step}r)rerl�stepz(datasets.percent[z]))r�r��formatrerlr�r�)r� slice_strs r r�z_SubSplit.__repr__.s���$� � � � !� -� �� "�I��$�$��)�/�7�"�"�T�=N�=T��(�-�5���4�;L�;Q��"�'�%� � � � �t�{�#�#�D�D�y�D�D�D�Dr!Nr�r+r!r r^r^$sO������6�6�(�(�(�O�O�O� E� E� E� E� Er!r^c�<�eZdZdZd�Zd�Zd�Zd�Zd�Zd�Z d�Z d S) r<a�Descriptor corresponding to a named split (train, test, ...). Example: Each descriptor can be composed with other using addition or slice: ```py split = datasets.Split.TRAIN.subsplit(datasets.percent[0:25]) + datasets.Split.TEST ``` The resulting split will correspond to 25% of the train split merged with 100% of the test split. A split cannot be added twice, so the following will fail: ```py split = ( datasets.Split.TRAIN.subsplit(datasets.percent[:25]) + datasets.Split.TRAIN.subsplit(datasets.percent[75:]) ) # Error split = datasets.Split.TEST + datasets.Split.ALL # Error ``` The slices can be applied only one time. 
So the following are valid: ```py split = ( datasets.Split.TRAIN.subsplit(datasets.percent[:25]) + datasets.Split.TEST.subsplit(datasets.percent[:50]) ) split = (datasets.Split.TRAIN + datasets.Split.TEST).subsplit(datasets.percent[:50]) ``` But this is not valid: ```py train = datasets.Split.TRAIN test = datasets.Split.TEST split = train.subsplit(datasets.percent[:25]).subsplit(datasets.percent[:25]) split = (train.subsplit(datasets.percent[:25]) + test).subsplit(datasets.percent[:50]) ``` c���||_d�|�d��D��}|D]7}tjt|��st dt�d|�d�����8dS)Nc�D�g|]}|�d��d��S)�[r)r�)rI�split_instructions r rZz'NamedSplit.__init__.<locals>.<listcomp>gs-��'q�'q�'q�L]�(9�(?�(?��(D�(D�Q�(G�'q�'q�'qr!�+zSplit name should match 'z ' but got 'z'.)�_namer��re�matchr rd)rr�split_names_from_instruction� split_names r r�zNamedSplit.__init__es����� �'q�'q�ae�ak�ak�lo�ap�ap�'q�'q�'q�$�6� c� c�J��8�I�z�2�2� c� �!a�Y�!a�!a�S]�!a�!a�!a�b�b�b� c� c� cr!c��|jSrG�r�r/s r �__str__zNamedSplit.__str__ls ���z�r!c��d|j�d�S)Nz NamedSplit(r�r�r/s r r�zNamedSplit.__repr__os��,�T�Z�,�,�,�,r!c��t|t��r|j|jkSt|t��rdSt|t��r |j|kSdS)r:F)r;r<r�r3rr=s r r?zNamedSplit.__eq__rs^�� �e�Z� (� (� ��:���,� ,� ��y� )� )� ��5� ��s� #� #� ��:��&� &��5r!c�"�|j|jkSrGr�r=s r �__lt__zNamedSplit.__lt__}s���z�E�K�'�'r!c�*�t|j��SrG)�hashr�r/s r �__hash__zNamedSplit.__hash__�s���D�J���r!c�6�t||j��SrG)�SplitReadInstructionr�r6s r r8zNamedSplit.get_read_instruction�s��#�J�t�z�$:�;�;�;r!N) r"r#r$r1r�r�r�r?r�r�r8r+r!r r<r<:s�������(�(�Tc�c�c����-�-�-� � � �(�(�(� � � �<�<�<�<�<r!r<c�.��eZdZdZ�fd�Zd�Zd�Z�xZS)� NamedSplitAllz?Split corresponding to the union of all defined dataset splits.c�J��t���d��dS)N�all)�superr�)r� __class__s �r r�zNamedSplitAll.__init__�s!��� ����������r!c��dS)NzNamedSplitAll()r+r/s r r�zNamedSplitAll.__repr__�s�� � r!c�v�d�|���D��}t|t����S)Nc�,�g|]}t|����Sr+)r�rQs r rZz6NamedSplitAll.get_read_instruction.<locals>.<listcomp>�s!��R�R�R��1�!�4�4�R�R�Rr!)�valuesrSr�)rr7�read_instructionss r 
r8z"NamedSplitAll.get_read_instruction�s:��R�R�j�>O�>O�>Q�>Q�R�R�R���$�&:�&<�&<�=�=�=r!)r"r#r$r1r�r�r8� __classcell__�r�s@r r�r��s\�������I�I� � � � � �!�!�!�>�>�>�>�>�>�>r!r�c�n�eZdZdZed��Zed��Zed��Ze��Z d�Z dS)�Splita"`Enum` for dataset splits. Datasets are typically split into different subsets to be used at various stages of training and evaluation. - `TRAIN`: the training data. - `VALIDATION`: the validation data. If present, this is typically used as evaluation data while iterating on a model (e.g. changing hyperparameters, model architecture, etc.). - `TEST`: the testing data. This is the data to report metrics on. Typically you do not want to use this during model iteration as you may overfit to it. - `ALL`: the union of all defined dataset splits. All splits, including compositions inherit from `datasets.SplitBase`. See the [guide](../load_hub#splits) on splits for more information. Example: ```py >>> datasets.SplitGenerator( ... name=datasets.Split.TRAIN, ... gen_kwargs={"split_key": "train", "files": dl_manager.download_and extract(url)}, ... ), ... datasets.SplitGenerator( ... name=datasets.Split.VALIDATION, ... gen_kwargs={"split_key": "validation", "files": dl_manager.download_and extract(url)}, ... ), ... datasets.SplitGenerator( ... name=datasets.Split.TEST, ... gen_kwargs={"split_key": "test", "files": dl_manager.download_and extract(url)}, ... ) ``` �train�test� validationc�H�|dkrt��nt|��S)z9Create a custom split with datasets.Split('custom_name').r�)r�r<)rurs r �__new__z Split.__new__�s ��"&�%�-�-�}����Z��5E�5E�Er!N) r"r#r$r1r<�TRAIN�TEST� VALIDATIONr��ALLr�r+r!r r�r��sm������!�!�H �J�w� � �E� �:�f� � �D���L�)�)�J� �-�/�/�C�F�F�F�F�Fr!r��SlicedSplitInfo� split_inforvc�2�eZdZdZdd�Zd�Zd�Zd�Zd�ZdS) r�aObject containing the reading instruction for the dataset. 
Similarly to `SplitDescriptor` nodes, this object can be composed with itself, but the resolution happens instantaneously, instead of keeping track of the tree, such as all instructions are compiled and flattened in a single SplitReadInstruction object containing the list of files and slice to use. Once resolved, the instructions can be accessed with: ``` read_instructions.get_list_sliced_split_info() # List of splits to use ``` Nc��td���|_|r&|�t|d�����dSdS)Nz?Overlap between splits. Split {key} has been added with itself.)� error_msg)r�rv)r �_splits�addr�)rr�s r r�zSplitReadInstruction.__init__�sO��%�0q�r�r�r�� � � O� �H�H�_� ��M�M�M� N� N� N� N� N� O� Or!c�.�||j|jj<dS)z,Add a SlicedSplitInfo the read instructions.N)r�r�r)r� sliced_splits r r�zSplitReadInstruction.add�s�� 6B�� �\�,�1�2�2�2r!c��t��}|j�|j��|j�|j��|S)zMerging split together.)r�r��update)rr>r�s r rDzSplitReadInstruction.__add__�sH�� 1�2�2���!�(�(���6�6�6��!�(�(���7�7�7� � r!c��t��}|j���D]^}|j�t d|jj�d����|���}||d<|�tdi|�����_|S)z Sub-splits.NzTrying to slice Split z which has already been slicedrvr+) r�r�r�rvrdr�r�_asdictr�r�)rrvr�rms r rwz SplitReadInstruction.__getitem__�s���1�2�2����$�$�&�&� 8� 8�A��}�(� �!k�!�,�:K�!k�!k�!k�l�l�l�� � � � �A�*�A�m� � � !� !�/�"6�"6�A�"6�"6� 7� 7� 7� 7� � r!c�N�t|j�����SrG)r)r�r�r/s r �get_list_sliced_split_infoz/SplitReadInstruction.get_list_sliced_split_infos���D�L�'�'�)�)�*�*�*r!rG) r"r#r$r1r�r�rDrwr�r+r!r r�r��su������ � �O�O�O�O� B�B�B�!�!�!� !� !� !�+�+�+�+�+r!r�c���eZdZdZdd��fd� Zdeeeff�fd� Zdeeefde f�fd� Z d e f�fd � Z e d ���Z edd eeefd eefd���Zd�Zd�Zdefd�Zededdfd���Z�xZS)� SplitDictzSplit info object.N�rc�H��t��j|i|��||_dSrG)r�r�r)rr�args�kwargsr�s �r r�zSplitDict.__init__ s-��������$�)�&�)�)�)�(����r!�keyc����t|��|vr.t���t|����St|j|���|���}t |��S)Nr)rr�rwrrr�r-)rr�rr�s �r rwzSplitDict.__getitem__sl��� �s�8�8�t� � ��7�7�&�&�s�3�x�x�0�0� 0�2��&� �K�K�M�M�����L�  � �-�-� -r!�valuec���||jkrtd|�d|j�d����t���||��dS)Nz!Cannot add elem. 
(key mismatch: 'z' != 'z'))rrdr�� __setitem__)rr�r�r�s �r r�zSplitDict.__setitem__sT��� �%�*� � ��Z��Z�Z�E�J�Z�Z�Z�[�[� [� �����C��'�'�'�'�'r!r�c���|j|vrtd|j�d����|j|_t���|j|��dS)zAdd the split info.zSplit z already presentN)rrdrr�r�)rr�r�s �r r�z SplitDict.add sY��� �?�d� "� "��G�j�o�G�G�G�H�H� H�"&�"3� �� �����J�O�Z�8�8�8�8�8r!c�X�td�|���D����S)z$Return the total number of examples.c3�$K�|] }|jV�� dSrG)rrQs r rKz/SplitDict.total_num_examples.<locals>.<genexpr>*s$����9�9�a�1�>�9�9�9�9�9�9r!)rSr�r/s r �total_num_exampleszSplitDict.total_num_examples's)���9�9�4�;�;�=�=�9�9�9�9�9�9r!rrc�B�t|t��r!t|�����}|�|r|d�d��nd}||���}|D]8}t|t��r t di|��}|�|���9|S)zIReturns a new SplitDict initialized from a Dict or List of `split_infos`.Nrrr�r+)r;�dictr)r��getr r�)rurrr7r�s r �from_split_dictzSplitDict.from_split_dict,s��� �k�4� (� (� 5��{�1�1�3�3�4�4�K� � �AL�V�;�q�>�-�-�n�=�=�=�RV�L��S�l�3�3�3� �%� '� '�J��*�d�+�+� 5�&�4�4��4�4� � �N�N�:� &� &� &� &��r!c��g}|���D]5\}}tj|��}||_|�|���6|S)z0Returns a list of SplitInfo protos that we have.)�items�copy�deepcopyrrg)r�outr�r�s r � to_split_dictzSplitDict.to_split_dict>sS����&*�j�j�l�l� #� #� "�J� ���z�2�2�J�(�J�O� �J�J�z� "� "� "� "�� r!c�f�t�|���|j��SrG)r�r�r�rr/s r r�zSplitDict.copyGs'���(�(��);�);�)=�)=�t�?P�Q�Q�Qr!�returnc��d�|���D��}|D]}|�dd���|D]}|�dd���|S)Nc�,�g|]}t|����Sr+)r rQs r rZz+SplitDict._to_yaml_list.<locals>.<listcomp>Ks��7�7�7�Q�v�a�y�y�7�7�7r!rr)r��pop)rr��split_info_dicts r � _to_yaml_listzSplitDict._to_yaml_listJsv��7�7�$�"4�"4�"6�"6�7�7�7��"� 7� 7�O� � � ��� 6� 6� 6� 6�"� 6� 6�O� � � ��� 5� 5� 5� 5�� r!� yaml_datac�,�|�|��SrG)r�)rur�s r �_from_yaml_listzSplitDict._from_yaml_listTs���"�"�9�-�-�-r!rG)r"r#r$r1r�rr3rrwr r�r�r*r�� classmethodr)r�rr�r�r�r�r�r�r�s@r r�r�s����������+/�)�)�)�)�)�)�)� .�u�Y��^�4� .� .� .� .� .� .�(�u�Y��^�4�(�Y�(�(�(�(�(�(� 9�i�9�9�9�9�9�9��:�:��X�:����%��d� �*;��8�TW�=�����[��"���R�R�R��t������.��.��.�.�.��[�.�.�.�.�.r!r�c�|�eZdZUdZeed<eje���Z eed<ejd���Z e ed<d�Z d S) 
�SplitGeneratora�Defines the split information for the generator. This should be used as returned value of `GeneratorBasedBuilder._split_generators`. See `GeneratorBasedBuilder._split_generators` for more info and example of usage. Args: name (`str`): Name of the `Split` for which the generator will create the examples. **gen_kwargs (additional keyword arguments): Keyword arguments to forward to the `DatasetBuilder._generate_examples` method of the builder. Example: ```py >>> datasets.SplitGenerator( ... name=datasets.Split.TRAIN, ... gen_kwargs={"split_key": "train", "files": dl_manager.download_and_extract(url)}, ... ) ``` r)�default_factory� gen_kwargsF)�initr�c��t|j��|_t|j��t|j���|_dS)N)r)rrr<r r�r/s r � __post_init__zSplitGenerator.__post_init__xs9���� �N�N�� ��4�9����#���3�3�3����r!N) r"r#r$r1rr'r%r&r�r�r�r r�r+r!r r�r�Ysz���������2 �I�I�I�(�{�(��>�>�>�J��>�>�>�-�K�-�5�9�9�9�J� �9�9�9�4�4�4�4�4r!r�)%r1ro� collectionsr�r%r�r�typingrr� arrow_readerrr�namingr �utils.py_utilsr r r r-�ABCMetar3�typersryrjrCr^r<r�r�� namedtupler�r�r�r�r�r+r!r �<module>rs`�� �� � � � ����� � � � ����� � � � �!�!�!�!�!�!�"�"�"�"�"�"�"�"�B�B�B�B�B�B�B�B�������2�2�2�2�2�2�2�2� �.�.�.�.�.�.�.� ��.�4 �3�3�3�3�3�3�3� ��3�,c>�c>�c>�c>�c>�#�+�c>�c>�c>�c>�X�����t����  �  �  �  �  �-�  �  �  �  � �� @� @� @� @� @�9� @� @� @� E�E�E�E�E� �E�E�E�,J<�J<�J<�J<�J<��J<�J<�J<�Z >� >� >� >� >�J� >� >� >�-F�-F�-F�-F�-F�-F�-F�-F�b)�+�(��������4+�4+�4+�4+�4+�4+�4+�4+�nO.�O.�O.�O.�O.��O.�O.�O.�d �!4�!4�!4�!4�!4�!4�!4� ��!4�!4�!4r!