"""
Needed utilities for torchao FP8 training.
"""

from functools import partial
from typing import Callable, Optional

import torch

from .imports import is_torchao_available, torchao_required


if is_torchao_available():
    from torchao.float8.float8_linear import Float8LinearConfig


def find_first_last_linear_layers(model: torch.nn.Module):
    """
    Finds the first and last linear layer names in a model.

    This is needed during FP8 to avoid issues with instability by keeping the first and last layers unquantized.

    Ref: https://x.com/xariusrke/status/1826669142604141052
    """
    first_linear, last_linear = None, None
    for name, module in model.named_modules():
        if isinstance(module, torch.nn.Linear):
            # Record only the first match, but keep overwriting the last one.
            if first_linear is None:
                first_linear = name
            last_linear = name
    return first_linear, last_linear


def filter_linear_layers(module, fqn: str, layers_to_filter: list[str]) -> bool:
    """
    A function which will check if `module` is:
    - a `torch.nn.Linear` layer
    - has in_features and out_features divisible by 16
    - is not part of `layers_to_filter`

    Args:
        module (`torch.nn.Module`):
            The module to check.
        fqn (`str`):
            The fully qualified name of the layer.
        layers_to_filter (`List[str]`):
            The list of layers to filter.
    """
    if isinstance(module, torch.nn.Linear):
        if module.in_features % 16 != 0 or module.out_features % 16 != 0:
            return False
    if fqn in layers_to_filter:
        return False
    return True


def filter_first_and_last_linear_layers(module, fqn: str) -> bool:
    """
    A filter function which will filter out all linear layers except the first and last.

    <Tip>

        For stability reasons, we skip the first and last linear layers. Converting them can lead to the model not
        training or converging properly.

    </Tip>

    Args:
        module (`torch.nn.Module`):
            The module to check.
        fqn (`str`):
            The fully qualified name of the layer.
    """
    first_linear, last_linear = find_first_last_linear_layers(module)
    return filter_linear_layers(module, fqn, layers_to_filter=[first_linear, last_linear])


@torchao_required
def has_ao_layers(model: torch.nn.Module):
    from torchao.float8.float8_linear import Float8Linear

    for name, module in model.named_modules():
        if isinstance(module, Float8Linear):
            return True
    return False


@torchao_required
def convert_model_to_fp8_ao(
    model: torch.nn.Module,
    config: Optional["Float8LinearConfig"] = None,
    module_filter_func: Optional[Callable] = filter_first_and_last_linear_layers,
):
    """
    Converts all `nn.Linear` layers in the model (except the first and last) to torchao's `Float8Linear` layer inplace.

    Args:
        model (`torch.nn.Module`):
            The model to convert.
        config (`torchao.float8.Float8LinearConfig`, *optional*):
            The configuration for the FP8 training. Recommended to utilize
            `torchao.float8.recipe_name_to_linear_config` to generate this. In general, the default config should be
            sufficient (what is passed when set to `None`).
        module_filter_func (`Callable`, *optional*, defaults to `filter_first_and_last_linear_layers`):
            Optional function that must take in a module and layer name, and returns a boolean indicating whether the
            module should be converted to FP8. If `None` is passed, `filter_linear_layers` is used with the first and
            last linear layers of `model` excluded. See it for an example.

    Example:

    ```python
    from accelerate.utils.ao import convert_model_to_fp8_ao

    model = MyModel()
    model.to("cuda")
    convert_model_to_fp8_ao(model)

    model.train()
    ```
    """
    from torchao.float8 import convert_to_float8_training

    first_linear, last_linear = find_first_last_linear_layers(model)
    if module_filter_func is None:
        module_filter_func = partial(filter_linear_layers, layers_to_filter=[first_linear, last_linear])
    convert_to_float8_training(model, module_filter_fn=module_filter_func, config=config)