�
%�g� � �4 � d Z ddlmZ ddlmZmZ ddlZddlmZm Z e� � rddl
mZ dej j
fd �Zd
edee defd
�Zd
edefd�Ze dej j
fd�� � Ze defdej j
ded dee fd�� � ZdS )z,
Needed utilities for torchao FP8 training.
� )�partial)�Callable�OptionalN� )�is_torchao_available�torchao_required)�Float8LinearConfig�modelc � � d\ }}| � � � D ]*\ }}t |t j j � � r|�|}|}�+||fS )z�
Finds the first and last linear layer names in a model.
This is needed during FP8 training to avoid instability by keeping the first and last layers unquantized.
Ref: https://x.com/xariusrke/status/1826669142604141052
)NN)�
named_modules�
isinstance�torch�nn�Linear)r
�first_linear�last_linear�name�modules �c/home/asafur/pinokio/api/open-webui.git/app/env/lib/python3.11/site-packages/accelerate/utils/ao.py�find_first_last_linear_layersr s_ � � !+��L�+��+�+�-�-� � ���f��f�e�h�o�.�.� ��#�#���K����$�$� �fqn�layers_to_filter�returnc � � t | t j j � � r| j dz dk s| j dz dk rdS ||v rdS dS )a�
A function which checks that `module`:
- is a `torch.nn.Linear` layer
- has `in_features` and `out_features` divisible by 16
- is not part of `layers_to_filter`
Args:
module (`torch.nn.Module`):
The module to check.
fqn (`str`):
The fully qualified name of the layer.
layers_to_filter (`List[str]`):
The list of layers to filter.
� r FT)r
r r r �in_features�out_features)r r r s r �filter_linear_layersr 0 sZ � � �&�%�(�/�*�*� ����"�a�'�'�6�+>��+C�q�+H�+H��5�
�����u��4r c �N � t | � � \ }}t | |||g�� � S )a�
A filter function which selects all eligible linear layers except the first and last.
<Tip>
For stability reasons, we skip the first and last linear layers. Otherwise, the model may fail to train or
converge properly.
</Tip>
Args:
module (`torch.nn.Module`):
The module to check.
fqn (`str`):
The fully qualified name of the layer.
�r )r r )r r r r s r �#filter_first_and_last_linear_layersr"