import torch

from .state import AcceleratorState, GradientState
from .utils import DistributedType, honor_type, is_lomo_available, is_torch_xla_available


if is_torch_xla_available():
    import torch_xla.core.xla_model as xm


def move_to_device(state, device):
    if isinstance(state, (list, tuple)):
        return honor_type(state, (move_to_device(t, device) for t in state))
    elif isinstance(state, dict):
        return type(state)({k: move_to_device(v, device) for k, v in state.items()})
    elif isinstance(state, torch.Tensor):
        return state.to(device)
    return state


class AcceleratedOptimizer(torch.optim.Optimizer):
    """
    Internal wrapper around a torch optimizer.

    Conditionally will perform `step` and `zero_grad` if gradients should be synchronized when performing gradient
    accumulation.

    Args:
        optimizer (`torch.optim.optimizer.Optimizer`):
            The optimizer to wrap.
        device_placement (`bool`, *optional*, defaults to `True`):
            Whether or not the optimizer should handle device placement. If so, it will place the state dictionary of
            `optimizer` on the right device.
        scaler (`torch.cuda.amp.grad_scaler.GradScaler`, *optional*):
            The scaler to use in the step function if training with mixed precision.
    """

    def __init__(self, optimizer, device_placement=True, scaler=None):
        self.optimizer = optimizer
        self.scaler = scaler
        self.accelerator_state = AcceleratorState()
        self.gradient_state = GradientState()
        self.device_placement = device_placement
        self._is_overflow = False

        if self.scaler is not None:
            self._accelerate_step_called = False
            self._optimizer_original_step_method = self.optimizer.step
            self._optimizer_patched_step_method = patch_optimizer_step(self, self.optimizer.step)

        # Handle device placement
        if device_placement:
            state_dict = self.optimizer.state_dict()
            if self.accelerator_state.distributed_type == DistributedType.XLA:
                xm.send_cpu_data_to_device(state_dict, self.accelerator_state.device)
            else:
                state_dict = move_to_device(state_dict, self.accelerator_state.device)
            self.optimizer.load_state_dict(state_dict)

    @property
    def state(self):
        return self.optimizer.state

    @state.setter
    def state(self, state):
        self.optimizer.state = state