�
<��g7 � �X � d dl Z d dlmZ d dlmZmZmZ d dlZ G d� d� � ZdZ dZ
dS )� N)�cached_property)�List�Optional�Tuplec
�� � e Zd ZdZ ddej dedee dee fd�Z e
defd �� � Ze
defd
�� � Z
e
defd�� � Ze
defd�� � Ze
defd
�� � Ze
defd�� � Ze
defd�� � Zedefd�� � Zedee fd�� � Zdedee fd�Zdee defd�Zdee defd�Ze
dee fd�� � Zdee deee eee f fd�Zdee deee eee f fd�Zdee deee eee f fd�ZdS )� Tokenizerz-Simple wrapper around a tokenizers.Tokenizer.N� tokenizer�multilingual�task�languagec � � || _ |r�|t vr.t d|�dd� t � � �d�� � �|t vr.t d|�dd� t � � �d�� � �| j � d|z � � | _ | j � d|z � � | _ || _ d S d | _ d | _ d| _ d S )N�'z'' is not a valid task (accepted tasks: z, �)z9' is not a valid language code (accepted language codes: z<|%s|>�en) r �_TASKS�
ValueError�join�_LANGUAGE_CODES�token_to_idr r �
language_code)�selfr r
r r s �h/home/asafur/pinokio/api/open-webui.git/app/env/lib/python3.11/site-packages/faster_whisper/tokenizer.py�__init__zTokenizer.__init__ s� � � #���� &��6�!�!� �j��t�t�T�Y�Y�v�.�.�.�.�0�� � �
��.�.� �j��x�x����?�!;�!;�!;�!;�=�� � �
��2�2�8�d�?�C�C�D�I� �N�6�6�x�(�7J�K�K�D�M�!)�D�����D�I� �D�M�!%�D���� �returnc �6 � | j � d� � S )Nz<|transcribe|>�r r �r s r �
transcribezTokenizer.transcribe* s � ��~�)�)�*:�;�;�;r c �6 � | j � d� � S )Nz
<|translate|>r r s r � translatezTokenizer.translate. � � ��~�)�)�/�:�:�:r c �6 � | j � d� � S )Nz<|startoftranscript|>r r s r �sotz
Tokenizer.sot2 s � ��~�)�)�*A�B�B�Br c �6 � | j � d� � S )Nz
<|startoflm|>r r s r �sot_lmzTokenizer.sot_lm6 r"