� K�g�,����ddlmZmZmZddlmZddlmZm Z ddl m Z ddl m Z ddl mZddl mZmZdd l mZmZmZdd l mZmZdd l mZdd lmZdd lmZee��Ze dkreZne ZGd�de��ZdS)�)�absolute_import�division�unicode_literals)�unichr)�deque� OrderedDict)� version_info�)�spaceCharacters)�entities)� asciiLetters�asciiUpper2Lower)�digits� hexDigits�EOF)� tokenTypes� tagTokenTypes)�replacementCharacters)�HTMLInputStream)�Trie)��c����eZdZdZdM�fd� Zd�Zd�ZdNd�Zd�Zd �Z d �Z d �Z d �Z d �Z d�Zd�Zd�Zd�Zd�Zd�Zd�Zd�Zd�Zd�Zd�Zd�Zd�Zd�Zd�Zd�Zd�Zd�Zd �Z d!�Z!d"�Z"d#�Z#d$�Z$d%�Z%d&�Z&d'�Z'd(�Z(d)�Z)d*�Z*d+�Z+d,�Z,d-�Z-d.�Z.d/�Z/d0�Z0d1�Z1d2�Z2d3�Z3d4�Z4d5�Z5d6�Z6d7�Z7d8�Z8d9�Z9d:�Z:d;�Z;d<�Z<d=�Z=d>�Z>d?�Z?d@�Z@dA�ZAdB�ZBdC�ZCdD�ZDdE�ZEdF�ZFdG�ZGdH�ZHdI�ZIdJ�ZJdK�ZKdL�ZL�xZMS)O� HTMLTokenizera  This class takes care of tokenizing HTML. * self.currentToken Holds the token that is currently being processed. * self.state Holds a reference to the method to be invoked... XXX * self.stream Points to HTMLInputStream object. Nc ����t|fi|��|_||_d|_g|_|j|_d|_d|_tt|��� ��dS�NF) r�stream�parser� escapeFlag� lastFourChars� dataState�state�escape� currentToken�superr�__init__)�selfrr�kwargs� __class__s ��c/home/asafur/pinokio/api/open-webui.git/app/env/lib/python3.11/site-packages/html5lib/_tokenizer.pyr&zHTMLTokenizer.__init__(sn���%�f�7�7��7�7�� ��� � �������^�� ��� �!��� �m�T�"�"�+�+�-�-�-�-�-�c#�fK�tg��|_|���r�|jjr;t d|jj�d��d�V�|jj�;|jr"|j���V�|j�"|�����dSdS)z� This is where the magic happens. We do our usually processing through the states and when we have a token to return we yield the token which pauses processing until the next token is requested. � ParseErrorr��type�dataN)r� tokenQueuer"r�errorsr�pop�popleft�r's r*�__iter__zHTMLTokenizer.__iter__7s����� ��)�)����j�j�l�l� 0��+�$� \�)�,�7���AS�AW�AW�XY�AZ�AZ�[�[�[�[�[��+�$� \��/� 0��o�-�-�/�/�/�/�/��/� 0��j�j�l�l� 0� 0� 0� 0� 0r+c�l�t}d}|r t}d}g}|j���}||vrD|tur;|�|��|j���}||vr |tu�;t d�|��|��}|tvr:t|}|j �tddd|id����nd|cxkrd ksn|d kr.d }|j �tddd|id���n�d |cxkrd ksBnd|cxkrdks3nd|cxkrdks$nd|cxkrdksn|tgd���vr+|j �tddd|id��� t|��}n@#t$r3|dz }td|dz z��td|dzz��z}YnwxYw|dkrB|j �tddd���|j�|��|S)z�This function returns either U+FFFD or the character based on the decimal or hexadecimal representation. It also discards ";" if present. If not present self.tokenQueue.append({"type": tokenTypes["ParseError"]}) is invoked. � ��r-z$illegal-codepoint-for-numeric-entity� charAsInt�r/r0�datavarsi�i�������r �����i��i��)#� i��i��i��i��i��i��i��i��i��i��i��i��i��i��i��i��i��i��i�� i�� i�� i�� i�� i�� i�� i�� i�� i�� i��i��i��i��i��r>ii�i��;z numeric-entity-without-semicolonr.)rrr�charr�append�int�joinrr1r� frozenset�chr� ValueError�unget) r'�isHex�allowed�radix� charStack�cr;rG�vs r*�consumeNumberEntityz!HTMLTokenizer.consumeNumberEntityGs�� ���� � ��G��E�� � �K� � � � ���7�l�l�q��|�|� � � �Q� � � �� � � �"�"�A��7�l�l�q��|�|� ���� �*�*�E�2�2� � �-� -� -�(��3�D� �O� "� "�J�|�,D�$J�1<�i�0H�$J�$J� K� K� K� K���,�,�,�,�f�,�,�,�,��8�#�#��D� �O� "� "�J�|�,D�$J�1<�i�0H�$J�$J� K� K� K� K� �9�.�.�.�.��.�.�.�.��9�.�.�.�.��.�.�.�.��9�.�.�.�.��.�.�.�.��9�.�.�.�.��.�.�.�.��Y�(E�(E�(E�F�F�F�F���&�&� �<�0H�(N�5@�)�4L�(N�(N�O�O�O� K��9�~�~����� K� K� K���'���6�Q�"�W�-�.�.��V�q�5�y�5I�1J�1J�J���� K���� ��8�8� �O� "� "�J�|�,D�$F�$H�$H� I� I� I� �K� � �a� � � �� s�F,�,:G)�(G)Fc��d}|j���g}|dtvs |dtddfvs|�.||dkr"|j�|d���n�|ddk�r-d}|�|j�����|ddvr.d}|�|j�����|r|dt vs|sF|dtvr7|j�|d��|�|��}�n�|j �td d d ���|j�|� ����dd � |��z}�nJ|dturit�d � |����sn;|�|j�����|dtu�i t�d � |dd�����}t!|��}n#t"$rd}YnwxYw|�� |dd kr(|j �td dd ���|dd krq|ro||t$vs||tvs ||dkrE|j�|� ����dd � |��z}n�t&|}|j�|� ����|d � ||d���z }nl|j �td dd ���|j�|� ����dd � |��z}|r#|jdddxx|z cc<dS|tvrd}nd}|j �t||d ���dS)N�&r�<�#F�����)�x�XTr-zexpected-numeric-entityr.r:rFznamed-entity-without-semicolon�=zexpected-named-entityr0r �SpaceCharacters� Characters)rrGr rrNrHrrrUr1rr3rJ� entitiesTrie�has_keys_with_prefix�longest_prefix�len�KeyErrorr r r$) r'� allowedChar� fromAttribute�outputrR�hex� entityName� entityLength� tokenTypes r*� consumeEntityzHTMLTokenizer.consumeEntity�sX�����[�%�%�'�'�(� � �a�L�O� +� +�y��|��S�#��/N�/N��(�[�I�a�L�-H�-H� �K� � �i��l� +� +� +� +� �q�\�S� � ��C� � � �T�[�-�-�/�/� 0� 0� 0���}� �*�*���� � ���!1�!1�!3�!3�4�4�4�� 2� �"� ��2�2��3�$-�b�M�V�$;�$;�� �!�!�)�B�-�0�0�0��1�1�#�6�6�����&�&� �<�0H�0I�(K�(K�L�L�L�� �!�!�)�-�-�/�/�2�2�2��r�w�w�y�1�1�1����R�=��+�+�#�8�8�����9K�9K�L�L���� � ���!1�!1�!3�!3�4�4�4��R�=��+�+� "�)�8�8�����3�B�3��9P�9P�Q�Q� �"�:��� � ��� "� "� "�!� � � � "�����%��b�>�S�(�(��O�*�*�J�|�4L�,L�,N�,N�O�O�O��r�N�c�)�)�m�)��|�,� �<�<��|�,��6�6��|�,��3�3��K�%�%�i�m�m�o�o�6�6�6� �2�7�7�9�#5�#5�5�F�F�%�j�1�F��K�%�%�i�m�m�o�o�6�6�6��b�g�g�i� � � �&>�?�?�?�F�F���&�&� �<�0H�(?�(A�(A�B�B�B�� �!�!�)�-�-�/�/�2�2�2��r�w�w�y�1�1�1�� � T� � �f� %�b� )�!� ,� ,� ,�� 6� ,� ,� ,� ,� ,���(�(�-� � �(� � �O� "� "�J�y�,A�6�#R�#R� S� S� S� S� Ss�!AI&�& I5�4I5c�4�|�|d���dS)zIThis method replaces the need for "entityInAttributeValueState". T)rerfN)rl)r'res r*�processEntityInAttributez&HTMLTokenizer.processEntityInAttribute�s#�� ���{�$��G�G�G�G�Gr+c��|j}|dtv�r |d�t��|d<|dtdkrZ|d}t |��}t |��t |��kr|�|ddd���||d<|dtdkr`|dr(|j� tdd d ���|d r(|j� tdd d ���|j� |��|j |_ dS) z�This method is a generic handler for emitting the tags. It also sets the state to "data" because that's what's needed after a token has been emitted. r/�name�StartTagr0NrZ�EndTagr-zattributes-in-end-tagr.� selfClosingzself-closing-flag-on-end-tag) r$r� translaterr� attributeMaprc�updater1rHr!r")r'�token�rawr0s r*�emitCurrentTokenzHTMLTokenizer.emitCurrentToken�sU�� �!�� �&�M�]� *� *�!�&�M�3�3�4D�E�E�E�&�M��V�}� �:� 6�6�6��F�m��#�C�(�(���s�8�8�c�$�i�i�'�'��K�K��D�D�b�D� �*�*�*� $��f� ��V�}� �8� 4�4�4���=�N��O�*�*�J�|�4L�4K�,M�,M�N�N�N���'�U��O�*�*�J�|�4L�4R�,T�,T�U�U�U� ����u�%�%�%��^�� � � r+c�z�|j���}|dkr|j|_�n |dkr |j|_n�|dkrQ|j�tddd���|j�tddd���n�|turdS|tvrJ|j�td ||j� td ��zd���nE|j� d ��}|j�td||zd���d S) NrWrX�r-�invalid-codepointr.r_Fr^T�rWrXr{) rrG�entityDataStater"� tagOpenStater1rHrrr � charsUntil�r'r0�charss r*r!zHTMLTokenizer.dataStatesl���{���!�!�� �3�;�;��-�D�J�J� �S�[�[��*�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� �O� "� "�J�|�,D�,4�$6�$6� 7� 7� 7� 7� �S�[�[��5� �_� $� $� �O� "� "�J�7H�,I�$(�4�;�+A�+A�/�SW�+X�+X�$X�$Z�$Z� [� [� [� [� �K�*�*�+?�@�@�E� �O� "� "�J�|�,D�$(�5�L�$2�$2� 3� 3� 3��tr+c�F�|���|j|_dS�NT)rlr!r"r5s r*r~zHTMLTokenizer.entityDataStates"�� �������^�� ��tr+c�~�|j���}|dkr|j|_�n|dkr |j|_n�|t krdS|dkrQ|j�tddd���|j�tdd d���n�|tvrJ|j�td ||j� td ��zd���nE|j� d ��}|j�td||zd���d S) NrWrXFr{r-r|r.r_r?r^Tr}) rrG�characterReferenceInRcdatar"�rcdataLessThanSignStaterr1rHrr r�r�s r*� rcdataStatezHTMLTokenizer.rcdataState"sl���{���!�!�� �3�;�;��8�D�J�J� �S�[�[��5�D�J�J� �S�[�[��5� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� �O� "� "�J�|�,D�,4�$6�$6� 7� 7� 7� 7� �_� $� $� �O� "� "�J�7H�,I�$(�4�;�+A�+A�/�SW�+X�+X�$X�$Z�$Z� [� [� [� [� �K�*�*�+?�@�@�E� �O� "� "�J�|�,D�$(�5�L�$2�$2� 3� 3� 3��tr+c�F�|���|j|_dSr�)rlr�r"r5s r*r�z(HTMLTokenizer.characterReferenceInRcdata?s#�� �������%�� ��tr+c��|j���}|dkr |j|_n�|dkrQ|j�t ddd���|j�t ddd���nR|tkrdS|j�d ��}|j�t d||zd���d S� NrXr{r-r|r.r_r?F)rXr{T) rrG�rawtextLessThanSignStater"r1rHrrr�r�s r*� rawtextStatezHTMLTokenizer.rawtextStateDs����{���!�!�� �3�;�;��6�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� �O� "� "�J�|�,D�,4�$6�$6� 7� 7� 7� 7� �S�[�[��5��K�*�*�?�;�;�E� �O� "� "�J�|�,D�$(�5�L�$2�$2� 3� 3� 3��tr+c��|j���}|dkr |j|_n�|dkrQ|j�t ddd���|j�t ddd���nR|tkrdS|j�d ��}|j�t d||zd���d Sr�) rrG�scriptDataLessThanSignStater"r1rHrrr�r�s r*�scriptDataStatezHTMLTokenizer.scriptDataStateVs����{���!�!�� �3�;�;��9�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� �O� "� "�J�|�,D�,4�$6�$6� 7� 7� 7� 7� �S�[�[��5��K�*�*�?�;�;�E� �O� "� "�J�|�,D�$(�5�L�$2�$2� 3� 3� 3��tr+c��|j���}|tkrdS|dkrQ|j�t ddd���|j�t ddd���nC|j�t d||j�d��zd���dS) NFr{r-r|r.r_r?T)rrGrr1rHrr��r'r0s r*�plaintextStatezHTMLTokenizer.plaintextStatehs����{���!�!�� �3�;�;��5� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� �O� "� "�J�|�,D�,4�$6�$6� 7� 7� 7� 7� �O� "� "�J�|�,D�$(�4�;�+A�+A�(�+K�+K�$K�$M�$M� N� N� N��tr+c�B�|j���}|dkr|j|_�nq|dkr|j|_�n]|t vr&t d|gddd�|_|j|_�n.|dkr]|j � t ddd ���|j � t d d d ���|j |_n�|d krO|j � t dd d ���|j� |��|j |_nv|j � t ddd ���|j � t d dd ���|j� |��|j |_dS)N�!�/rqF)r/rpr0rs�selfClosingAcknowledged�>r-z'expected-tag-name-but-got-right-bracketr.r_z<>�?z'expected-tag-name-but-got-question-markzexpected-tag-namerXT)rrG�markupDeclarationOpenStater"�closeTagOpenStater rr$� tagNameStater1rHr!rN�bogusCommentStater�s r*rzHTMLTokenizer.tagOpenStatews����{���!�!�� �3�;�;��8�D�J�J� �S�[�[��/�D�J�J� �\� !� !�)3�J�)?�)-�r�05�<A�!C�!C�D� ��*�D�J�J� �S�[�[� �O� "� "�J�|�,D�$M�$O�$O� P� P� P� �O� "� "�J�|�,D�d�#S�#S� T� T� T���D�J�J� �S�[�[� �O� "� "�J�|�,D�$M�$O�$O� P� P� P� �K� � �d� #� #� #��/�D�J�J� �O� "� "�J�|�,D�$7�$9�$9� :� :� :� �O� "� "�J�|�,D�c�#R�#R� S� S� S� �K� � �d� #� #� #���D�J��tr+c�v�|j���}|tvr$td|gdd�|_|j|_n�|dkr5|j�tddd���|j |_n�|tur]|j�tddd���|j�td d d���|j |_nQ|j�tdd d |id ���|j� |��|j |_dS)NrrF�r/rpr0rsr�r-z*expected-closing-tag-but-got-right-bracketr.z expected-closing-tag-but-got-eofr_�</z!expected-closing-tag-but-got-charr0r<T) rrGr rr$r�r"r1rHr!rrNr�r�s r*r�zHTMLTokenizer.closeTagOpenState�s`���{���!�!�� �<� � �)3�H�)=�t�)+�E�!C�!C�D� ��*�D�J�J� �S�[�[� �O� "� "�J�|�,D�$P�$R�$R� S� S� S���D�J�J� �S�[�[� �O� "� "�J�|�,D�$F�$H�$H� I� I� I� �O� "� "�J�|�,D�d�#S�#S� T� T� T���D�J�J� �O� "� "�J�|�,D�$G�17���$@�$@� A� A� A� �K� � �d� #� #� #��/�D�J��tr+c���|j���}|tvr |j|_n�|dkr|���n�|t ur5|j�tddd���|j |_nl|dkr |j |_nY|dkr>|j�tddd���|j dxxd z cc<n|j dxx|z cc<d S) Nr�r-zeof-in-tag-namer.r�r{r|rpr?T) rrGr �beforeAttributeNameStater"ryrr1rHrr!�selfClosingStartTagStater$r�s r*r�zHTMLTokenizer.tagNameState�s'���{���!�!�� �?� "� "��6�D�J�J� �S�[�[� � !� !� #� #� #� #� �S�[�[� �O� "� "�J�|�,D�$5�$7�$7� 8� 8� 8���D�J�J� �S�[�[��6�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �f� %� %� %�� 1� %� %� %� %� � �f� %� %� %�� -� %� %� %��tr+c��|j���}|dkrd|_|j|_nN|j�tddd���|j�|��|j |_dS�Nr�r:r_rXr.T) rrG�temporaryBuffer�rcdataEndTagOpenStater"r1rHrrNr�r�s r*r�z%HTMLTokenizer.rcdataLessThanSignState�sz���{���!�!�� �3�;�;�#%�D� ��3�D�J�J� �O� "� "�J�|�,D�c�#R�#R� S� S� S� �K� � �d� #� #� #��)�D�J��tr+c� �|j���}|tvr|xj|z c_|j|_nN|j�tddd���|j� |��|j |_dS�Nr_r�r.T) rrGr r��rcdataEndTagNameStater"r1rHrrNr�r�s r*r�z#HTMLTokenizer.rcdataEndTagOpenState�s����{���!�!�� �<� � � � � �D� (� � ��3�D�J�J� �O� "� "�J�|�,D�d�#S�#S� T� T� T� �K� � �d� #� #� #��)�D�J��tr+c���|jo9|jd���|j���k}|j���}|t vr+|r)t d|jgdd�|_|j|_n�|dkr+|r)t d|jgdd�|_|j |_n�|dkr?|r=t d|jgdd�|_|� ��|j |_np|tvr|xj|z c_nV|j �t dd|jzd ���|j�|��|j|_d S� NrprrFr�r�r�r_r�r.T)r$�lowerr�rrGr rr�r"r�ryr!r r1rHrNr��r'� appropriater0s r*r�z#HTMLTokenizer.rcdataEndTagNameState�s����'�m�D�,=�f�,E�,K�,K�,M�,M�QU�Qe�Qk�Qk�Qm�Qm�,m� ��{���!�!�� �?� "� "�{� "�)3�H�)=�)-�)=�)+�E�!C�!C�D� ��6�D�J�J� �S�[�[�[�[�)3�H�)=�)-�)=�)+�E�!C�!C�D� ��6�D�J�J� �S�[�[�[�[�)3�H�)=�)-�)=�)+�E�!C�!C�D� � � !� !� #� #� #���D�J�J� �\� !� !� � � �D� (� � � � �O� "� "�J�|�,D�,0�4�3G�,G�$I�$I� J� J� J� �K� � �d� #� #� #��)�D�J��tr+c��|j���}|dkrd|_|j|_nN|j�tddd���|j�|��|j |_dSr�) rrGr��rawtextEndTagOpenStater"r1rHrrNr�r�s r*r�z&HTMLTokenizer.rawtextLessThanSignState�sz���{���!�!�� �3�;�;�#%�D� ��4�D�J�J� �O� "� "�J�|�,D�c�#R�#R� S� S� S� �K� � �d� #� #� #��*�D�J��tr+c� �|j���}|tvr|xj|z c_|j|_nN|j�tddd���|j� |��|j |_dSr�) rrGr r��rawtextEndTagNameStater"r1rHrrNr�r�s r*r�z$HTMLTokenizer.rawtextEndTagOpenStates����{���!�!�� �<� � � � � �D� (� � ��4�D�J�J� �O� "� "�J�|�,D�d�#S�#S� T� T� T� �K� � �d� #� #� #��*�D�J��tr+c���|jo9|jd���|j���k}|j���}|t vr+|r)t d|jgdd�|_|j|_n�|dkr+|r)t d|jgdd�|_|j |_n�|dkr?|r=t d|jgdd�|_|� ��|j |_np|tvr|xj|z c_nV|j �t dd|jzd ���|j�|��|j|_d Sr�)r$r�r�rrGr rr�r"r�ryr!r r1rHrNr�r�s r*r�z$HTMLTokenizer.rawtextEndTagNameStates����'�m�D�,=�f�,E�,K�,K�,M�,M�QU�Qe�Qk�Qk�Qm�Qm�,m� ��{���!�!�� �?� "� "�{� "�)3�H�)=�)-�)=�)+�E�!C�!C�D� ��6�D�J�J� �S�[�[�[�[�)3�H�)=�)-�)=�)+�E�!C�!C�D� ��6�D�J�J� �S�[�[�[�[�)3�H�)=�)-�)=�)+�E�!C�!C�D� � � !� !� #� #� #���D�J�J� �\� !� !� � � �D� (� � � � �O� "� "�J�|�,D�,0�4�3G�,G�$I�$I� J� J� J� �K� � �d� #� #� #��*�D�J��tr+c�~�|j���}|dkrd|_|j|_n�|dkr5|j�tddd���|j|_nN|j�tddd���|j� |��|j |_dS) Nr�r:r�r_z<!r.rXT) rrGr��scriptDataEndTagOpenStater"r1rHr�scriptDataEscapeStartStaterNr�r�s r*r�z)HTMLTokenizer.scriptDataLessThanSignState,s����{���!�!�� �3�;�;�#%�D� ��7�D�J�J� �S�[�[� �O� "� "�J�|�,D�d�#S�#S� T� T� T��8�D�J�J� �O� "� "�J�|�,D�c�#R�#R� S� S� S� �K� � �d� #� #� #��-�D�J��tr+c� �|j���}|tvr|xj|z c_|j|_nN|j�tddd���|j� |��|j |_dSr�) rrGr r��scriptDataEndTagNameStater"r1rHrrNr�r�s r*r�z'HTMLTokenizer.scriptDataEndTagOpenState:s����{���!�!�� �<� � � � � �D� (� � ��7�D�J�J� �O� "� "�J�|�,D�d�#S�#S� T� T� T� �K� � �d� #� #� #��-�D�J��tr+c���|jo9|jd���|j���k}|j���}|t vr+|r)t d|jgdd�|_|j|_n�|dkr+|r)t d|jgdd�|_|j |_n�|dkr?|r=t d|jgdd�|_|� ��|j |_np|tvr|xj|z c_nV|j �t dd|jzd ���|j�|��|j|_d Sr�)r$r�r�rrGr rr�r"r�ryr!r r1rHrNr�r�s r*r�z'HTMLTokenizer.scriptDataEndTagNameStateEs����'�m�D�,=�f�,E�,K�,K�,M�,M�QU�Qe�Qk�Qk�Qm�Qm�,m� ��{���!�!�� �?� "� "�{� "�)3�H�)=�)-�)=�)+�E�!C�!C�D� ��6�D�J�J� �S�[�[�[�[�)3�H�)=�)-�)=�)+�E�!C�!C�D� ��6�D�J�J� �S�[�[�[�[�)3�H�)=�)-�)=�)+�E�!C�!C�D� � � !� !� #� #� #���D�J�J� �\� !� !� � � �D� (� � � � �O� "� "�J�|�,D�,0�4�3G�,G�$I�$I� J� J� J� �K� � �d� #� #� #��-�D�J��tr+c���|j���}|dkr5|j�tddd���|j|_n&|j�|��|j|_dS�N�-r_r.T) rrGr1rHr�scriptDataEscapeStartDashStater"rNr�r�s r*r�z(HTMLTokenizer.scriptDataEscapeStartStatea�r���{���!�!�� �3�;�;� �O� "� "�J�|�,D�c�#R�#R� S� S� S��<�D�J�J� �K� � �d� #� #� #��-�D�J��tr+c���|j���}|dkr5|j�tddd���|j|_n&|j�|��|j|_dSr�) rrGr1rHr�scriptDataEscapedDashDashStater"rNr�r�s r*r�z,HTMLTokenizer.scriptDataEscapeStartDashStatekr�r+c�<�|j���}|dkr5|j�tddd���|j|_n�|dkr |j|_n�|dkrQ|j�tddd���|j�tddd���n]|tkr |j |_nE|j� d ��}|j�td||zd���d S) Nr�r_r.rXr{r-r|r?)rXr�r{T) rrGr1rHr�scriptDataEscapedDashStater"�"scriptDataEscapedLessThanSignStaterr!r�r�s r*�scriptDataEscapedStatez$HTMLTokenizer.scriptDataEscapedStateus4���{���!�!�� �3�;�;� �O� "� "�J�|�,D�c�#R�#R� S� S� S��8�D�J�J� �S�[�[��@�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� �O� "� "�J�|�,D�,4�$6�$6� 7� 7� 7� 7� �S�[�[���D�J�J��K�*�*�+?�@�@�E� �O� "� "�J�|�,D�$(�5�L�$2�$2� 3� 3� 3��tr+c�2�|j���}|dkr5|j�tddd���|j|_n�|dkr |j|_n�|dkr]|j�tddd���|j�tddd���|j|_nL|tkr |j |_n4|j�td|d���|j|_d S) Nr�r_r.rXr{r-r|r?T) rrGr1rHrr�r"r�r�rr!r�s r*r�z(HTMLTokenizer.scriptDataEscapedDashState�s ���{���!�!�� �3�;�;� �O� "� "�J�|�,D�c�#R�#R� S� S� S��<�D�J�J� �S�[�[��@�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� �O� "� "�J�|�,D�,4�$6�$6� 7� 7� 7��4�D�J�J� �S�[�[���D�J�J� �O� "� "�J�|�,D�d�#S�#S� T� T� T��4�D�J��tr+c��|j���}|dkr)|j�tddd���n�|dkr |j|_n�|dkr5|j�tddd���|j|_n�|dkr]|j�tddd���|j�tdd d���|j|_nL|tkr |j |_n4|j�td|d���|j|_d S) Nr�r_r.rXr�r{r-r|r?T) rrGr1rHrr�r"r�r�rr!r�s r*r�z,HTMLTokenizer.scriptDataEscapedDashDashState�sO���{���!�!�� �3�;�;� �O� "� "�J�|�,D�c�#R�#R� S� S� S� S� �S�[�[��@�D�J�J� �S�[�[� �O� "� "�J�|�,D�c�#R�#R� S� S� S��-�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� �O� "� "�J�|�,D�,4�$6�$6� 7� 7� 7��4�D�J�J� �S�[�[���D�J�J� �O� "� "�J�|�,D�d�#S�#S� T� T� T��4�D�J��tr+c��|j���}|dkrd|_|j|_n�|t vr?|j�tdd|zd���||_|j |_nN|j�tddd���|j� |��|j |_dSr�) rrGr�� scriptDataEscapedEndTagOpenStater"r r1rHr� scriptDataDoubleEscapeStartStaterNr�r�s r*r�z0HTMLTokenizer.scriptDataEscapedLessThanSignState�s����{���!�!�� �3�;�;�#%�D� ��>�D�J�J� �\� !� !� �O� "� "�J�|�,D�c�TX�j�#Y�#Y� Z� Z� Z�#'�D� ��>�D�J�J� �O� "� "�J�|�,D�c�#R�#R� S� S� S� �K� � �d� #� #� #��4�D�J��tr+c��|j���}|tvr||_|j|_nN|j�tddd���|j� |��|j |_dSr�) rrGr r�� scriptDataEscapedEndTagNameStater"r1rHrrNr�r�s r*r�z.HTMLTokenizer.scriptDataEscapedEndTagOpenState�s|���{���!�!�� �<� � �#'�D� ��>�D�J�J� �O� "� "�J�|�,D�d�#S�#S� T� T� T� �K� � �d� #� #� #��4�D�J��tr+c���|jo9|jd���|j���k}|j���}|t vr+|r)t d|jgdd�|_|j|_n�|dkr+|r)t d|jgdd�|_|j |_n�|dkr?|r=t d|jgdd�|_|� ��|j |_np|tvr|xj|z c_nV|j �t dd|jzd ���|j�|��|j|_d Sr�)r$r�r�rrGr rr�r"r�ryr!r r1rHrNr�r�s r*r�z.HTMLTokenizer.scriptDataEscapedEndTagNameState�s����'�m�D�,=�f�,E�,K�,K�,M�,M�QU�Qe�Qk�Qk�Qm�Qm�,m� ��{���!�!�� �?� "� "�{� "�)3�H�)=�)-�)=�)+�E�!C�!C�D� ��6�D�J�J� �S�[�[�[�[�)3�H�)=�)-�)=�)+�E�!C�!C�D� ��6�D�J�J� �S�[�[�[�[�)3�H�)=�)-�)=�)+�E�!C�!C�D� � � !� !� #� #� #���D�J�J� �\� !� !� � � �D� (� � � � �O� "� "�J�|�,D�,0�4�3G�,G�$I�$I� J� J� J� �K� � �d� #� #� #��4�D�J��tr+c���|j���}|ttd��zvr_|j�t d|d���|j���dkr |j |_ nu|j |_ nh|tvr9|j�t d|d���|xj|z c_n&|j� |��|j |_ dS�N)r�r�r_r.�scriptT)rrGr rKr1rHrr�r��scriptDataDoubleEscapedStater"r�r rNr�s r*r�z.HTMLTokenizer.scriptDataDoubleEscapeStartState�s����{���!�!�� �O�i� �&;�&;�;� <� <� �O� "� "�J�|�,D�d�#S�#S� T� T� T��#�)�)�+�+�x�7�7�!�>�� � �!�8�� � � �\� !� !� �O� "� "�J�|�,D�d�#S�#S� T� T� T� � � �D� (� � � � �K� � �d� #� #� #��4�D�J��tr+c��|j���}|dkr5|j�tddd���|j|_n�|dkr5|j�tddd���|j|_n�|dkrQ|j�tddd���|j�tddd���nh|tkr5|j�tdd d���|j |_n(|j�td|d���d S� Nr�r_r.rXr{r-r|r?�eof-in-script-in-scriptT) rrGr1rHr� scriptDataDoubleEscapedDashStater"�(scriptDataDoubleEscapedLessThanSignStaterr!r�s r*r�z*HTMLTokenizer.scriptDataDoubleEscapedState�sc���{���!�!�� �3�;�;� �O� "� "�J�|�,D�c�#R�#R� S� S� S��>�D�J�J� �S�[�[� �O� "� "�J�|�,D�c�#R�#R� S� S� S��F�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� �O� "� "�J�|�,D�,4�$6�$6� 7� 7� 7� 7� �S�[�[� �O� "� "�J�|�,D�$=�$?�$?� @� @� @���D�J�J� �O� "� "�J�|�,D�d�#S�#S� T� T� T��tr+c���|j���}|dkr6|j�tddd���|j|_�n|dkr5|j�tddd���|j|_n�|dkr]|j�tddd���|j�tddd���|j|_nt|tkr5|j�tdd d���|j |_n4|j�td|d���|j|_d Sr�) rrGr1rHr�$scriptDataDoubleEscapedDashDashStater"r�r�rr!r�s r*r�z.HTMLTokenizer.scriptDataDoubleEscapedDashStatest���{���!�!�� �3�;�;� �O� "� "�J�|�,D�c�#R�#R� S� S� S��B�D�J�J� �S�[�[� �O� "� "�J�|�,D�c�#R�#R� S� S� S��F�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� �O� "� "�J�|�,D�,4�$6�$6� 7� 7� 7��:�D�J�J� �S�[�[� �O� "� "�J�|�,D�$=�$?�$?� @� @� @���D�J�J� �O� "� "�J�|�,D�d�#S�#S� T� T� T��:�D�J��tr+c�4�|j���}|dkr*|j�tddd����nN|dkr6|j�tddd���|j|_�n|dkr5|j�tddd���|j|_n�|dkr]|j�tddd���|j�tdd d���|j|_nt|tkr5|j�tdd d���|j |_n4|j�td|d���|j|_d S) Nr�r_r.rXr�r{r-r|r?r�T) rrGr1rHrr�r"r�r�rr!r�s r*r�z2HTMLTokenizer.scriptDataDoubleEscapedDashDashState%s����{���!�!�� �3�;�;� �O� "� "�J�|�,D�c�#R�#R� S� S� S� S� �S�[�[� �O� "� "�J�|�,D�c�#R�#R� S� S� S��F�D�J�J� �S�[�[� �O� "� "�J�|�,D�c�#R�#R� S� S� S��-�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� �O� "� "�J�|�,D�,4�$6�$6� 7� 7� 7��:�D�J�J� �S�[�[� �O� "� "�J�|�,D�$=�$?�$?� @� @� @���D�J�J� �O� "� "�J�|�,D�d�#S�#S� T� T� T��:�D�J��tr+c��|j���}|dkr<|j�tddd���d|_|j|_n&|j�|��|j |_dS)Nr�r_r.r:T) rrGr1rHrr��scriptDataDoubleEscapeEndStater"rNr�r�s r*r�z6HTMLTokenizer.scriptDataDoubleEscapedLessThanSignState>sz���{���!�!�� �3�;�;� �O� "� "�J�|�,D�c�#R�#R� S� S� S�#%�D� ��<�D�J�J� �K� � �d� #� #� #��:�D�J��tr+c���|j���}|ttd��zvr_|j�t d|d���|j���dkr |j |_ nu|j |_ nh|tvr9|j�t d|d���|xj|z c_n&|j� |��|j |_ dSr�)rrGr rKr1rHrr�r�r�r"r�r rNr�s r*r�z,HTMLTokenizer.scriptDataDoubleEscapeEndStateIs����{���!�!�� �O�i� �&;�&;�;� <� <� �O� "� "�J�|�,D�d�#S�#S� T� T� T��#�)�)�+�+�x�7�7�!�8�� � �!�>�� � � �\� !� !� �O� "� "�J�|�,D�d�#S�#S� T� T� T� � � �D� (� � � � �K� � �d� #� #� #��:�D�J��tr+c��|j���}|tvr"|j�td���n�|tvr0|jd�|dg��|j|_�nT|dkr|� ���n8|dkr|j |_�n$|dvrW|j �tddd ���|jd�|dg��|j|_n�|d krW|j �tdd d ���|jd�d dg��|j|_nl|tur5|j �tdd d ���|j|_n.|jd�|dg��|j|_dS)NTr0r:r�r�)�'�"r]rXr-�#invalid-character-in-attribute-namer.r{r|r?z#expected-attribute-name-but-got-eof)rrGr r�r r$rH�attributeNameStater"ryr�r1rrr!r�s r*r�z&HTMLTokenizer.beforeAttributeNameStateYs����{���!�!�� �?� "� "� �K� "� "�?�D� 9� 9� 9� 9� �\� !� !� � �f� %� ,� ,�d�B�Z� 8� 8� 8��0�D�J�J� �S�[�[� � !� !� #� #� #� #� �S�[�[��6�D�J�J� �)� )� )� �O� "� "�J�|�,D�$I�$K�$K� L� L� L� � �f� %� ,� ,�d�B�Z� 8� 8� 8��0�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �f� %� ,� ,�h��^� <� <� <��0�D�J�J� �S�[�[� �O� "� "�J�|�,D�$I�$K�$K� L� L� L���D�J�J� � �f� %� ,� ,�d�B�Z� 8� 8� 8��0�D�J��tr+c���|j���}d}d}|dkr|j|_�n�|tvrF|jdddxx||j�td��zz cc<d}�n8|dkrd}�n.|tvr|j|_�n|dkr|j |_�n|d krL|j � td d d ���|jdddxxd z cc<d}n�|dvrL|j � td dd ���|jdddxx|z cc<d}na|tur5|j � td dd ���|j|_n#|jdddxx|z cc<d}|r�|jddd�t ��|jddd<|jddd�D]L\}}|jddd|kr*|j � td dd ���n�M|r|���dS)NTFr]r0rZrr�r�r{r-r|r.r?�r�r�rXr�zeof-in-attribute-namezduplicate-attribute)rrG�beforeAttributeValueStater"r r$r�r �afterAttributeNameStater�r1rHrrr!rtrry)r'r0�leavingThisState� emitTokenrp�_s r*r�z HTMLTokenizer.attributeNameStatews���{���!�!����� � �3�;�;��7�D�J�J� �\� !� !� � �f� %�b� )�!� ,� ,� ,��� �&�&�|�T�:�:�1;� ;� ,� ,� ,�$� � � �S�[�[��I�I� �_� $� $��5�D�J�J� �S�[�[��6�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �f� %�b� )�!� ,� ,� ,�� 8� ,� ,� ,�$� � � �_� $� $� �O� "� "�J�|�,D�$I�$K�$K� L� L� L� � �f� %�b� )�!� ,� ,� ,�� 4� ,� ,� ,�$� � � �S�[�[� �O� "� "�J�|�,D�,C�$E�$E� F� F� F���D�J�J� � �f� %�b� )�!� ,� ,� ,�� 4� ,� ,� ,�$� � � (� �!�&�)�"�-�a�0�:�:�;K�L�L� � �f� %�b� )�!� ,��,�V�4�S�b�S�9� � ���a��$�V�,�R�0��3�t�;�;��O�*�*�J�|�4L�,A�,C�,C�D�D�D��E�<� � (��%�%�'�'�'��tr+c���|j���}|tvr"|j�td���n�|dkr|j|_�n�|dkr|����nq|tvr0|jd� |dg��|j |_�n8|dkr|j |_�n$|dkrW|j � tdd d ���|jd� d dg��|j |_n�|d vrW|j � tdd d ���|jd� |dg��|j |_nl|tur5|j � tddd ���|j|_n.|jd� |dg��|j |_dS)NTr]r�r0r:r�r{r-r|r.r?r�z&invalid-character-after-attribute-namezexpected-end-of-tag-but-got-eof)rrGr r�r�r"ryr r$rHr�r�r1rrr!r�s r*r�z%HTMLTokenizer.afterAttributeNameState�s���{���!�!�� �?� "� "� �K� "� "�?�D� 9� 9� 9� 9� �S�[�[��7�D�J�J� �S�[�[� � !� !� #� #� #� #� �\� !� !� � �f� %� ,� ,�d�B�Z� 8� 8� 8��0�D�J�J� �S�[�[��6�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �f� %� ,� ,�h��^� <� <� <��0�D�J�J� �_� $� $� �O� "� "�J�|�,D�$L�$N�$N� O� O� O� � �f� %� ,� ,�d�B�Z� 8� 8� 8��0�D�J�J� �S�[�[� �O� "� "�J�|�,D�$E�$G�$G� H� H� H���D�J�J� � �f� %� ,� ,�d�B�Z� 8� 8� 8��0�D�J��tr+c��|j���}|tvr"|j�td���n�|dkr|j|_�n�|dkr(|j|_|j�|���ny|dkr|j|_�ne|dkr>|j � tddd���|� ���n!|d krV|j � tdd d���|j d d d xxdz cc<|j|_n�|dvrV|j � tddd���|j d d d xx|z cc<|j|_nk|tur5|j � tddd���|j|_n-|j d d d xx|z cc<|j|_dS)NTr�rWr�r�r-z.expected-attribute-value-but-got-right-bracketr.r{r|r0rZr r?)r]rX�`z"equals-in-unquoted-attribute-valuez$expected-attribute-value-but-got-eof)rrGr r��attributeValueDoubleQuotedStater"�attributeValueUnQuotedStaterN�attributeValueSingleQuotedStater1rHrryr$rr!r�s r*r�z'HTMLTokenizer.beforeAttributeValueState�sG���{���!�!�� �?� "� "� �K� "� "�?�D� 9� 9� 9� 9� �T�\�\��=�D�J�J� �S�[�[��9�D�J� �K� � �d� #� #� #� #� �S�[�[��=�D�J�J� �S�[�[� �O� "� "�J�|�,D�$T�$V�$V� W� W� W� � !� !� #� #� #� #� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �f� %�b� )�!� ,� ,� ,�� 8� ,� ,� ,��9�D�J�J� �_� $� $� �O� "� "�J�|�,D�$H�$J�$J� K� K� K� � �f� %�b� )�!� ,� ,� ,�� 4� ,� ,� ,��9�D�J�J� �S�[�[� �O� "� "�J�|�,D�$J�$L�$L� M� M� M���D�J�J� � �f� %�b� )�!� ,� ,� ,�� 4� ,� ,� ,��9�D�J��tr+c�*�|j���}|dkr |j|_n�|dkr|�d��n�|dkrJ|j�tddd���|jddd xxd z cc<nz|tur5|j�tdd d���|j |_n<|jddd xx||j� d ��zz cc<d S)Nr�rWr{r-r|r.r0rZr r?z#eof-in-attribute-value-double-quote)r�rWr{T� rrG�afterAttributeValueStater"rnr1rHrr$rr!r�r�s r*r�z-HTMLTokenizer.attributeValueDoubleQuotedState�sD���{���!�!�� �4�<�<��6�D�J�J� �S�[�[� � )� )�#� .� .� .� .� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �f� %�b� )�!� ,� ,� ,�� 8� ,� ,� ,� ,� �S�[�[� �O� "� "�J�|�,D�$I�$K�$K� L� L� L���D�J�J� � �f� %�b� )�!� ,� ,� ,��� �&�&�'<�=�=�1>� >� ,� ,� ,��tr+c�*�|j���}|dkr |j|_n�|dkr|�d��n�|dkrJ|j�tddd���|jddd xxd z cc<nz|tur5|j�tdd d���|j |_n<|jddd xx||j� d ��zz cc<d S)Nr�rWr{r-r|r.r0rZr r?z#eof-in-attribute-value-single-quote)r�rWr{Tr�r�s r*r�z-HTMLTokenizer.attributeValueSingleQuotedStatesD���{���!�!�� �3�;�;��6�D�J�J� �S�[�[� � )� )�#� .� .� .� .� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �f� %�b� )�!� ,� ,� ,�� 8� ,� ,� ,� ,� �S�[�[� �O� "� "�J�|�,D�$I�$K�$K� L� L� L���D�J�J� � �f� %�b� )�!� ,� ,� ,��� �&�&�';�<�<�1=� =� ,� ,� ,��tr+c �2�|j���}|tvr|j|_�nf|dkr|�d���nI|dkr|����n-|dvrJ|j�tddd���|j ddd xx|z cc<n�|d krJ|j�tdd d���|j ddd xxd z cc<n�|tur5|j�tdd d���|j |_nQ|j ddd xx||j� td��tz��zz cc<dS)NrWr�)r�r�r]rXr�r-z0unexpected-character-in-unquoted-attribute-valuer.r0rZr r{r|r?z eof-in-attribute-value-no-quotes)rWr�r�r�r]rXr�r{T)rrGr r�r"rnryr1rHrr$rr!r�rKr�s r*r�z)HTMLTokenizer.attributeValueUnQuotedStates����{���!�!�� �?� "� "��6�D�J�J� �S�[�[� � )� )�#� .� .� .� .� �S�[�[� � !� !� #� #� #� #� �.� .� .� �O� "� "�J�|�,D�$V�$X�$X� Y� Y� Y� � �f� %�b� )�!� ,� ,� ,�� 4� ,� ,� ,� ,� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �f� %�b� )�!� ,� ,� ,�� 8� ,� ,� ,� ,� �S�[�[� �O� "� "�J�|�,D�$F�$H�$H� I� I� I���D�J�J� � �f� %�b� )�!� ,� ,� ,��t�{�7M�7M��G�H�H�?�Z�8\�8\�1\� \� ,� ,� ,��tr+c� �|j���}|tvr |j|_n�|dkr|���n�|dkr |j|_n�|turO|j� tddd���|j� |��|j |_nN|j� tddd���|j� |��|j|_dS)Nr�r�r-z$unexpected-EOF-after-attribute-valuer.z*unexpected-character-after-attribute-valueT) rrGr r�r"ryr�rr1rHrrNr!r�s r*r�z&HTMLTokenizer.afterAttributeValueState.s���{���!�!�� �?� "� "��6�D�J�J� �S�[�[� � !� !� #� #� #� #� �S�[�[��6�D�J�J� �S�[�[� �O� "� "�J�|�,D�$J�$L�$L� M� M� M� �K� � �d� #� #� #���D�J�J� �O� "� "�J�|�,D�$P�$R�$R� S� S� S� �K� � �d� #� #� #��6�D�J��tr+c���|j���}|dkrd|jd<|���n�|turO|j�tddd���|j�|��|j |_ nN|j�tddd���|j�|��|j |_ dS)Nr�Trsr-z#unexpected-EOF-after-solidus-in-tagr.z)unexpected-character-after-solidus-in-tag) rrGr$ryrr1rHrrNr!r"r�r�s r*r�z&HTMLTokenizer.selfClosingStartTagStateBs����{���!�!�� �3�;�;�/3�D� �m� ,� � !� !� #� #� #� #� �S�[�[� �O� "� "�J�|�,D�$I�$K�$K� L� L� L� �K� � �d� #� #� #���D�J�J� �O� "� "�J�|�,D�$O�$Q�$Q� R� R� R� �K� � �d� #� #� #��6�D�J��tr+c��|j�d��}|�dd��}|j�t d|d���|j���|j|_dS)Nr�r{r?�Commentr.T) rr��replacer1rHrrGr!r"r�s r*r�zHTMLTokenizer.bogusCommentStateTsz���{�%�%�c�*�*���|�|�H�h�/�/�� ����� �*�D� 9� 9� ;� ;� ;� � �������^�� ��tr+c��|j���g}|ddkr]|�|j�����|ddkr#tddd�|_|j|_dS�n|ddvrjd}dD]<}|�|j�����|d|vrd }n�=|r&td ddddd �|_|j|_dSn�|dd kr�|j��|jj j r�|jj j dj |jj j krSd}d D]>}|�|j�����|d|krd }n�?|r|j |_dS|j�tddd���|r.|j�|�����|�.|j|_dS)NrZr�r�r:r.T)�d�D))�o�O�rS�C��t�T��y�Y��p�P��e�EF�Doctype)r/rp�publicId�systemId�correct�[)r�r��Arrrr-zexpected-dashes-or-doctype)rrGrHrr$�commentStartStater"� doctypeStater�tree� openElements� namespace�defaultNamespace�cdataSectionStater1rNr3r�)r'rR�matched�expecteds r*r�z(HTMLTokenizer.markupDeclarationOpenStatecsS���[�%�%�'�'�(� � �R�=�C� � � � � �T�[�-�-�/�/� 0� 0� 0���}��#�#�-7� �-B�B�$O�$O��!�!�3�� ��t�$��r�]�j� (� (��G�A� � ��� � ���!1�!1�!3�!3�4�4�4��R�=��0�0�#�G��E�1�� �-7� �-B�-/�15�4�04�%6�%6��!�"�.�� ��t�  ���m�s�"�"��k�%��k��+�&��k��+�B�/�9�T�[�=M�=^�^�^��G�:� � ��� � ���!1�!1�!3�!3�4�4�4��R�=�H�,�,�#�G��E�-�� �!�3�� ��t� ���� �<�(@� <� >� >� ?� ?� ?�� /� �K� � �i�m�m�o�o� .� .� .�� /��+�� ��tr+c��|j���}|dkr|j|_�n|dkr>|j�t ddd���|jdxxdz cc<n�|dkrT|j�t dd d���|j�|j��|j|_n~|turT|j�t dd d���|j�|j��|j|_n!|jdxx|z cc<|j |_d S) Nr�r{r-r|r.r0r?r��incorrect-comment�eof-in-commentT) rrG�commentStartDashStater"r1rHrr$r!r� commentStater�s r*rzHTMLTokenizer.commentStartState�sn���{���!�!�� �3�;�;��3�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �f� %� %� %�� 1� %� %� %� %� �S�[�[� �O� "� "�J�|�,D�$7�$9�$9� :� :� :� �O� "� "�4�#4� 5� 5� 5���D�J�J� �S�[�[� �O� "� "�J�|�,D�$4�$6�$6� 7� 7� 7� �O� "� "�4�#4� 5� 5� 5���D�J�J� � �f� %� %� %�� -� %� %� %��*�D�J��tr+c��|j���}|dkr|j|_�n|dkr>|j�t ddd���|jdxxdz cc<n�|dkrT|j�t dd d���|j�|j��|j|_n�|turT|j�t dd d���|j�|j��|j|_n$|jdxxd|zz cc<|j |_d S) Nr�r{r-r|r.r0�-�r�rrT) rrG�commentEndStater"r1rHrr$r!rrr�s r*rz#HTMLTokenizer.commentStartDashState�sr���{���!�!�� �3�;�;��-�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �f� %� %� %�� 2� %� %� %� %� �S�[�[� �O� "� "�J�|�,D�$7�$9�$9� :� :� :� �O� "� "�4�#4� 5� 5� 5���D�J�J� �S�[�[� �O� "� "�J�|�,D�$4�$6�$6� 7� 7� 7� �O� "� "�4�#4� 5� 5� 5���D�J�J� � �f� %� %� %��t�� 3� %� %� %��*�D�J��tr+c��|j���}|dkr |j|_n�|dkr>|j�t ddd���|jdxxdz cc<n�|turT|j�t ddd���|j�|j��|j |_n0|jdxx||j� d ��zz cc<d S) Nr�r{r-r|r.r0r?r)r�r{T) rrG�commentEndDashStater"r1rHrr$rr!r�r�s r*rzHTMLTokenizer.commentState�s#���{���!�!�� �3�;�;��1�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �f� %� %� %�� 1� %� %� %� %� �S�[�[� �O� "� "�J�|�,D�,<�$>�$>� ?� ?� ?� �O� "� "�4�#4� 5� 5� 5���D�J�J� � �f� %� %� %��� �&�&��7�7�*8� 8� %� %� %��tr+c��|j���}|dkr |j|_n�|dkrJ|j�t ddd���|jdxxdz cc<|j|_n�|turT|j�t ddd���|j�|j��|j |_n$|jdxxd|zz cc<|j|_d S) Nr�r{r-r|r.r0r!zeof-in-comment-end-dashT) rrGr"r"r1rHrr$rrr!r�s r*r$z!HTMLTokenizer.commentEndDashState�s#���{���!�!�� �3�;�;��-�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �f� %� %� %�� 2� %� %� %��*�D�J�J� �S�[�[� �O� "� "�J�|�,D�$=�$?�$?� @� @� @� �O� "� "�4�#4� 5� 5� 5���D�J�J� � �f� %� %� %��t�� 3� %� %� %��*�D�J��tr+c��|j���}|dkr-|j�|j��|j|_�ny|dkrK|j�tddd���|jdxxdz cc<|j|_�n(|dkr5|j�tdd d���|j |_n�|d kr>|j�tdd d���|jdxx|z cc<n�|turT|j�tdd d���|j�|j��|j|_nL|j�tdd d���|jdxxd|zz cc<|j|_dS)Nr�r{r-r|r.r0u--�r�z,unexpected-bang-after-double-dash-in-commentr�z,unexpected-dash-after-double-dash-in-commentzeof-in-comment-double-dashzunexpected-char-in-commentz--T) rrGr1rHr$r!r"rr�commentEndBangStaterr�s r*r"zHTMLTokenizer.commentEndState�s���{���!�!�� �3�;�;� �O� "� "�4�#4� 5� 5� 5���D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �f� %� %� %�� 3� %� %� %��*�D�J�J� �S�[�[� �O� "� "�J�|�,D�$R�$T�$T� U� U� U��1�D�J�J� �S�[�[� �O� "� "�J�|�,D�$R�$T�$T� U� U� U� � �f� %� %� %�� -� %� %� %� %� �S�[�[� �O� "� "�J�|�,D�$@�$B�$B� C� C� C� �O� "� "�4�#4� 5� 5� 5���D�J�J� �O� "� "�J�|�,D�$@�$B�$B� C� C� C� � �f� %� %� %���� 4� %� %� %��*�D�J��tr+c��|j���}|dkr,|j�|j��|j|_n�|dkr"|jdxxdz cc<|j|_n�|dkrJ|j�tddd���|jdxxd z cc<|j |_n�|turT|j�tdd d���|j�|j��|j|_n$|jdxxd|zz cc<|j |_d S) Nr�r�r0z--!r{r-r|r.u--!�zeof-in-comment-end-bang-stateT) rrGr1rHr$r!r"r$rrrr�s r*r'z!HTMLTokenizer.commentEndBangStatesq���{���!�!�� �3�;�;� �O� "� "�4�#4� 5� 5� 5���D�J�J� �S�[�[� � �f� %� %� %�� .� %� %� %��1�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �f� %� %� %�� 4� %� %� %��*�D�J�J� �S�[�[� �O� "� "�J�|�,D�$C�$E�$E� F� F� F� �O� "� "�4�#4� 5� 5� 5���D�J�J� � �f� %� %� %���� 5� %� %� %��*�D�J��tr+c���|j���}|tvr |j|_n�|t ur^|j�tddd���d|j d<|j�|j ��|j |_nN|j�tddd���|j� |��|j|_dS)Nr-�!expected-doctype-name-but-got-eofr.Frzneed-space-after-doctypeT) rrGr �beforeDoctypeNameStater"rr1rHrr$r!rNr�s r*rzHTMLTokenizer.doctypeStates����{���!�!�� �?� "� "��4�D�J�J� �S�[�[� �O� "� "�J�|�,D�$G�$I�$I� J� J� J�+0�D� �i� (� �O� "� "�4�#4� 5� 5� 5���D�J�J� �O� "� "�J�|�,D�$>�$@�$@� A� A� A� �K� � �d� #� #� #��4�D�J��tr+c��|j���}|tvr�n&|dkr^|j�t ddd���d|jd<|j�|j��|j|_n�|dkr?|j�t ddd���d |jd <|j |_n}|tur^|j�t dd d���d|jd<|j�|j��|j|_n||jd <|j |_d S) Nr�r-z+expected-doctype-name-but-got-right-bracketr.Frr{r|r?rpr*T) rrGr r1rHrr$r!r"�doctypeNameStaterr�s r*r+z$HTMLTokenizer.beforeDoctypeNameState*sp���{���!�!�� �?� "� "� � �S�[�[� �O� "� "�J�|�,D�$Q�$S�$S� T� T� T�+0�D� �i� (� �O� "� "�4�#4� 5� 5� 5���D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B�(0�D� �f� %��.�D�J�J� �S�[�[� �O� "� "�J�|�,D�$G�$I�$I� J� J� J�+0�D� �i� (� �O� "� "�4�#4� 5� 5� 5���D�J�J�(,�D� �f� %��.�D�J��tr+c�p�|j���}|tvr;|jd�t ��|jd<|j|_�nX|dkrY|jd�t ��|jd<|j� |j��|j |_n�|dkrJ|j� tddd���|jdxxdz cc<|j |_n�|tur�|j� tddd���d |jd <|jd�t ��|jd<|j� |j��|j |_n|jdxx|z cc<d S) Nrpr�r{r-r|r.r?zeof-in-doctype-nameFrT)rrGr r$rtr�afterDoctypeNameStater"r1rHr!rr-rr�s r*r-zHTMLTokenizer.doctypeNameStateDs����{���!�!�� �?� "� "�(,�(9�&�(A�(K�(K�L\�(]�(]�D� �f� %��3�D�J�J� �S�[�[�(,�(9�&�(A�(K�(K�L\�(]�(]�D� �f� %� �O� "� "�4�#4� 5� 5� 5���D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �f� %� %� %�� 1� %� %� %��.�D�J�J� �S�[�[� �O� "� "�J�|�,D�$9�$;�$;� <� <� <�+0�D� �i� (�(,�(9�&�(A�(K�(K�L\�(]�(]�D� �f� %� �O� "� "�4�#4� 5� 5� 5���D�J�J� � �f� %� %� %�� -� %� %� %��tr+c�^�|j���}|tvr�n�|dkr-|j�|j��|j|_�nU|turxd|jd<|j� |��|j�tddd���|j�|j��|j|_n�|dvr9d}d D]#}|j���}||vrd}n�$|r|j |_dSn<|d vr8d}d D]#}|j���}||vrd}n�$|r|j |_dS|j� |��|j�tdd d |id���d|jd<|j |_dS)Nr�Frr-�eof-in-doctyper.rT))�u�U)�b�B)�l�L)�i�Ir���s�S)rr:rr )�m�Mz*expected-space-or-right-bracket-in-doctyper0r<)rrGr r1rHr$r!r"rrNr�afterDoctypePublicKeywordState�afterDoctypeSystemKeywordState�bogusDoctypeState)r'r0rrs r*r/z#HTMLTokenizer.afterDoctypeNameState]s���{���!�!�� �?� "� "� � �S�[�[� �O� "� "�4�#4� 5� 5� 5���D�J�J� �S�[�[�+0�D� �i� (� �K� � �d� #� #� #� �O� "� "�J�|�,D�$4�$6�$6� 7� 7� 7� �O� "� "�4�#4� 5� 5� 5���D�J�J��z�!�!���!9���H��;�+�+�-�-�D��8�+�+�"'����,�� �!%�!D�D�J��4� ���#�#���!9���H��;�+�+�-�-�D��8�+�+�"'����,�� �!%�!D�D�J��4� �K� � �d� #� #� #� �O� "� "�J�|�,D�$P�%+�T�N�$4�$4� 5� 5� 5�,1�D� �i� (��/�D�J��tr+c�$�|j���}|tvr |j|_n�|dvrO|j�tddd���|j�|��|j|_n�|tur^|j�tddd���d|j d<|j�|j ��|j |_n&|j�|��|j|_dS� N)r�r�r-�unexpected-char-in-doctyper.r1FrT) rrGr �"beforeDoctypePublicIdentifierStater"r1rHrrNrr$r!r�s r*r?z,HTMLTokenizer.afterDoctypePublicKeywordState�����{���!�!�� �?� "� "��@�D�J�J� �Z� � � �O� "� "�J�|�,D�$@�$B�$B� C� C� C� �K� � �d� #� #� #��@�D�J�J� �S�[�[� �O� "� "�J�|�,D�$4�$6�$6� 7� 7� 7�+0�D� �i� (� �O� "� "�4�#4� 5� 5� 5���D�J�J� �K� � �d� #� #� #��@�D�J��tr+c���|j���}|tvr�nE|dkrd|jd<|j|_�n'|dkrd|jd<|j|_�n |dkr^|j�tddd���d |jd <|j�|j��|j |_n�|tur^|j�tdd d���d |jd <|j�|j��|j |_n>|j�tdd d���d |jd <|j |_d S)Nr�r:r r�r�r-�unexpected-end-of-doctyper.Frr1rDT) rrGr r$�(doctypePublicIdentifierDoubleQuotedStater"�(doctypePublicIdentifierSingleQuotedStater1rHrr!rrAr�s r*rEz0HTMLTokenizer.beforeDoctypePublicIdentifierState�s����{���!�!�� �?� "� "� � �T�\�\�,.�D� �j� )��F�D�J�J� �S�[�[�,.�D� �j� )��F�D�J�J� �S�[�[� �O� "� "�J�|�,D�$?�$A�$A� B� B� B�+0�D� �i� (� �O� "� "�4�#4� 5� 5� 5���D�J�J� �S�[�[� �O� "� "�J�|�,D�$4�$6�$6� 7� 7� 7�+0�D� �i� (� �O� "� "�4�#4� 5� 5� 5���D�J�J� �O� "� "�J�|�,D�$@�$B�$B� C� C� C�+0�D� �i� (��/�D�J��tr+c��|j���}|dkr|j|_�n$|dkr>|j�t ddd���|jdxxdz cc<n�|dkr^|j�t dd d���d |jd <|j�|j��|j|_n||tur^|j�t dd d���d |jd <|j�|j��|j|_n|jdxx|z cc<d S)Nr�r{r-r|r.r r?r�rHFrr1T� rrG�!afterDoctypePublicIdentifierStater"r1rHrr$r!rr�s r*rIz6HTMLTokenizer.doctypePublicIdentifierDoubleQuotedState�����{���!�!�� �4�<�<��?�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �j� )� )� )�X� 5� )� )� )� )� �S�[�[� �O� "� "�J�|�,D�$?�$A�$A� B� B� B�+0�D� �i� (� �O� "� "�4�#4� 5� 5� 5���D�J�J� �S�[�[� �O� "� "�J�|�,D�$4�$6�$6� 7� 7� 7�+0�D� �i� (� �O� "� "�4�#4� 5� 5� 5���D�J�J� � �j� )� )� )�T� 1� )� )� )��tr+c��|j���}|dkr|j|_�n$|dkr>|j�t ddd���|jdxxdz cc<n�|dkr^|j�t dd d���d |jd <|j�|j��|j|_n||tur^|j�t dd d���d |jd <|j�|j��|j|_n|jdxx|z cc<d S)Nr�r{r-r|r.r r?r�rHFrr1TrLr�s r*rJz6HTMLTokenizer.doctypePublicIdentifierSingleQuotedState�����{���!�!�� �3�;�;��?�D�J�J� �X� � � �O� "� "�J�|�,D�,?�$A�$A� B� B� B� � �j� )� )� )�X� 5� )� )� )� )� �S�[�[� �O� "� "�J�|�,D�$?�$A�$A� B� B� B�+0�D� �i� (� �O� "� "�4�#4� 5� 5� 5���D�J�J� �S�[�[� �O� "� "�J�|�,D�$4�$6�$6� 7� 7� 7�+0�D� �i� (� �O� "� "�4�#4� 5� 5� 5���D�J�J� � �j� )� )� )�T� 1� )� )� )��tr+c�*�|j���}|tvr|j|_�nb|dkr-|j�|j��|j|_�n/|dkr?|j�tddd���d|jd<|j |_n�|dkr?|j�tddd���d|jd<|j |_n�|tur^|j�tdd d���d |jd <|j�|j��|j|_n>|j�tddd���d |jd <|j |_d S) Nr�r�r-rDr.r:rr�r1FrT)rrGr �-betweenDoctypePublicAndSystemIdentifiersStater"r1rHr$r!r�(doctypeSystemIdentifierDoubleQuotedState�(doctypeSystemIdentifierSingleQuotedStaterrAr�s r*rMz/HTMLTokenizer.afterDoctypePublicIdentifierState�s����{���!�!�� �?� "� "��K�D�J�J� �S�[�[� �O� "� "�4�#4� 5� 5� 5���D�J�J� �S�[�[� �O� "� "�J�|�,D�$@�$B�$B� C� C� C�,.�D� �j� )��F�D�J�J� �S�[�[� �O� "� "�J�|�,D�$@�$B�$B� C� C� C�,.�D� �j� )��F�D�J�J� �S�[�[� �O� "� "�J�|�,D�$4�$6�$6� 7� 7� 7�+0�D� �i� (� �O� "� "�4�#4� 5� 5� 5���D�J�J� �O� "� "�J�|�,D�$@�$B�$B� C� C� C�+0�D� �i� (��/�D�J��tr+c�t�|j���}|tvr�n|dkr,|j�|j��|j|_n�|dkrd|jd<|j|_n�|dkrd|jd<|j |_n�|tkr^|j�tddd���d |jd <|j�|j��|j|_n>|j�tdd d���d |jd <|j |_d S) Nr�r�r:rr�r-r1r.FrrDT) rrGr r1rHr$r!r"rSrTrrrAr�s r*rRz;HTMLTokenizer.betweenDoctypePublicAndSystemIdentifiersStatesK���{���!�!�� �?� "� "� � �S�[�[� �O� "� "�4�#4� 5� 5� 5���D�J�J� �S�[�[�,.�D� �j� )��F�D�J�J� �S�[�[�,.�D� �j� )��F�D�J�J� �S�[�[� �O� "� "�J�|�,D�$4�$6�$6� 7� 7� 7�+0�D� �i� (� �O� "� "�4�#4� 5� 5� 5���D�J�J� �O� "� "�J�|�,D�$@�$B�$B� C� C� C�+0�D� �i� (��/�D�J��tr+c�$�|j���}|tvr |j|_n�|dvrO|j�tddd���|j�|��|j|_n�|tur^|j�tddd���d|j d<|j�|j ��|j |_n&|j�|��|j|_dSrC) rrGr �"beforeDoctypeSystemIdentifierStater"r1rHrrNrr$r!r�s r*r@z,HTMLTokenizer.afterDoctypeSystemKeywordState)rFr+c���|j���}|tvr�nE|dkrd|jd<|j|_�n'|dkrd|jd<|j|_�n |dkr^|j�tddd���d |jd <|j�|j��|j |_n�|tur^|j�tdd d���d |jd <|j�|j��|j |_n>|j�tddd���d |jd <|j |_d S) Nr�r:rr�r�r-rDr.Frr1T) rrGr r$rSr"rTr1rHrr!rrAr�s r*rWz0HTMLTokenizer.beforeDoctypeSystemIdentifierState=s����{���!�!�� �?� "� "� � �T�\�\�,.�D� �j� )��F�D�J�J� �S�[�[�,.�D� �j� )��F�D�J�J� �S�[�[� �O� "� "�J�|�,D�$@�$B�$B� C� C� C�+0�D� �i� (� �O� "� "�4�#4� 5� 5� 5���D�J�J� �S�[�[� �O� "� "�J�|�,D�$4�$6�$6� 7� 7� 7�+0�D� �i� (� �O� "� "�4�#4� 5� 5� 5���D�J�J� �O� "� "�J�|�,D�$@�$B�$B� C� C� C�+0�D� �i� (��/�D�J��tr+c��|j���}|dkr|j|_�n$|dkr>|j�t ddd���|jdxxdz cc<n�|dkr^|j�t dd d���d |jd <|j�|j��|j|_n||tur^|j�t dd d���d |jd <|j�|j��|j|_n|jdxx|z cc<d S)Nr�r{r-r|r.rr?r�rHFrr1T� rrG�!afterDoctypeSystemIdentifierStater"r1rHrr$r!rr�s r*rSz6HTMLTokenizer.doctypeSystemIdentifierDoubleQuotedStateZrNr+c��|j���}|dkr|j|_�n$|dkr>|j�t ddd���|jdxxdz cc<n�|dkr^|j�t dd d���d |jd <|j�|j��|j|_n||tur^|j�t dd d���d |jd <|j�|j��|j|_n|jdxx|z cc<d S)Nr�r{r-r|r.rr?r�rHFrr1TrZr�s r*rTz6HTMLTokenizer.doctypeSystemIdentifierSingleQuotedStaterrPr+c���|j���}|tvrn�|dkr,|j�|j��|j|_n�|tur^|j�tddd���d|jd<|j�|j��|j|_n4|j�tddd���|j |_dS) Nr�r-r1r.FrrDT) rrGr r1rHr$r!r"rrrAr�s r*r[z/HTMLTokenizer.afterDoctypeSystemIdentifierState�s����{���!�!�� �?� "� "� � �S�[�[� �O� "� "�4�#4� 5� 5� 5���D�J�J� �S�[�[� �O� "� "�J�|�,D�$4�$6�$6� 7� 7� 7�+0�D� �i� (� �O� "� "�4�#4� 5� 5� 5���D�J�J� �O� "� "�J�|�,D�$@�$B�$B� C� C� C��/�D�J��tr+c�<�|j���}|dkr,|j�|j��|j|_nP|turF|j�|��|j�|j��|j|_n dS)Nr�T) rrGr1rHr$r!r"rrNr�s r*rAzHTMLTokenizer.bogusDoctypeState�s����{���!�!�� �3�;�;� �O� "� "�4�#4� 5� 5� 5���D�J�J� �S�[�[� �K� � �d� #� #� #� �O� "� "�4�#4� 5� 5� 5���D�J�J� ��tr+c��g} |�|j�d����|�|j�d����|j���}|tkrnF|dksJ�|ddd�dkr|ddd�|d<n|�|����d�|��}|�d��}|d krPt|��D]*}|j�td d d ����+|� dd ��}|r(|j�td|d ���|j |_ dS)NT�]r�rZ�����z]]r:r{rr-r|r.r?r_) rHrr�rGrrJ�count�ranger1rr�r!r")r'r0rG� nullCountr�s r*rzHTMLTokenizer.cdataSectionState�s����� &� �K�K�� �.�.�s�3�3� 4� 4� 4� �K�K�� �.�.�s�3�3� 4� 4� 4��;�#�#�%�%�D��s�{�{���s�{�{�{�{���8�B�C�C�=�D�(�(�#�B�x����}�D��H���K�K��%�%�%� &��w�w�t�}�}���J�J�x�(�(� � �q�=�=��9�%�%� F� F����&�&� �<�0H�0C�(E�(E�F�F�F�F��<�<��(�3�3�D� � 3� �O� "� "�J�|�,D�,0�$2�$2� 3� 3� 3��^�� ��tr+)Nr)N�__name__� __module__� __qualname__�__doc__r&r6rUrlrnryr!r~r�r�r�r�r�rr�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�rrrr$r"r'rr+r-r/r?rErIrJrMrRr@rWrSrTr[rAr� __classcell__)r)s@r*rrs�������� � � .� .� .� .� .� .�0�0�0� F�F�F�PNT�NT�NT�NT�`H�H�H� $�$�$�8���:��� ���:��� ���$���$ � � �!�!�!�F���0���, � � � � � ����8 � � � � � ����8 � � � � � ����8���������(���(���, � � � � � ����8��� ���*���.���2 � � ���� ���<4�4�4�l���@ � � �D���&���&���2���(���$ � � �+�+�+�Z���.���.���$���&���>���.���"���4���21�1�1�f���(���:���0���0���<���4���(���:���0���0���& � � �������r+rN) � __future__rrr�sixrrL� collectionsrr�sysr � constantsr r r rrrrrrr� _inputstreamr�_trierr`�dictru�objectr�r+r*�<module>rtsi��B�B�B�B�B�B�B�B�B�B�������*�*�*�*�*�*�*�*�������&�&�&�&�&�&�������5�5�5�5�5�5�5�5�-�-�-�-�-�-�-�-�-�-�0�0�0�0�0�0�0�0�,�,�,�,�,�,�)�)�)�)�)�)��������t�H�~�~� ��6����L�L��L�l�l�l�l�l�F�l�l�l�l�lr+
Memory