INDEX
    Explanations

    Tokens marking the assistant role or the assistant message header (e.g., the "<|assistant|>" header indicator).

    Negative Logits
     میدان
    -0.07
     tempting
    -0.06
    needed
    -0.06
    ,www
    -0.06
     afternoon
    -0.06
    privileged
    -0.06
    	ctrl
    -0.06
     Kons
    -0.06
     bracelets
    -0.06
     loves
    -0.06
    Positive Logits
     exhaust
    0.07
     Mental
    0.07
     Competition
    0.07
     sustainable
    0.06
    μενο
    0.06
     initialise
    0.06
     occur
    0.06
     arise
    0.06
     deve
    0.06
     голод
    0.06
    Activation Density: 0.045%

    No Known Activations