INDEX
    Explanations

    sentences expressing surprise or unexpected outcomes

    Negative Logits
    Token        Logit
    "ctors"      -0.07
    "alat"       -0.06
    "cheid"      -0.06
    "uchs"       -0.06
    "%D"         -0.06
    "ÑĢаг"       -0.06
    "WARDED"     -0.06
    "ĥ"          -0.06
    "LI"         -0.05
    "èķ"         -0.05
    Positive Logits
    Token        Logit
    "нак"        0.06
    "ushima"     0.06
    " дело"      0.06
    " Widow"     0.06
    " numero"    0.06
    " pha"       0.06
    ".Editor"    0.06
    "akens"      0.06
    "oki"        0.06
    "-refresh"   0.06
    Activation Density: 0.002%

    No Known Activations