    Negative Logits
    Token            Logit
    " WAIT"          -0.07
    ".ALL"           -0.07
    ":['"            -0.07
    (no token)       -0.07
    "oe"             -0.07
    ";charset"       -0.07
    "edl"            -0.07
    " BJ"            -0.07
    " فوت"           -0.06
    "urf"            -0.06
    Positive Logits
    Token            Logit
    (no token)       0.06
    " gönder"        0.06
    " documenting"   0.06
    " розп"          0.06
    " öngör"         0.06
    "micro"          0.06
    " Kok"           0.06
    "andro"          0.06
    "ρώ"             0.06
    " gangbang"      0.05
    Activation Density: 0.071%

    No Known Activations