INDEX
    Explanations

    punctuation

    Negative Logits
     konz        -0.07
     yanı        -0.06
    =index       -0.06
     esl         -0.06
    lığa         -0.06
     فلس         -0.06
    loff         -0.06
     kaf         -0.06
                 -0.06
    iều          -0.06
    Positive Logits
     altogether  0.07
    blocking     0.07
     Bom         0.06
    09           0.06
     Fun         0.06
     lor         0.06
    فضل          0.06
     Violence    0.06
    _MON         0.06
                 0.06
    Act Density 0.001%

    No Known Activations