    NEGATIVE LOGITS
    " krishna"     -0.65
    " sentito"     -0.61
    " scopri"      -0.61
    " aspetta"     -0.58
    " dimenti"     -0.54
    " lasciato"    -0.52
    " dimentic"    -0.51
    " trover"      -0.51
    " raccont"     -0.50
    " cammin"      -0.50
    POSITIVE LOGITS
    " Singh"       1.15
    "Singh"        1.09
    " Sikh"        0.86
    " SINGH"       0.75
    " Sikhs"       0.74
    " Şi"          0.70
    " Sing"        0.68
    " Singapur"    0.68
    " Châ"         0.68
    " Lég"         0.68
    Activation Density: 0.117%

    No Known Activations