    NEGATIVE LOGITS
         બદ  0.47
         Israël  0.47
         पार्वती  0.45
         изменить  0.44
         измени  0.43
         উদ্  0.42
        ita  0.42
        itam  0.42
         Информация  0.42
         न्यायमूर्ति  0.42
    POSITIVE LOGITS
         docks  0.41
         Charger  0.38
         dock  0.38
         beans  0.37
         opere  0.37
         wheel  0.37
         holder  0.37
        نع  0.37
        ουλ  0.37
         bee  0.36
    Activation Density: 0.001%

    No Known Activations