NEGATIVE LOGITS
kullan        0.86
사용하는      0.68
verwenden     0.67
kullanımı     0.66
kullanım      0.66
doit          0.66
kullanıl      0.66
അല്ലെങ്കിൽ    0.65
を使用        0.65
사용          0.63
POSITIVE LOGITS
didn          0.75
responded     0.73
последствии   0.71
warned        0.70
knew          0.70
refused       0.69
sighed        0.68
had           0.68
insisted      0.68
dismay        0.67
Activations Density: 0.105%