Negative Logits
साउथ        0.46
utors       0.46
sächlich    0.45
apses       0.44
ivores      0.44
væ          0.44
odym        0.44
calright    0.44
employer    0.43
salari      0.43
Positive Logits
thorough    0.50
یشه         0.48
termin      0.45
Troubles    0.45
큰          0.44
ния         0.44
code        0.44
easily      0.44
chen        0.43
Thorough    0.43
Activations Density: 0.005%