Negative Logits
DIFF  0.38
killing  0.38
⎢  0.37
BorderStyle  0.37
щает  0.36
wskaz  0.36
윕  0.36
عمليه  0.36
Ị  0.36
lenie  0.36
Positive Logits
ێن  0.42
adopted  0.39
답  0.39
ऑड  0.38
മൊ  0.38
arab  0.38
답  0.38
pemer  0.37
Miss  0.36
muss  0.36
Activations Density 0.003%