Negative Logits
fatos       -0.08
práticas    -0.08
faktor      -0.08
事实        -0.08
melhores    -0.08
Kath        -0.08
table       -0.08
grounding   -0.07
Facts       -0.07
IDs         -0.07
Positive Logits
verb        0.08
цвета       0.08
caught      0.08
exhibited   0.07
Ms          0.07
gull        0.07
ად          0.07
Razor       0.07
ғым         0.07
            0.07
Activations Density: 0.001%