NEGATIVE LOGITS
satisfactory    -0.06
discovers       -0.06
cable           -0.06
marginalized    -0.06
axis            -0.06
illard          -0.06
bron            -0.06
reflex          -0.06
score           -0.06
coles           -0.06
POSITIVE LOGITS
uento           0.07
lean            0.06
ynthia          0.06
-worker         0.06
(proto          0.06
fullName        0.06
Україні         0.06
Countries       0.06
!↵↵↵            0.06
_^(             0.06
Activations Density 0.001%