Explanations
negations or terms indicating lack or absence
Negative Logits
Token      Logit
brain      -0.16
moz        -0.15
169        -0.15
ullet      -0.15
brain      -0.15
jerne      -0.14
trans      -0.14
Brain      -0.14
brains     -0.14
uide       -0.14

Positive Logits
Token      Logit
bsolute    0.16
laÄį       0.15
ë¡Ŀ        0.14
inia       0.14
ifa        0.14
kles       0.14
GBK        0.14
observer   0.14
lue        0.14
936        0.14
Activations Density: 0.000%