INDEX
Explanations
fragments of dialogue and expressions related to emotions and existential concepts
Negative Logits
abis              -0.17
ddy               -0.16
olle              -0.15
еÑĢг              -0.14
chez              -0.14
ħn                -0.14
neh               -0.14
etre              -0.14
ovÃŃd             -0.13
BuilderInterface  -0.13
Positive Logits
621               0.16
gonna             0.15
Division          0.15
Princip           0.14
gon               0.14
Morales           0.14
Ker               0.14
intl              0.13
Stand             0.13
_thumb            0.13
Activations Density 0.040%