INDEX
Explanations
instances of understanding or awareness in experiences
Negative Logits
/or      -0.18
aná      -0.17
lle      -0.16
us       -0.15
ker      -0.14
regards  -0.14
jie      -0.14
aso      -0.14
oca      -0.14
ocal     -0.14
Positive Logits
bel      0.20
how      0.18
there    0.18
anew     0.16
thru     0.16
how      0.15
adio     0.14
θÏħ      0.14
/*@      0.14
imli     0.14
Activations Density 0.047%