INDEX
Explanations
references to consciousness and awareness
Negative Logits
aney      -0.17
EZ        -0.16
ilor      -0.15
issen     -0.15
oste      -0.15
antino    -0.15
endon     -0.15
Äįka      -0.15
stk       -0.14
omp       -0.14
Positive Logits
fulness   0.26
ustry     0.21
/body     0.21
edly      0.20
lessly    0.20
sets      0.20
-body     0.20
fully     0.18
cape      0.18
ful       0.17
Activations Density: 0.033%
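
The page does not say how the activation density is computed. A common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and do not come from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The source does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 3 of 10,000 token positions.
sample = [0.0] * 9997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage, like the figure above
```

Under this reading, a density of 0.033% would mean the feature is active on roughly 33 of every 100,000 sampled tokens.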