INDEX
Explanations
concepts and terms related to consciousness and self-awareness
Negative Logits
Token     Logit
rray      -0.15
folios    -0.15
finger    -0.15
egl       -0.14
brains    -0.14
ussian    -0.14
hiba      -0.14
antino    -0.14
isle      -0.14
/xhtml    -0.14
Positive Logits
Token     Logit
pth       0.17
ipt       0.16
ric       0.15
IDS       0.15
ibal      0.15
||||      0.14
бор       0.14
죽        0.14
idata     0.14
iy        0.14
Activations Density: 0.015%