INDEX
Explanations
references to individuals or characters within narratives
NEGATIVE LOGITS
TEGER           -0.17
ÄIJT            -0.16
\Blueprint      -0.15
unden           -0.15
ç±              -0.15
Ì£              -0.14
_Tis            -0.14
èľľ             -0.14
åľ¨çº¿éĺħ读    -0.14
kaar            -0.14
POSITIVE LOGITS
James           0.15
gr              0.14
Bris            0.14
S               0.14
920             0.14
Beach           0.14
Mar             0.14
[blank]         0.14
str             0.14
Michael         0.14
Activations Density 0.038%