Explanations
phrases related to personal involvement or perspective in narratives
Negative Logits
strup    -0.16
inke     -0.16
emy      -0.16
APPER    -0.15
pline    -0.15
dio      -0.15
âh       -0.15
wf       -0.15
WF       -0.15
asio     -0.15
Positive Logits
yas      0.15
urus     0.14
Env      0.14
Rac      0.14
Lage     0.13
uros     0.13
ihar     0.13
OSE      0.13
ër       0.13
right    0.13
Activations Density: 0.133%