Explanations
instances of inquiry and questioning in the text
Negative Logits
Token     Logit
ippo      -0.17
olah      -0.16
_TA       -0.15
iesel     -0.15
redient   -0.15
decess    -0.14
amak      -0.14
入れ      -0.14
erer      -0.14
akat      -0.13
Positive Logits
Token     Logit
ayne      0.16
KIT       0.15
Hem       0.14
ende      0.14
let       0.14
ody       0.14
Kit       0.14
Esper     0.13
支        0.13
Europ     0.13
Activations Density 0.028%