INDEX
Explanations
questions related to self-reflection and self-doubt
Negative Logits
inery     -0.78
iHUD      -0.76
©¶æ¥µ     -0.74
abba      -0.69
alde      -0.67
pione     -0.67
20439     -0.67
abor      -0.66
tails     -0.65
jong      -0.64
Positive Logits
?         1.98
?:        1.86
?"        1.82
?'        1.82
?)        1.76
?),       1.72
?",       1.72
?).       1.70
?!        1.70
...?      1.69
Activations Density 2.787%