INDEX
Explanations
phrases related to uncertainty or confusion about current events
phrases related to uncertainty or the future
Negative Logits
Horus      -0.73
rique      -0.68
ullah      -0.67
cluding    -0.66
hold       -0.65
picking    -0.63
friends    -0.62
Sut        -0.61
ifle       -0.60
ises       -0.60
Positive Logits
Ń·         0.88
¶          0.77
ļéĨĴ       0.75
lems       0.75
¸          0.74
bankrupt   0.74
-+         0.74
°          0.73
overboard  0.71
verning    0.70
Activations Density 0.053%
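
For readers unfamiliar with the density figure, the sketch below shows one way such a number could be computed. It assumes that density means the fraction of sampled token positions on which the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is read as the share of positions where the
    feature fires at all; the dashboard's exact definition may differ.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 5 of which activate the feature.
sample = [0.0] * 9995 + [1.2, 0.4, 3.1, 0.9, 2.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.053% would mean the feature is active on roughly 5 of every 10,000 sampled tokens.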