INDEX
Explanations
phrases related to uncertainty or confusion about ongoing events
mentions of the phrase "what's going on."
Negative Logits
Horus        -0.74
peria        -0.72
ullah        -0.72
ablo         -0.67
este         -0.66
ament        -0.63
heim         -0.63
ilitarian    -0.62
essor        -0.62
osal         -0.62
Positive Logits
downhill     0.86
Ń·           0.82
verning      0.81
lems         0.75
±            0.74
ggle         0.74
ãĥ£          0.73
Īè           0.72
overboard    0.72
ij           0.71
Activations Density 0.053%