INDEX
Explanations
mentions of recent events or occurrences
the occurrence of the word "Recently."
Negative Logits
imm       -0.74
onite     -0.66
Strait    -0.66
afort     -0.63
ilot      -0.62
orney     -0.61
llah      -0.60
anza      -0.60
cair      -0.59
igm       -0.59
Positive Logits
theless       0.76
Updated       0.71
Tonight       0.69
Supplement    0.68
Update        0.67
Yesterday     0.67
episode       0.67
antine        0.66
ynski         0.65
Recently      0.64
Activations Density 0.017%