INDEX
Explanations
days of the week
temporal markers indicating specific days or time periods
Negative Logits
geries      -0.76
spoiler     -0.68
eg          -0.67
origin      -0.63
gery        -0.62
Ly          -0.60
Enchant     -0.58
gib         -0.58
continuity  -0.57
harm        -0.57
Positive Logits
reacted     0.96
dismiss     0.87
launched    0.86
opted       0.85
accuse      0.85
undertook   0.84
unveiled    0.84
urged       0.84
announced   0.84
withdrew    0.83
Activations Density: 0.156%