INDEX
Explanations
dates presented in relation to significant events
temporal phrases and markers indicating the passage of time
Negative Logits
pecially    -0.87
emo         -0.72
estine      -0.71
serv        -0.71
urat        -0.70
arsh        -0.69
organic     -0.67
hots        -0.66
oras        -0.66
OW          -0.65
Positive Logits
still             0.94
reckoning         0.80
nearing           0.76
Congratulations   0.73
Mehran            0.71
anew              0.70
still             0.68
huh               0.68
hindsight         0.66
unresolved        0.66
Activations Density: 0.373%