INDEX
Explanations
dates written in a specific format (e.g., "Wednesday, April 2, 2008")
specific dates or chronological information
Negative Logits
aspiration    -0.67
safegu        -0.67
recourse      -0.66
steering      -0.65
swearing      -0.65
numbering     -0.65
dissu         -0.64
armour        -0.64
upkeep        -0.63
supervisor    -0.63
Positive Logits
Reviewer      1.06
LOS           0.99
³³³           0.97
SAN           0.95
PRESS         0.95
WASHINGTON    0.95
IRO           0.93
Posted        0.92
GREEN         0.91
PORT          0.90
Activations Density 0.129%