INDEX
Explanations
years or specific time periods
phrases that mention specific time references
Negative Logits
unch      -0.70
Sounds    -0.64
actor     -0.62
amulet    -0.61
AT        -0.61
dict      -0.61
acus      -0.60
ele       -0.60
wo        -0.60
etc       -0.60
Positive Logits
soever                1.20
upon                  0.84
quickShipAvailable    0.73
abouts                0.71
irlf                  0.71
faced                 0.69
they                  0.68
temperatures          0.67
transitioning         0.63
ipal                  0.62
Activations Density 0.043%
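
To give a sense of scale for the density figure, here is a minimal sketch that assumes "activation density" means the fraction of sampled token positions on which the feature has a non-zero activation. That reading is an assumption, not something this page states. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density = (active positions) / (all sampled positions).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 43 of which are active.
sample = [1.5] * 43 + [0.0] * 99_957
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.043%
```

Under that assumption, a density of 0.043% corresponds to roughly 43 active positions per 100,000 sampled tokens.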