INDEX
Explanations
time-related events or historical comparisons
instances of the word "since" paired with numerical references indicating time periods
Negative Logits
BILITIES    -0.90
bart        -0.81
ODY         -0.72
amina       -0.72
yll         -0.70
inders      -0.67
pta         -0.66
rations     -0.66
bly         -0.65
Reference   -0.65
Positive Logits
inception   0.84
rely        0.81
1945        0.77
1949        0.76
1999        0.75
1928        0.75
1895        0.75
1929        0.74
2009        0.74
1979        0.73
Activations Density 0.032%