INDEX
Explanations
years represented as numerical values, specific dates, and information related to political and legal events
Negative Logits
ERY          -0.63
venge        -0.62
ACTION       -0.59
needed       -0.58
avorite      -0.58
IONS         -0.57
astroph      -0.55
immortality  -0.55
ype          -0.54
resemb       -0.53
Positive Logits
onwards      1.79
onward       1.54
inception    0.90
downwards    0.80
through      0.77
outset       0.76
ratch        0.73
till         0.71
-'           0.70
until        0.70
Activations Density: 0.072%