INDEX
Explanations
historical events or biographical information
Negative Logits
:/              -0.69
abe             -0.65
ounty           -0.61
unemploy        -0.60
$$              -0.60
constitu        -0.58
selves          -0.57
unequivocally   -0.56
weights         -0.56
reon            -0.56
POSITIVE LOGITS
addition        1.34
spite           1.23
juries          1.18
hindsight       1.16
coming          1.12
flation         1.11
jured           1.10
contrast        1.10
retrospect      1.08
accordance      1.08
Activations Density 0.132%