INDEX
Explanations
strong verbs or actions related to politics and legislation
Negative Logits
maker            -0.65
medium           -0.64
sacrament        -0.63
vul              -0.63
transitions      -0.62
examiner         -0.62
uniqueness       -0.61
disabilities     -0.61
identification   -0.61
nudity           -0.60
Positive Logits
udging           1.04
inged            1.04
ashing           1.02
isively          1.01
oused            0.98
ashed            0.98
aciously         0.97
umbled           0.96
agged            0.94
uously           0.93
Activations Density 0.174%