INDEX
Explanations
words related to political discussions and negotiations
Negative Logits
lihood     -0.85
Dangerous  -0.80
hower      -0.70
¿½         -0.67
CRIPTION   -0.65
=-=-       -0.65
ÙIJ        -0.64
Beir       -0.61
Apostle    -0.61
hoe        -0.61
Positive Logits
ulators    1.19
ulatory    1.16
rett       1.14
ulations   1.14
ardless    1.10
arded      1.10
nant       1.09
isters     1.08
ulus       1.07
roup       1.04
Activations Density 0.008%
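
For readers unfamiliar with the density figure, the sketch below shows one common way such a number could be computed: the percentage of token positions on which a feature's activation is above a threshold. This definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; the page itself does not state how its density value was derived.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumed definition for illustration only; the dashboard does not
    document its own formula.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: the feature fires on 8 of 100,000 token positions.
acts = [0.0] * 99_992 + [1.5] * 8
print(f"{activation_density(acts):.3f}%")
```

Under this assumed definition, a density of 0.008% would correspond to the feature firing on roughly 8 of every 100,000 tokens sampled.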