INDEX
Explanations
terms related to laws, regulations, and political strategies
terms related to charges or accusations
Negative Logits
Token     Logit
iasis     -0.72
¬¼        -0.70
mberg     -0.62
Assass    -0.59
xtap      -0.59
avin      -0.59
hou       -0.59
xton      -0.59
heng      -0.59
tiny      -0.58
Positive Logits
Token     Logit
than      1.84
than      1.83
Than      1.28
harsher   0.76
wiser     0.71
Tradable  0.69
":"/      0.65
nearer    0.64
"}],"     0.64
quieter   0.62
Activations Density 0.884%