INDEX
Explanations
phrases related to political activities and processes
Negative Logits
accomp      -0.50
Lomb        -0.46
measuring   -0.45
nodd        -0.45
sprink      -0.45
differe     -0.43
ank         -0.42
mot         -0.42
tagging     -0.42
advant      -0.42
Positive Logits
for         1.15
for         0.87
For         0.78
FOR         0.69
For         0.69
ioned       0.65
fort        0.64
irection    0.62
untarily    0.60
to          0.60
Activations Density: 0.525%