INDEX
Explanations
phrases or words related to political or bilateral contexts where multiple entities are involved
references to the word "both" to indicate consideration of two opposing sides or perspectives
Negative Logits
lé         -0.75
ugu        -0.74
potion     -0.73
lo         -0.71
uez        -0.70
dq         -0.69
vez        -0.68
¥          -0.67
renheit    -0.67
eln        -0.66
Positive Logits
sexes      1.52
genders    1.28
sides      1.28
halves     1.27
parties    0.89
ends       0.84
kinds      0.84
extremes   0.83
ocating    0.83
coasts     0.80
Activations Density 0.048%
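
The page does not define the density figure. A common reading, assumed here, is that it is the share of sampled tokens on which the feature has a nonzero activation. Under that assumption, the calculation can be sketched as follows; the function name, the threshold parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens on which the
    feature fires at all; the source page does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: the feature fires on 12 of 25,000 sampled tokens.
sample = [0.0] * 24_988 + [1.3] * 12
print(f"{activation_density(sample):.3f}%")  # prints "0.048%"
```

A density this low would mean the feature is sparse: it fires on roughly one token in two thousand, which fits a feature tied to a narrow pattern such as the word "both".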