INDEX
Explanations
phrases related to political conflict and international relations
Negative Logits
ilts        -0.75
Æ           -0.71
ice         -0.70
Pac         -0.67
aneers      -0.67
itud        -0.67
vati        -0.66
oldemort    -0.66
]'          -0.65
apo         -0.65
Positive Logits
own         1.24
mileage     1.13
favorite    1.11
favourite   1.09
imagination 0.97
fingertips  0.91
ths         0.91
choice      0.90
selves      0.89
browser     0.88
Activations Density: 0.107%