INDEX
Explanations
phrases related to political accountability and social relationships
Negative Logits
ород     -0.16
бол      -0.16
รก       -0.15
faction  -0.15
ho       -0.15
amon     -0.15
łĢ       -0.15
uppe     -0.14
lo       -0.14
edian    -0.14
Positive Logits
mutual     0.26
Mutual     0.22
between    0.22
between    0.21
mutually   0.21
互         0.20
两人       0.19
bilateral  0.18
Between    0.18
Between    0.18
Activations Density 0.499%