INDEX
Explanations
elements related to political discourse and legal proceedings
Negative Logits
rys       -0.17
/TT       -0.15
obot      -0.15
inker     -0.14
xad       -0.14
umba      -0.14
unders    -0.14
_ACT      -0.14
ctal      -0.14
ýn        -0.14
Positive Logits
opponents     0.19
ople          0.18
rivals        0.17
opponent      0.16
opposition    0.15
akov          0.15
athering      0.15
unto          0.15
ather         0.15
ATHER         0.14
Activations Density 0.380%