Explanations
mentions of political statements or positions within a text
instances of political terminology and references to political parties
Negative Logits
Token        Logit
ciplinary    -0.75
":[          -0.65
Sep          -0.58
(?,          -0.57
ascus        -0.56
romeda       -0.56
enes         -0.55
Interest     -0.54
pard         -0.53
Picture      -0.52
Positive Logits
Token        Logit
!).          1.08
?).          1.01
!),          0.89
!)           0.86
?)           0.85
).           0.83
).           0.81
)</          0.80
-)           0.77
?),          0.76
Activations Density 0.975%