Explanations
phrases related to political activities or events
references to political events and processes
Negative Logits
ratulations    -0.58
!",            -0.57
/,             -0.53
",             -0.51
ortun          -0.50
Which          -0.49
POSE           -0.49
',             -0.47
%,             -0.47
aque           -0.47
Positive Logits
.�             0.86
.''.           0.86
but            0.85
.''            0.76
.ãĢį           0.75
}.             0.74
albeit         0.73
}}             0.72
but            0.72
.</            0.71
Activations Density 0.997%