INDEX
Explanations
mentions of political figures and government positions
references to governmental positions and official titles
Negative Logits
Token           Logit
$.              -0.82
)).             -0.82
respectively    -0.73
.</             -0.72
.).             -0.72
).[             -0.72
}.              -0.70
]."             -0.69
".              -0.69
.[              -0.68
Positive Logits
Token           Logit
Variant         0.47
historian       0.47
':              0.46
spokesman       0.46
mination        0.44
ideshow         0.43
ctive           0.43
Profile         0.41
odcast          0.41
numbered        0.41
Activations Density: 1.780%