Explanations
political content related to a specific individual and their actions
Negative Logits
Token       Logit
Originally  -0.45
Completed   -0.40
cohol       -0.39
Located     -0.37
Optical     -0.37
Previously  -0.36
Normally    -0.35
BMC         -0.35
Previous    -0.35
Click       -0.35
Positive Logits
Token       Logit
scapego     0.52
damned      0.51
repud       0.49
goddamn     0.47
neocons     0.47
etheless    0.47
acquies     0.46
genuinely   0.46
prud        0.46
complicit   0.46
Activations Density: 11.176%