Explanations
conspiracies about control and manipulation
Negative Logits
possesses  0.50
0.48
0.46
0.46
0.45
approximately  0.45
0.44
0.44
0.42
frequently  0.42
Positive Logits
idiots  0.82
politicians  0.77
everyone  0.75
everybody  0.72
Politicians  0.69
devs  0.68
mấy  0.66
Zuckerberg  0.66
accountants  0.65
Europe  0.65
Activations Density 0.012%