INDEX
Explanations
phrases relating to public opinion and desires
Negative Logits
Aim         -0.80
ourage      -0.76
çͰ          -0.73
adem        -0.69
catentry    -0.68
stead       -0.66
length      -0.66
rentices    -0.65
fell        -0.63
ema         -0.62
Positive Logits
their       0.76
electing    0.76
themselves  0.71
louder      0.70
icators     0.68
THEIR       0.66
TVs         0.65
theirs      0.64
surprises   0.63
democracy   0.62
Activations Density 0.338%