INDEX
Explanations
words related to political analysis and commentary
Negative Logits
tnc        -0.76
enter      -0.76
isible     -0.76
enture     -0.74
inated     -0.73
istance    -0.70
ut         -0.69
edu        -0.69
lees       -0.68
sbm        -0.68
Positive Logits
admittedly       1.14
fortunately      1.00
thankfully       0.97
interestingly    0.96
technically      0.90
luckily          0.90
beware           0.87
occasionally     0.86
ideally          0.86
curiously        0.85
Activations Density 0.079%