INDEX
Explanations
phrases related to having power and control
questions that seek opinions or observations about a topic
Negative Logits
godd        -0.72
Godd        -0.70
ocide       -0.68
Fuck        -0.68
death       -0.64
*.          -0.64
Guant       -0.63
bush        -0.63
goddamn     -0.62
Enlarge     -0.62
Positive Logits
however        1.10
therefore      0.91
moreover       0.91
furthermore    0.86
emphas         0.85
meanwhile      0.84
also           0.79
especially     0.76
particularly   0.74
util           0.74
Activations Density: 1.694%