INDEX
Explanations
words related to mental qualities or sentiments
statements related to confidence and correlation
Negative Logits
Canaver      -0.82
dos          -0.80
batch        -0.79
oultry       -0.78
arnaev       -0.78
arthed       -0.74
asteroids    -0.73
storms       -0.73
Bombs        -0.73
vans         -0.73
Positive Logits
loyalty      1.02
ability      1.00
innate       0.94
confidence   0.92
Ability      0.92
happiness    0.88
esteem       0.86
humility     0.85
feeling      0.85
Happiness    0.84
Activations Density 0.849%