Explanations
positive adjectives describing someone's character or abilities
words expressing praise and criticism about people's abilities or qualities
Negative Logits
idav          -0.73
ulton         -0.68
ARB           -0.68
extensive     -0.65
armac         -0.63
cific         -0.63
risome        -0.62
feasibility   -0.61
elaborate     -0.60
yrinth        -0.60
Positive Logits
enough        1.05
enough        0.88
Enough        0.80
friends       0.76
negotiator    0.76
sleeper       0.75
undermin      0.72
blooded       0.72
looking       0.71
eyes          0.71
Activations Density: 0.287%