Explanations
adjectives describing qualities or characteristics
descriptive adjectives indicating positive or negative qualities
NEGATIVE LOGITS
cius        -0.79
DERR        -0.70
_>          -0.69
onics       -0.69
FAULT       -0.66
Burnett     -0.64
srfAttach   -0.64
Clouds      -0.62
ysics       -0.61
Doctors     -0.61
POSITIVE LOGITS
piece        0.69
worldly      0.67
stros        0.67
beast        0.66
milestone    0.66
foray        0.65
ranch        0.64
sided        0.64
gap          0.63
initiative   0.63
Activations Density: 0.229%