INDEX
Explanations
adjectives and nouns describing the quality or intensity of various situations or attributes
phrases that express states of being or evaluations
Negative Logits
zar         -0.84
cffffcc     -0.78
BUS         -0.73
CrossRef    -0.72
pter        -0.72
vier        -0.71
tyard       -0.71
Calm        -0.71
yna         -0.70
isky        -0.69
Positive Logits
warranted     0.67
delusion      0.65
respective    0.64
entitlement   0.63
fian          0.61
Sandwich      0.61
ever          0.60
individual    0.60
pill          0.60
depend        0.59
Activations Density: 0.116%