INDEX
Explanations
adjectives and descriptions suggesting a certain quality or characteristic
descriptive adjectives that convey negative qualities or experiences
Negative Logits
abwe             -0.77
osponsors        -0.69
MK               -0.67
CF               -0.65
notably          -0.65
amera            -0.64
ocked            -0.64
uilding          -0.63
BW               -0.63
concentrating    -0.63
Positive Logits
est              1.71
liest            1.37
iest             1.15
hest             1.14
ness             1.13
culmination      0.98
truth            0.97
thing            0.96
EST              0.96
conclusion       0.96
Activations Density: 0.251%
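
The density figure is usually read as the share of sampled tokens on which the feature fires at all. The sketch below illustrates that reading and is not the dashboard's own computation: the function name `activation_density`, the zero threshold, and the sample data are all assumptions made for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (zero by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 2,000 tokens, of which 5 activate the feature.
sample = [0.0] * 1995 + [0.4, 1.2, 0.7, 2.1, 0.3]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density near 0.25% means the feature is active on roughly one token in four hundred.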