INDEX
Explanations
warmth and positivity in descriptions or interactions
warm adjectives or phrases describing comfort and friendliness
Negative Logits
IBLE         -0.67
Chaff        -0.64
unlaw        -0.63
arge         -0.62
bankrupt     -0.62
$$           -0.62
issors       -0.61
legalized    -0.61
Goff         -0.61
gur          -0.61
Positive Logits
achine       1.47
est          1.08
fuzz         1.02
fuzzy        1.02
hearted      0.96
estone       0.90
blooded      0.89
ening        0.88
ests         0.85
welcome      0.84
Activations Density: 0.026%