Explanations
words related to positive or supportive actions or statements
phrases related to encouragement or positivity
Negative Logits
tan       -0.70
kos       -0.69
imeter    -0.66
die       -0.65
ness      -0.65
Base      -0.64
tion      -0.64
sth       -0.63
tera      -0.62
yx        -0.62
Positive Logits
encouraging      3.33
discouraging     2.41
reassuring       1.78
encouragement    1.72
promising        1.68
inviting         1.63
inspiring        1.60
enticing         1.56
encourage        1.54
fostering        1.52
Activations Density: 0.020%