INDEX
Explanations
positive adjectives describing things as pleasant or enjoyable
instances of the word "nice" used in various contexts
Negative Logits
arians         -0.82
ogens          -0.81
authorized     -0.78
ichen          -0.74
omics          -0.73
arian          -0.72
eligible       -0.71
inant          -0.71
interrupted    -0.69
rained         -0.68
Positive Logits
gesture        0.87
nice           0.83
additions      0.80
bye            0.79
breeze         0.79
touches        0.77
little         0.76
bye            0.76
enough         0.76
ño             0.75
Activations Density 0.030%
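
To give a sense of scale for the density figure, here is a minimal sketch. It assumes density means the fraction of sampled tokens on which the feature's activation is above zero; the page itself does not define the term. The function name `activation_density` and the toy data are hypothetical, chosen only for illustration. Under that assumption, 0.030% corresponds to roughly 3 active tokens in every 10,000.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (not from the dashboard): 10,000 tokens, 3 of which fire.
toy = [0.0] * 10_000
for i in (12, 4_500, 9_001):
    toy[i] = 1.5

density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.030%
```

A feature this sparse fires on very few tokens, which fits a narrow explanation such as specific uses of the word "nice".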