Explanations
instances of the word "smile" or related variations that convey positive emotions
smile variations
Negative Logits
Token      Logit
Vanguard   -0.49
NCC        -0.48
Cay        -0.48
Fenn       -0.47
De         -0.47
IV         -0.45
fourth     -0.45
Nether     -0.45
Rik        -0.44
de         -0.44
Positive Logits
Token      Logit
smile      2.22
Smile      2.03
Smile      2.03
smile      1.90
smiles     1.83
sonrisa    1.79
sourire    1.63
smiled     1.58
Smiles     1.56
smiling    1.48
Activations Density 0.002%