INDEX
Explanations
expressions of concern or emotional support in conversations
Negative Logits
Token      Logit
folks      -0.24
adies      -0.22
fol        -0.22
Fol        -0.18
Brothers   -0.17
Gent       -0.17
folio      -0.17
fol        -0.17
Ladies     -0.17
folk       -0.17
Positive Logits
Token        Logit
hon          0.28
sweetness    0.28
sugar        0.27
honey        0.27
hun          0.25
sweets       0.25
sug          0.24
sweet        0.23
swe          0.23
sweetheart   0.23
Activations Density: 0.219%