INDEX
Explanations
expressions of strong interest or enthusiasm towards certain topics
Negative Logits
pta       -0.94
eor       -0.83
ãĥĥãĥī    -0.75
rooms     -0.74
Sachs     -0.70
å¸        -0.70
ãĤ¤ãĥĪ    -0.64
avis      -0.63
ãĥİ       -0.60
äºĶ       -0.60
Positive Logits
passionately  1.01
atical        0.99
passion       0.94
ately         0.86
uous          0.86
ful           0.84
uality        0.83
acy           0.83
itiveness     0.80
passions      0.80
Activations Density 0.033%