Explanations
mentions of the word "fine" used positively or in a complimentary context
Negative Logits
Token       Logit
omorphic    -0.85
omorph      -0.81
ulhu        -0.81
leaders     -0.76
raltar      -0.75
ilee        -0.75
thwarted    -0.75
marine      -0.73
Hack        -0.71
riber       -0.70
Positive Logits
Token       Logit
Gael        1.05
tuning      1.02
tuned       0.85
tune        0.79
fine        0.77
linen       0.70
Fine        0.70
AY          0.69
MENTS       0.69
Tune        0.67
Activations Density: 0.854%