INDEX
Explanations
transition phrases indicating a preference or alternative
the word "rather" in various contexts
Negative Logits
Token       Logit
mberg       -0.90
}}}         -0.67
arent       -0.65
reen        -0.64
usha        -0.63
ayne        -0.61
amba        -0.61
kernel      -0.60
esville     -0.60
esian       -0.59
Positive Logits
Token       Logit
than        1.64
than        1.32
Than        1.13
pathetic    0.83
omin        0.82
ironically  0.82
awkwardly   0.81
irritating  0.80
amusing     0.79
tame        0.76
Activations Density 0.025%