INDEX
Explanations
comparisons between different things
comparisons that emphasize something's superiority in relation to another
Negative Logits
estern    -0.74
illary    -0.73
osen      -0.68
auri      -0.67
onomy     -0.67
umbn      -0.67
pez       -0.66
auer      -0.66
ãĤ¨       -0.65
kan       -0.65
Positive Logits
anything      1.02
necessarily   0.96
any           0.74
outright      0.70
nam           0.69
condone       0.66
adequately    0.66
substance     0.66
substantive   0.65
merely        0.64
Activations Density 0.105%