Explanations
comparative statements indicating superiority or complexity
comparisons where one thing is considered to be more significant or powerful than another
Negative Logits
Token     Logit
imity     -0.95
EMENT     -0.79
autions   -0.76
bilt      -0.74
Juda      -0.74
antage    -0.71
erman     -0.70
uto       -0.68
ISSION    -0.66
ance      -0.66
Positive Logits
Token     Logit
usual     1.02
ever      0.81
anything  0.80
placebo   0.77
ours      0.72
average   0.69
average   0.68
acles     0.67
others    0.67
anybody   0.66
Activations Density: 0.065%