INDEX
Explanations
Phrases that involve degrees of comparison, such as "more," "less," and comparative adjectives
Negative Logits
Token          Logit
absolutamente  -0.66
Exactly        -0.55
posterous      -0.54
Huge           -0.53
huge           -0.53
perfettamente  -0.53
Ế              -0.52
absolutely     -0.52
enorme         -0.52
абсолютно      -0.52
Positive Logits
Token          Logit
quieter        1.01
newer          0.98
simpler        0.98
firmer         0.97
looser         0.97
younger        0.97
Newer          0.97
bardziej       0.95
coarser        0.94
smaller        0.94
Activations Density 0.427%