INDEX
Explanations
comparative adjectives indicating size, speed, and weight
Negative Logits
T       -0.80
p       -0.73
Z       -0.72
G       -0.72
mit     -0.72
Go      -0.69
Ar      -0.68
S       -0.68
Al      -0.67
F       -0.67
Positive Logits
leſs        1.88
myſelf      1.65
happier     1.54
healthier   1.52
shallo      1.51
easier      1.51
quieter     1.50
thicker     1.50
thinner     1.50
heavier     1.50
Activations Density: 0.164%