INDEX
Explanations
phrases related to physical thickness or thinness
Negative Logits
OUP        -0.72
bucks      -0.70
zeb        -0.68
ILLE       -0.64
Particip   -0.62
kw         -0.62
PHOTOS     -0.62
eering     -0.61
CDC        -0.61
NL         -0.60
Positive Logits
ned        1.49
ning       1.40
ners       1.20
nesses     1.00
ness       0.90
bread      0.89
ening      0.86
slices     0.86
ware       0.85
med        0.85
Activations Density: 0.015%