Explanations
words and phrases indicating large sizes or quantities
Negative Logits
Token        Logit
b            -0.61
r            -0.60
y            -0.58
lettura      -0.58
by           -0.57
filter       -0.56
G            -0.55
orde         -0.55
V            -0.55
dataclass    -0.55
Positive Logits
Token        Logit
itſelf       1.50
myſelf       1.35
ormous       1.30
Reſ          1.30
Jefus        1.30
poffe        1.28
raiſ         1.28
greateſt     1.27
iſt          1.27
purpoſe      1.24
Activations Density 0.120%