INDEX
Explanations
words associated with measurement or size
Negative Logits
ptions    -0.19
ácil      -0.18
pts       -0.16
ër        -0.16
ption     -0.15
_mE       -0.15
úp        -0.15
anning    -0.15
oney      -0.15
celik     -0.15
Positive Logits
arrant    0.20
prere     0.17
fl        0.16
orce      0.16
ure       0.15
iques     0.15
bite      0.14
Tu        0.14
zk        0.14
Tess      0.14
Activations Density: 0.020%