INDEX
Explanations
negation phrases and qualifiers that emphasize contrasts or limitations
Negative Logits
venience    -0.15
kal         -0.15
inate       -0.14
tamp        -0.14
dra         -0.14
Īëĭ¤        -0.14
akin        -0.14
fav         -0.14
trinsic     -0.14
ality       -0.13
Positive Logits
only        0.17
los         0.17
rab         0.16
venta       0.15
icina       0.15
_bitmap     0.14
Garner      0.14
wor         0.14
erras       0.14
abler       0.14
Activations Density 0.031%
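
One plausible reading of the density figure above is the fraction of token positions on which the feature has a nonzero activation. The sketch below illustrates that reading only. The function name `activation_density`, the `threshold` parameter, and the toy data are all assumptions for illustration and do not come from the dashboard; the toy counts were chosen so the result lands on the same percentage.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 100,000 token positions, feature fires on 31 of them.
toy_activations = [0.0] * 100_000
for i in range(31):
    toy_activations[i * 3_000] = 1.5  # arbitrary positive activation value

density = activation_density(toy_activations)
print(f"Activations Density {density:.3%}")
```

Under this reading, a density of 0.031% means the feature is sparse: it fires on roughly 3 out of every 10,000 tokens.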