Explanations
words that indicate emphasis or specificity, particularly in the context of descriptors
Negative Logits
Token    Logit
anas     -0.16
zano     -0.14
isé      -0.14
Dum      -0.14
pio      -0.14
orne     -0.14
neas     -0.14
/epl     -0.14
order    -0.14
enta     -0.13
Positive Logits
Token    Logit
peg      0.18
fuse     0.15
766      0.15
828      0.15
694      0.14
Atkins   0.14
PEG      0.14
those    0.14
_Helper  0.14
those    0.14
Activations Density 0.028%