INDEX
Explanations
comparative phrases emphasizing increased importance or urgency
Negative Logits
ohana     -0.16
gn        -0.16
ares      -0.15
pollo     -0.15
ienes     -0.14
oles      -0.14
peon      -0.14
rogram    -0.14
isoft     -0.14
AGED      -0.14
Positive Logits
ever      0.91
-ever     0.67
ever      0.67
EVER      0.66
Ever      0.64
Ever      0.59
EVER      0.38
jamais    0.36
nunca     0.28
Everett   0.26
Activations Density 0.031%