INDEX
Explanations
phrases related to negation and absence
phrases that indicate quantities or amounts
Negative Logits
culosis     -0.73
alion       -0.68
icism       -0.68
erity       -0.59
bara        -0.58
CRE         -0.58
issance     -0.58
fty         -0.56
Royale      -0.56
EMA         -0.56
Positive Logits
abouts             0.78
themselves         0.77
types              0.73
expressions        0.69
extensions         0.69
embodiments        0.68
selves             0.68
interchangeable    0.66
mouths             0.65
products           0.65
Activations Density: 0.791%