Explanations
phrases indicating uniqueness or differentiation
phrases that emphasize uniqueness or distinguishing characteristics
Negative Logits
draining       -0.69
Instr          -0.68
drained        -0.67
vomiting       -0.66
blackmail      -0.65
fax            -0.62
entertained    -0.62
gres           -0.62
itching        -0.62
coerced        -0.62
Positive Logits
uniqueness      0.85
undrum          0.79
arily           0.78
insofar         0.78
emale           0.73
pedigree        0.73
allery          0.71
ilial           0.71
distinguishes   0.71
distinguishing  0.71
Activations Density 0.454%