INDEX
Explanations
comparative adjectives
Negative Logits
EVA           -0.77
ainted        -0.76
EP            -0.71
odor          -0.69
shire         -0.68
vp            -0.65
upuncture     -0.65
hr            -0.64
ulations      -0.64
syn           -0.63
Positive Logits
than          1.85
than          1.58
Than          1.55
versions      0.89
iating        0.88
"$:/          0.81
sibling       0.77
ado           0.75
generations   0.73
installments  0.72
Activations Density 0.548%