Explanations
comparative words and phrases focusing on improvement
Negative Logits
EVA       -0.68
ainted    -0.67
shire     -0.65
EP        -0.64
mberg     -0.64
vp        -0.64
sm        -0.61
odor      -0.60
aly       -0.60
ayne      -0.59
Positive Logits
than      1.82
Than      1.58
than      1.56
iating    1.00
"$:/      0.87
behaved   0.84
versions  0.83
iator     0.82
iation    0.81
suited    0.75
Activations Density 0.906%