Explanations
phrases indicating improvement or enhancement
evaluating for better
Negative Logits
Token            Logit
SequentialGroup  -0.64
coroa            -0.57
Palacios         -0.52
Schofield        -0.52
CardModule       -0.50
Carrasco         -0.49
skraft           -0.49
Carrillo         -0.49
sessionId        -0.49
Murdoch          -0.48
Positive Logits
Token     Logit
better    1.71
Better    1.70
better    1.66
Better    1.59
BETTER    1.50
mejor     1.26
bessere   1.19
mieux     1.15
besseren  1.14
melhor    1.13
Activation Density: 0.017%