INDEX
Explanations
instances of improvement and the effectiveness of actions or decisions
Negative Logits
heavier       -0.18
excessively   -0.17
bigger        -0.17
greater       -0.16
agraph        -0.16
increased     -0.16
greater       -0.15
Further       -0.15
greatest      -0.14
further       -0.14
Positive Logits
much          0.70
much          0.64
Much          0.62
Much          0.60
MUCH          0.56
far           0.47
mucho         0.42
beaucoup      0.40
far           0.38
veel          0.38
Activations Density 0.352%
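
The density figure above is most naturally read as the share of sampled token positions on which the feature is active. The page does not state how it is computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the zero threshold, and the toy data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled token positions
    on which the feature fires. The source page does not define it.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, feature fires on 4 of them.
toy = [0.0] * 996 + [0.5, 1.2, 0.3, 2.0]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.400%
```

Under this reading, a density of 0.352% would correspond to the feature firing on roughly 35 of every 10,000 sampled token positions.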