Explanations
vary significantly, greatly, dramatically
Negative Logits
Token         Logit
berta         0.71
Accurate      0.69
accurate      0.66
najważ        0.66
unambiguous   0.65
Alone         0.65
اولى          0.64
ッキング      0.63
correcto      0.62
preceded      0.62
Positive Logits
Token          Logit
considerably   2.92
significantly  2.81
greatly        2.57
substantially  2.52
drastically    2.45
slightly       2.40
markedly       2.28
dramatically   2.28
tremendously   2.17
enormously     2.12
Activation Density: 0.930%