INDEX
Explanations
even / not / never / barely
Negative Logits
Token          Logit
perfectly      0.73
perfect        0.73
exclusively    0.68
非常好         0.68
always         0.67
lots           0.65
automatically  0.64
เสมอ           0.64
exclusive      0.64
perfecta       0.63
Positive Logits
Token          Logit
even           1.73
Even           1.67
Even           1.65
even           1.61
даже           1.46
حتی            1.45
überhaupt      1.35
哪怕           1.33
siquiera       1.31
Bahkan         1.27
Activations Density: 0.406%