INDEX
Explanations
acknowledging and contrasting
Negative Logits
全部 0.96
Usually 0.93
Probably 0.93
Mostly 0.91
Usually 0.90
Hopefully 0.89
Hopefully 0.89
Presumably 0.88
possibles 0.87
Probably 0.85
Positive Logits
despite 2.09
despite 1.77
unlike 1.76
unlike 1.59
beneath 1.58
несмотря 1.46
although 1.43
amidst 1.42
apesar 1.39
contrary 1.38
Activations Density 0.210%