INDEX
Explanations
before, then, while, until, behind
Negative Logits
많이  0.93
daarom  0.91
necessariamente  0.86
ofta  0.85
وبالتالي  0.85
često  0.84
한다는  0.82
doctrinal  0.82
如果在  0.82
derfor  0.81
Positive Logits
before  1.02
while  1.02
despite  0.96
then  0.94
until  0.94
behind  0.89
Then  0.85
beside  0.84
whilst  0.84
menacing  0.82
Activations Density 0.270%