INDEX
Explanations
concluding or contrasting statements
Negative Logits
Token              Logit
이건 ("this")       0.21
jeans              0.20
họ ("they")        0.20
это ("this")       0.20
이거 ("this one")   0.20
swirls             0.19
headlights         0.19
Didn               0.19
bunnies            0.19
combos             0.19
Positive Logits
Token                    Logit
Therefore                0.52
Therefore                0.49
Consequently             0.48
Furthermore              0.48
Accordingly              0.43
Поэтому ("therefore")    0.42
therefore                0.42
Moreover                 0.41
Consequently             0.41
因此 ("therefore")        0.40
Activations Density: 0.086%
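As a rough illustration of what an activation density figure like the one above typically measures, the sketch below computes the fraction of token positions on which a feature fires. It assumes density means "share of positions with activation above a threshold"; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature is active. The page itself does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 9 of which are active.
sample = [0.0] * 10_000
for i in range(9):
    sample[i * 1000] = 1.5

density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.090%"
```

A density this low would mean the feature is sparse: it fires on roughly one token position in a thousand.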