Explanations
the word "Yet" in various contexts
Negative Logits
Token            Logit
tained           -0.80
ãĥİ              -0.80
ãĤ¼ãĤ¦ãĤ¹        -0.75
ding             -0.70
omial            -0.70
tein             -0.70
hal              -0.69
ãĤ¡              -0.69
ga               -0.68
packs            -0.67

Positive Logits
Token            Logit
somehow          1.03
despite          0.90
strangely        0.90
tons             0.85
interestingly    0.78
alas             0.77
nonetheless      0.77
paradox          0.75
persisted        0.75
withstanding     0.74
Activations Density 0.011%