Explanations
Instances of the word "but" and its variants
Negative Logits
uppe       -0.17
ops        -0.16
therefore  -0.15
ilee       -0.15
das        -0.15
§          -0.15
olly       -0.15
bite       -0.15
susp       -0.15
olarity    -0.14
Positive Logits
chers      0.24
凡         0.24
ler        0.24
term       0.24
lers       0.23
ts         0.21
сят        0.21
rint       0.20
ressing    0.20
ters       0.19
Activations Density 0.053%