Explanations
occurrences of the word "but" and its variations, indicating contrasts or opposition in the text
Negative Logits
ops         -0.17
es          -0.17
ziel        -0.17
therefore   -0.16
alike       -0.16
xies        -0.15
das         -0.15
oley        -0.15
olarity     -0.15
susp        -0.15
Positive Logits
lers        0.22
term        0.22
ler         0.21
åĩ¡         0.21
ÑģÑıÑĤ      0.21
ts          0.20
cher        0.20
yl          0.20
ters        0.20
chers       0.20
Activations Density: 0.041%