Explanations
instances of the word "but" used to contrast or introduce an exception
Negative Logits
κό  -0.16
ã썿ĢĿãģĨ  -0.15
リーズ  -0.14
indir  -0.14
erce  -0.14
ounty  -0.14
ylum  -0.14
/DD  -0.14
incare  -0.14
ipzig  -0.14
Positive Logits
then  0.24
maybe  0.23
then  0.23
Then  0.20
_then  0.20
Maybe  0.19
Then  0.19
THEN  0.19
Maybe  0.19
THEN  0.19
Activations Density 0.107%