INDEX
Explanations
various forms of punctuation or dashes in the text
Negative Logits
,        -0.77
:        -0.56
etc      -0.55
,....    -0.50
......   -0.48
.        -0.47
...."    -0.46
...")    -0.46
・・・」    -0.46
、、、    -0.46
Positive Logits
–                   0.98
----------------    0.84
это                 0.77
––––                0.77
especially          0.76
namely              0.76
–                   0.75
including           0.73
except              0.73
even                0.71
Activations Density 0.341%