INDEX
Explanations
instances of specific punctuation and conjunctions in a context that suggests dialogue or lists
Negative Logits
uber     -0.16
pper     -0.15
baugh    -0.15
ugh      -0.15
ì§ľ      -0.15
ter      -0.14
dash     -0.14
ughs     -0.14
uye      -0.14
UTERS    -0.13
Positive Logits
feeling    0.22
having     0.22
after      0.21
faced      0.21
ever       0.21
Having     0.20
knowing    0.20
through    0.19
flush      0.19
via        0.19
Activations Density 0.092%