INDEX
Explanations
punctuation marks within text strings
instances of punctuation or specific grammatical structures in text
Negative Logits
iam         -0.57
ety         -0.57
xt          -0.56
estinal     -0.54
zel         -0.54
ily         -0.54
¬¼          -0.53
Travels     -0.53
atars       -0.53
irect       -0.53
Positive Logits
nor         1.98
anymore     1.39
nor         1.27
yet         1.20
unless      1.13
unless      1.06
except      1.01
though      0.92
but         0.92
preferring  0.92
Activations Density 0.413%