INDEX
Explanations
punctuation marks
punctuation marks, particularly commas
Negative Logits
corrid    -0.72
gow       -0.71
taboola   -0.65
robe      -0.64
rique     -0.61
fronts    -0.60
seizure   -0.59
ruck      -0.59
rients    -0.59
ney       -0.58
Positive Logits
but       0.89
uh        0.88
um        0.86
BUT       0.81
but       0.80
except    0.80
albeit    0.79
alas      0.78
namely    0.76
oh        0.71
Activations Density: 0.260%