INDEX
Explanations
personal information or details about someone's actions
punctuation marks, particularly commas
Negative Logits
¬¼       -0.71
sylv     -0.52
ety      -0.51
asca     -0.51
herent   -0.50
ocl      -0.48
UF       -0.48
Cth      -0.48
aha      -0.46
igen     -0.46
Positive Logits
huh          0.84
however      0.83
albeit       0.81
which        0.77
though       0.77
although     0.77
but          0.76
namely       0.75
including    0.71
preferably   0.71
Activations Density: 0.556%