INDEX
Explanations
punctuation at the end of sentences
Negative Logits
ogene      -0.63
vine       -0.62
pires      -0.61
icent      -0.60
inous      -0.60
sergeant   -0.59
uner       -0.59
itely      -0.59
idious     -0.58
genus      -0.58
Positive Logits
Additionally    1.51
However         1.50
Those           1.48
Their           1.44
They            1.44
Though          1.41
Although        1.41
According       1.41
Unfortunately   1.41
These           1.41
Activations Density: 0.126%