INDEX
Explanations
proper nouns
occurrences of a specific end-of-text token
Negative Logits
grass        -0.75
Guinness     -0.72
Nieto        -0.68
Totem        -0.67
Granger      -0.67
CPR          -0.66
Dolphin      -0.65
Sorceress    -0.65
Pom          -0.65
daylight     -0.64
Positive Logits
senal        1.32
ansom        1.19
ICH          1.18
outing       1.10
ascal        1.09
acing        1.08
abbit        1.07
agnar        1.07
outine       1.07
haps         1.07
Activations Density: 0.031%