Explanations
proper nouns, specifically names of individuals
Negative Logits
Token      Logit
ToProps    -0.17
enne       -0.16
stress     -0.14
Atkins     -0.14
antic      -0.14
armac      -0.14
ToWorld    -0.13
uddle      -0.13
apons      -0.13
vap        -0.13
Positive Logits
Token      Logit
FLICT      0.16
Chronic    0.14
Fans       0.14
osi        0.14
λιά        0.13
968        0.13
')['       0.13
Fil        0.13
UBLE       0.12
Ner        0.12
Activations Density: 0.071%