INDEX
Explanations
words related to proper nouns, especially names
names or titles of individuals or entities
Negative Logits
åī          -0.95
enthusi     -0.90
Dynam       -0.81
Azerb       -0.80
Mub         -0.77
backlog     -0.76
garg        -0.76
Bub         -0.76
kb          -0.76
Rosenberg   -0.75
Positive Logits
ce          1.23
ire         1.15
onse        1.15
te          1.15
ide         1.15
esse        1.15
asse        1.13
ote         1.12
le          1.12
alle        1.10
Activations Density 0.178%