Explanations
proper nouns
proper nouns, particularly names and titles
Negative Logits
horizont      -0.89
horr          -0.88
Thumbnails    -0.79
hemor         -0.79
behav         -0.77
iP            -0.77
ASC           -0.74
livest        -0.73
multic        -0.73
multip        -0.72
Positive Logits
enei          1.22
owsky         1.22
ovich         1.13
iani          1.11
owitz         1.07
awi           1.06
ati           1.03
cki           1.01
gger          0.99
olini         0.99
Activations Density: 0.238%