INDEX
Explanations
proper nouns, particularly names and places
Negative Logits
354          -0.16
Vac          -0.15
385          -0.15
disposing    -0.14
tü           -0.14
294          -0.14
oyer         -0.14
387          -0.14
rant         -0.14
··           -0.14
Positive Logits
azor           0.15
_accessible    0.14
Yug            0.14
-common        0.13
Trafford       0.13
Crus           0.13
:return        0.13
oscope         0.13
Champ          0.13
appe           0.13
Activations Density: 0.052%