Explanations
proper nouns related to names or titles
proper nouns, specifically names and entities
Negative Logits
holders       -1.01
holder        -0.79
erness        -0.73
uve           -0.72
mble          -0.68
itures        -0.64
hetically     -0.62
igating       -0.61
hetical       -0.61
enrichment    -0.60
Positive Logits
oint           1.06
unction        1.04
ealous         1.03
ournals        1.01
upiter         0.98
utsu           0.93
ansen          0.89
igsaw          0.89
acket          0.85
ordan          0.84
Activations Density 0.100%