INDEX
Explanations
proper nouns, specifically names of people or places
Negative Logits
ivity    -0.69
Rivals   -0.69
ivities  -0.66
IFIED    -0.65
assets   -0.64
atever   -0.62
atform   -0.61
iffs     -0.60
ITY      -0.60
ively    -0.60
Positive Logits
leck     0.81
enthal   0.78
erm      0.76
elf      0.76
hal      0.75
eer      0.75
ering    0.74
e        0.73
erman    0.72
pend     0.72
Activations Density: 0.044%