INDEX
Explanations
names of individuals, particularly with repeating patterns
proper nouns associated with people and nationalities, particularly related to Argentina
Negative Logits
ization      -0.73
IZE          -0.72
essim        -0.71
ctrl         -0.68
istically    -0.68
ieu          -0.67
urgy         -0.67
itton        -0.67
umenthal     -0.66
aunders      -0.65
Positive Logits
bies         0.91
gan          0.82
lins         0.78
dies         0.78
ghan         0.77
lem          0.76
oslov        0.76
forth        0.75
ahead        0.74
ffe          0.74
Activations Density 0.018%