Explanations
names of individuals
proper nouns, particularly names of individuals and entities
Negative Logits
merce         -0.65
ptives        -0.64
PASS          -0.62
peak          -0.62
zone          -0.61
zones         -0.60
reductions    -0.59
attachments   -0.59
holders       -0.58
ACTION        -0.58
Positive Logits
iane          1.25
aldo          1.11
andro         1.11
ijn           1.09
acio          1.01
ael           1.01
ardo          0.99
oslav         0.98
iano          0.98
ie            0.98
Activations Density: 0.266%