Explanations
proper nouns, specifically names and geographic locations
Negative Logits
Token        Logit
Monfieur     -0.76
purpoſe      -0.72
raiſ         -0.69
onely        -0.67
houſe        -0.65
Efq          -0.65
femininas    -0.64
marinho      -0.64
leaſt        -0.63
ſet          -0.63
Positive Logits
Token            Logit
ger              0.75
lat              0.70
Gre              0.68
ParallelGroup    0.67
ger              0.67
Utf              0.66
gre              0.66
Lat              0.66
utf              0.66
Yu               0.65
Activations Density: 1.869%
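
As a rough illustration of how a density figure like this one could be computed, the sketch below assumes that "activation density" means the fraction of token positions on which the feature fires above some threshold. The page does not state that definition, so it is an assumption, and the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of positions where the feature is active;
    the source page does not define the term.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 19 of which are active.
toy = [0.0] * 981 + [1.5] * 19
print(f"{activation_density(toy):.3%}")  # prints 1.900%
```

Under this reading, a density of 1.869% would mean the feature is active on roughly 19 of every 1000 sampled tokens.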