INDEX
Explanations
people's names, particularly surnames
proper nouns and names
Negative Logits
shorth       -0.64
corrid       -0.58
veh          -0.58
subscript    -0.57
Spanish      -0.57
Prim         -0.57
smugglers    -0.56
suffix       -0.56
Assembly     -0.54
Girls        -0.54
POSITIVE LOGITS
Jr           1.12
zyk          0.95
enegger      0.91
opoulos      0.91
sson         0.86
Sr           0.84
III          0.84
oulos        0.82
oglu         0.81
ewski        0.80
Activations Density: 0.373%