INDEX
Explanations
words related to birthplaces
phrases that indicate a person’s place of birth
Negative Logits
phasis     -0.90
phis       -0.85
olicy      -0.84
qqa        -0.82
raviolet   -0.82
awaru      -0.81
eredith    -0.80
dfx        -0.80
uyomi      -0.78
alach      -0.77
Positive Logits
ness        0.89
born        0.83
lings       0.79
ãĥĥãĥī      0.79
nesses      0.78
ling        0.73
smith       0.72
ford        0.69
stein       0.68
aways       0.68
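The positive-logit token `ãĥĥãĥī` looks like a byte-level BPE rendering of a non-ASCII string rather than extraction damage. The sketch below assumes the underlying model uses a GPT-2 style byte-to-unicode table, which the page does not state. Under that assumption the token decodes to the katakana sequence `ッド`. The helper name `decode_byte_level_token` is hypothetical.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table (assumed tokenizer)."""
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Non-printable bytes are shifted to code points starting at 256.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


def decode_byte_level_token(token: str) -> str:
    """Map each displayed character back to its raw byte, then decode as UTF-8."""
    char_to_byte = {ch: b for b, ch in bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_byte_level_token("ãĥĥãĥī"))  # assuming GPT-2 byte-level BPE
```

Plain ASCII tokens such as `born` or `smith` pass through this mapping unchanged, so only the non-ASCII entry is affected.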
Activations Density 0.019%
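A minimal sketch of how a density figure like this one could be computed, assuming that "activation density" means the fraction of sampled tokens on which the feature fires above a threshold. The page itself does not define the term, so the definition, the `threshold` parameter, and the function name `activation_density` are all assumptions for illustration.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    if len(activations) == 0:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


if __name__ == "__main__":
    # Made-up sample: 2 active tokens out of 10,000. Not data from the page.
    sample = [0.0] * 9998 + [1.2, 0.4]
    print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.019% would mean the feature is active on roughly 19 out of every 100,000 tokens.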