INDEX
Explanations
references to the city of San Francisco
city names and specific locations
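Negative Logits (token as stored; decoded UTF-8 in parentheses where it differs)
ãĤ¨ãĥ« (エル): -0.76
ãĥķãĤ© (フォ): -0.66
è£ħ (装): -0.63
ãĥ¼ãĤ¯ (ーク): -0.63
ãĥ¡ (メ): -0.63
Gender: -0.63
xual: -0.62
ãĥīãĥ©ãĤ´ãĥ³ (ドラゴン): -0.60
ãĤ¶ (ザ): -0.59
ãĤ¹ãĥĪ (スト): -0.57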
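The non-ASCII entries in the negative-logits list look like token strings from a GPT-2-style byte-level BPE vocabulary, where each raw byte is displayed as a printable stand-in character. The sketch below assumes that scheme (the dashboard does not state which tokenizer it uses) and rebuilds the standard byte-to-character table to map such strings back to UTF-8 text. The function names are illustrative and do not come from the dashboard.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 byte -> printable-character table.

    Bytes that are already printable map to themselves; the remaining
    bytes are assigned code points starting at 256, in ascending order.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    offset = 0
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: stand-in character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a byte-level vocabulary string back into UTF-8 text.

    Assumes every character of `token` belongs to the GPT-2 table;
    a character outside the table raises KeyError.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    # A single token may hold only part of a multi-byte character,
    # so undecodable bytes are replaced rather than raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĤ¨ãĥ«", "ãĥīãĥ©ãĤ´ãĥ³", "è£ħ", "Gender"]:
        print(tok, "->", decode_token(tok))
```

Under this assumption the stored strings decode to Japanese katakana and kanji fragments (for example, `ãĥīãĥ©ãĤ´ãĥ³` becomes `ドラゴン`), while plain ASCII tokens such as `Gender` pass through unchanged.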
Positive Logits
Lambert: 0.64
vey: 0.62
amate: 0.60
atton: 0.58
NESS: 0.57
telling: 0.56
heit: 0.54
reciation: 0.53
lege: 0.53
ilon: 0.53
Activations Density 0.590%
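The dashboard does not define the density figure. A common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is non-zero. The sketch below computes that fraction on synthetic data chosen only to reproduce the displayed percentage. The function name, the threshold parameter, and the sample are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    This definition is an assumption; the dashboard itself does not
    say how its density value is computed.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic sample (not real dashboard data): 10,000 tokens, 59 active.
sample = [0.0] * 9941 + [1.5] * 59

if __name__ == "__main__":
    density = activation_density(sample)
    print(f"Activations Density {density:.3%}")
```

With 59 active tokens out of 10,000, the formatted result matches the 0.590% shown above. The real sample size and active-token count behind the dashboard's number are unknown.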