INDEX
Explanations
locations where people are from
phrases indicating geographic origin or location
Negative Logits
Token          Logit
ratulations    -0.81
faced          -0.80
few            -0.78
potion         -0.68
tarians        -0.67
onential       -0.66
adata          -0.66
attribute      -0.65
icipated       -0.65
comfort        -0.64
Positive Logits
Token          Logit
afar            1.44
abroad          1.12
Ethiopia        1.00
whence          0.98
Latvia          0.97
somewhere       0.97
Denmark         0.97
Nigeria         0.96
Finland         0.95
Hungary         0.95
Activations Density 0.105%
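
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of token positions on which the feature's activation is above zero; the page does not state its definition, so this is an assumption. The function name `activation_density` and the sample data are hypothetical and for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature
    fires at all. The source page does not define the term.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 2000 token positions, the feature fires on 2 of them.
sample = [0.0] * 1998 + [3.2, 0.7]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.105% would mean the feature fires on roughly one token in a thousand, which fits a feature that responds to a narrow set of contexts.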