INDEX
Explanations
names of countries
references to the country Cameroon
Negative Logits
ters      -0.77
strous    -0.74
rix       -0.73
nings     -0.71
ially     -0.70
thritis   -0.68
haust     -0.68
thin      -0.66
ptive     -0.66
Beer      -0.66
Positive Logits
wana          0.82
sarc          0.80
arov          0.77
Ou            0.76
addafi        0.75
chimpanzees   0.75
autom         0.74
ofi           0.72
chimpan       0.71
atory         0.71
Activations Density 0.035%