INDEX
Explanations
names of specific individuals
references to specific individuals and locations
Negative Logits
mber     -0.91
ppe      -0.85
vent     -0.77
cha      -0.72
rano     -0.68
ktop     -0.68
Fiat     -0.67
ffic     -0.67
chin     -0.66
venth    -0.65
Positive Logits
igans        0.81
issance      0.78
Greenwald    0.69
acles        0.67
ipedia       0.67
rings        0.66
Duffy        0.65
oat          0.65
thening      0.65
uckland      0.64
Activations Density: 0.025%