INDEX
Explanations
instances of the name "Arafat"
repeated mentions of the name "Ara" and references to the individual "Ariel"
Negative Logits
Token        Logit
ership       -0.86
eners        -0.82
Values       -0.76
bread        -0.74
Ohio         -0.73
Nationwide   -0.71
itution      -0.71
flush        -0.71
ulative      -0.69
liness       -0.69
Positive Logits
Token        Logit
Ara          1.02
rison        0.96
byss         0.92
yip          0.86
fat          0.85
thur         0.85
asca         0.84
fen          0.82
izabeth      0.78
isma         0.78
Activations Density: 0.016%