INDEX
Explanations
references to the Quran or related religious content
Head Attr Weights
0:0.08
1:0.08
2:0.08
3:0.07
4:0.07
5:0.08
6:0.08
7:0.08
8:0.08
9:0.07
10:0.08
11:0.08
Negative Logits
Buc           -3.11
Oregon        -3.08
doctors       -3.05
physicians    -2.85
Oregon        -2.74
Doctors       -2.72
Arizona       -2.68
Tucson        -2.63
Louisville    -2.60
Orlando       -2.60
Positive Logits
;;;;;;;;;;;;  3.23
rek           3.10
akespeare     2.89
Fey           2.86
edom          2.83
Inher         2.83
plom          2.82
ugu           2.76
RELE          2.72
Interstellar  2.70
Activations Density 0.000%