INDEX
Explanations
instances of the word "the."
Head Attr Weights
0:0.08
1:0.08
2:0.08
3:0.07
4:0.08
5:0.09
6:0.07
7:0.08
8:0.09
9:0.09
10:0.07
11:0.08
Negative Logits
bart          -3.17
etsk          -2.90
*/(           -2.82
cocktails     -2.81
ensu          -2.80
ワン          -2.74
warr          -2.74
mers          -2.66
merce         -2.62
salv          -2.59
Positive Logits
AIDS           2.98
aed            2.79
Jesus          2.69
Naz            2.67
Faith          2.63
Christian      2.58
amide          2.57
igion          2.57
Christianity   2.52
theological    2.51
Activations Density 0.000%