INDEX
Explanations
determiners at the beginning of phrases
Head Attribution Weights
0:0.08
1:0.07
2:0.08
3:0.09
4:0.08
5:0.06
6:0.09
7:0.07
8:0.08
9:0.08
10:0.08
11:0.08
Negative Logits
anchors      -3.17
anchored     -3.01
ゼウス       -2.99
foothold     -2.95
guides       -2.90
heights      -2.81
anchor       -2.80
sits         -2.77
staircase    -2.77
liner        -2.75
Positive Logits
Alert           3.22
Nusra           3.14
DCS             2.85
Unemployment    2.79
awatts          2.77
wav             2.69
AAA             2.67
afa             2.67
odox            2.65
FA              2.63
Activations Density: 0.000%