INDEX
Explanations
phrases that involve the word "it" in various contexts
Head Attr Weights
0:0.02
1:0.02
2:0.19
3:0.11
4:0.02
5:0.05
6:0.08
7:0.20
8:0.06
9:0.06
10:0.09
11:0.05
Negative Logits
INC
-0.98
venge
-0.97
dearly
-0.97
ggle
-0.93
egal
-0.93
onga
-0.91
Curry
-0.91
Liga
-0.89
udence
-0.88
":[{"
-0.88
Positive Logits
サーティワン
1.20
ldon
1.12
ェ
1.10
idle
1.05
�
0.94
anders
0.94
explanatory
0.93
aning
0.93
Thumbnails
0.92
worms
0.91
Activations Density 0.016%
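For context, the density figure can be read as the share of token positions on which the feature fires at all. The sketch below assumes that definition (nonzero activation counts as "active"); the function name, the threshold parameter, and the sample data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens with a nonzero activation.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 2 active positions out of 12,500 tokens.
sample = [0.0] * 12_498 + [0.7, 1.3]
density = activation_density(sample)
print(f"{density:.3%}")
```

With this made-up sample the ratio is 2 / 12,500, which matches the order of magnitude of the 0.016% reported above; the real figure would come from the dashboard's own token corpus.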