INDEX
Explanations
references to the word "with" in various contexts
Head Attr Weights
0:0.02
1:0.01
2:0.09
3:0.06
4:0.16
5:0.03
6:0.04
7:0.30
8:0.03
9:0.04
10:0.06
11:0.09
Negative Logits
ibaba     -2.15
fast      -1.58
arching   -1.47
(blank)   -1.45
qu        -1.45
arming    -1.44
coming    -1.43
asive     -1.42
isance    -1.42
qual      -1.42
Positive Logits
Tsukuyomi    1.76
actionDate   1.72
withd        1.63
taxp         1.53
linem        1.52
beforehand   1.51
burner       1.51
Abedin       1.48
politely     1.48
Hou          1.46
Activations Density 0.002%