INDEX
Explanations
phrases that indicate shedding light on a subject or situation
Head Attr Weights
0:0.27
1:0.03
2:0.02
3:0.05
4:0.03
5:0.04
6:0.04
7:0.01
8:0.40
9:0.05
10:0.00
11:0.01
Negative Logits
rigged      -1.51
demanding   -1.49
preceded    -1.49
dating      -1.48
bred        -1.44
extortion   -1.44
itiz        -1.43
BMI         -1.43
challeng    -1.43
venge       -1.42
Positive Logits
rays         2.12
sheds        1.91
azeera       1.71
shed         1.69
tears        1.62
testimonies  1.59
Insight      1.59
clips        1.57
ん           1.56
か           1.55
Activations Density 0.003%