INDEX
Explanations
the phrase "must watch," indicating a strong recommendation or urgency regarding certain content
Head Attr Weights
0:0.11
1:0.22
2:0.06
3:0.12
4:0.02
5:0.11
6:0.13
7:0.01
8:0.03
9:0.07
10:0.03
11:0.04
Negative Logits
somew     -1.46
tongues   -1.45
illin     -1.43
cel       -1.39
Fulton    -1.35
coerc     -1.33
demonic   -1.32
illery    -1.32
Perkins   -1.32
ilial     -1.31
Positive Logits
>>>       1.84
×         1.78
.>>       1.76
=#        1.67
OTOS      1.67
»         1.64
>:        1.62
RELATED   1.60
VIDEOS    1.56
>>        1.56
Activations Density 0.000%