INDEX
Explanations
the presence of the word "urge"
instances of the word "urge."
Negative Logits
ammy      -0.71
Half      -0.69
ney       -0.68
este      -0.67
mbuds     -0.67
Seym      -0.64
mberg     -0.64
missions  -0.64
icator    -0.64
çĦ        -0.64
Positive Logits
urge       1.19
urges      0.99
incent     0.96
ingly      0.79
FontSize   0.78
tempted    0.74
urging     0.74
reminding  0.73
ĸļ         0.70
TextColor  0.70
Activations Density 0.009%