Explanations
the presence of the word "you."
Negative Logits
itſelf       -0.85
Efq          -0.84
reaſon       -0.74
Reſ          -0.73
Diſ          -0.68
struktion    -0.68
Chriftian    -0.67
Jefus        -0.67
ThemeData    -0.67
(\<          -0.67
Positive Logits
را           0.80
音を         0.75
MENAFN       0.73
larını       0.73
線を         0.72
த்தை         0.72
को           0.70
ığını        0.69
いを         0.69
devamını     0.69
Activations Density: 0.046%
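
The page does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero, might look like the following. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and only illustrate the arithmetic; they are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The dashboard itself does not state this.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (hypothetical): 10,000 token positions, 5 of which fire.
toy = [0.0] * 9995 + [1.2, 0.4, 3.1, 0.9, 2.2]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.050%
```

Under this reading, a density of 0.046% would mean the feature fires on roughly 46 out of every 100,000 sampled tokens.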