INDEX
Explanations
Attends to affirmative words or questions from tokens in potentially negative responses or from confirmatory tokens.
Head Attr Weights
0:0.09
1:0.16
2:0.15
3:0.08
4:0.09
5:0.03
6:0.11
7:0.25
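
The head attribution weights above can be ranked to see which head contributes most. This is a minimal sketch that assumes each line is a `head_index:weight` pair; the names `head_attr_weights` and `rank_heads` are illustrative, not from the source. The listed values sum to 0.96 rather than 1, which may only reflect rounding, since the source does not say how they are normalized.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_attr_weights = {
    0: 0.09, 1: 0.16, 2: 0.15, 3: 0.08,
    4: 0.09, 5: 0.03, 6: 0.11, 7: 0.25,
}


def rank_heads(weights):
    """Return (head, weight) pairs sorted from largest to smallest weight."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_heads(head_attr_weights)
top_head, top_weight = ranked[0]
total = sum(head_attr_weights.values())

print(f"top head: {top_head} (weight {top_weight:.2f})")
print(f"sum of listed weights: {total:.2f}")
```

In this listing, head 7 carries the largest weight (0.25), followed by heads 1 and 2.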
Negative Logits
مرئيه              -0.36
>');               -0.30
>");               -0.29
]');               -0.28
relâche            -0.28
πουργ              -0.28
Tuc                -0.27
>");               -0.26
_));               -0.25
MessageTagHelper   -0.25
Positive Logits
ejus               0.30
eorum              0.30
houſe              0.29
AspNetCore         0.28
Bronnen            0.28
mengal             0.27
bibliography       0.26
creativecommons    0.26
MethodManager      0.26
colspan            0.25
Activations Density 0.034%
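
To put the density figure in concrete terms, here is a small sketch that converts the percentage into an expected number of active tokens. It assumes "Activations Density" means the fraction of tokens on which the feature has a non-zero activation, which the source does not define; `expected_active` is a hypothetical helper.

```python
# Activation density from the listing above, given as a percentage.
density_percent = 0.034

# Assumption: density is the fraction of tokens with a non-zero activation.
density = density_percent / 100.0


def expected_active(n_tokens):
    """Expected number of tokens on which the feature fires, under that assumption."""
    return n_tokens * density


print(f"fraction of tokens: {density:.5f}")
print(f"expected active tokens per 100,000: {expected_active(100_000):.1f}")
```

Under that reading, the feature would fire on roughly 34 of every 100,000 tokens.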