INDEX
Explanations
phrases or statements expressing caution or warning
Negative Logits
if       -0.19
uye      -0.16
msp      -0.16
если     -0.15
ledon    -0.14
ettel    -0.14
enim     -0.14
ayd      -0.14
utar     -0.14
á»iji    -0.14
Positive Logits
chances   0.42
odds      0.35
then      0.35
then      0.34
thì       0.29
consider  0.28
entonces  0.27
Odds      0.26
的话      0.26
Then      0.25
Activations Density 0.070%