INDEX
Explanations
emphasizing words that indicate certainty or emphasis
Negative Logits
Sod         -0.65
Sod         -0.60
hende       -0.59
สาย         -0.59
serem       -0.59
charging    -0.58
jap         -0.56
scolaires   -0.55
shaking     -0.54
Shown       -0.54
Positive Logits
have          0.97
gave          0.85
could         0.84
verläs        0.82
had           0.82
$")           0.82
Šaltiniai     0.81
avía          0.81
must          0.79
nahilalakip   0.78
Activations Density 0.264%
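
The density figure above can be illustrated with a small sketch. It assumes, and the page itself does not confirm this, that "Activations Density" means the percentage of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: density is the share of token positions where the feature
    fires. This definition is illustrative, not taken from the source page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Toy data (made up): the feature fires on 3 of 1000 token positions.
toy = [0.0] * 997 + [1.2, 0.4, 3.1]
print(f"{activation_density(toy):.3f}%")  # prints 0.300%
```

Under that assumed definition, a density of 0.264% would correspond to the feature firing on roughly 26 of every 10,000 token positions.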