Explanations
repetitions of the word "the."
Negative Logits
tersebut      -0.57
whatsoever    -0.56
Datuak        -0.56
$("<          -0.55
สือ            -0.53
οποίο         -0.53
født          -0.51
оригіналу     -0.51
Passo         -0.50
ualaikum      -0.50
Positive Logits
']],          0.71
pinulongan    0.71
"])           0.70
"}")          0.69
تانيه         0.69
₂)            0.68
>')           0.67
exact         0.66
DoubleQuotes  0.66
'])           0.66
Activations Density 0.082%
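
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that "activations density" means the fraction of sampled token positions on which the feature's activation is above zero; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only so the arithmetic matches the 0.082% shown above.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (positions where the feature fires) / (all positions).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 82 of 100,000 token positions.
sample = [1.5] * 82 + [0.0] * (100_000 - 82)
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.082% would mean the feature fires on roughly 8 out of every 10,000 tokens, which is a sparse feature.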