Explanations
phrases indicating important actions, decisions, or changes in context
Negative Logits
Token      Logit
ulman      -0.48
lui        -0.43
WE         -0.41
ORE        -0.41
we         -0.41
faisons    -0.41
mó         -0.41
val        -0.39
حاب        -0.39
colari     -0.38
Positive Logits
Token      Logit
your       1.18
seus       1.12
saraba     1.11
svého      1.08
своих      1.04
suas       1.04
الحره      1.03
svých      1.03
своего     1.00
seu        1.00
Activation Density: 0.834%
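
A density figure like the one above is commonly read as the share of token positions on which the feature fires. The sketch below illustrates that reading under stated assumptions: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from the dashboard that produced this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of token positions with a
    non-zero activation; the source page does not define it.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 1,000 token positions, 8 of which activate the feature.
sample = [0.0] * 992 + [0.5, 1.2, 0.3, 2.0, 0.7, 0.1, 0.9, 1.5]
print(f"{activation_density(sample):.3f}%")
```

With real data, `activations` would be the feature's activation at every token position in the evaluation corpus, so a value of 0.834% would correspond to roughly 8 active positions per 1,000 tokens.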