INDEX
Explanations
Arguments, concerns, conditions, efforts, actions
Negative Logits
is          0.94
undergoes   0.71
corresponds 0.70
is          0.68
isn         0.68
needs       0.68
was         0.66
wants       0.63
has         0.62
represents  0.61
Positive Logits
були        1.43
jsou        1.42
можуть      1.35
ovat        1.34
έχουν       1.32
были        1.32
vannak      1.32
都可以      1.32
đều         1.32
were        1.31
Activations Density 0.347%