Explanations
phrases indicating behaviors or actions that involve some variability or diversity, particularly those that can have multiple interpretations
Negative Logits
zw           -0.17
istrovství   -0.17
ector        -0.15
omor         -0.14
اية          -0.14
assa         -0.14
utter        -0.14
ázi          -0.14
mostly       -0.13
Ahmet        -0.13
Positive Logits
may          0.16
might        0.15
-Length      0.15
-times       0.15
already      0.15
very         0.14
even         0.14
甚至         0.14
already      0.14
diluted      0.14
Activations Density 0.255%