INDEX
Explanations
words indicating inclusivity or exceptions
Negative Logits
nakalista          -0.72
Paglinawan         -0.59
:✨                 -0.54
Савезне            -0.54
ujednoznacz        -0.52
Италијани          -0.52
WebVitals          -0.52
tagHelperRunner    -0.51
፩                  -0.50
мәкал              -0.49
Positive Logits
sekal              0.59
Even               0.46
Even               0.46
even               0.44
kahit              0.42
即便是             0.41
Même               0.40
Bahkan             0.38
Даже               0.38
experimentado      0.38
Activations Density: 0.389%