INDEX
Explanations
actions related to failure and obligations
Negative Logits
ویکیپدیا    -0.61
Jind        -0.48
strains     -0.48
RUnlock     -0.47
règles      -0.46
nameof      -0.46
Loren       -0.45
pins        -0.45
kolorze     -0.44
ilon        -0.44
Positive Logits
themselves      1.55
themselves      1.28
Their           1.21
their           1.20
Their           1.16
THEIR           1.07
their           1.03
Leur            0.87
selves          0.86
autorytatywna   0.84
Activations Density 0.496%
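
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and not taken from the dashboard:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density means "share of tokens where the feature fires".
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 2 of 400 token positions.
sample = [0.0] * 398 + [1.3, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.496% would mean the feature is active on roughly 1 in 200 sampled tokens.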