NEGATIVE LOGITS
ī	0.44
ders	0.41
ded	0.41
dimension	0.40
municipality	0.39
triumph	0.38
ointed	0.38
deling	0.38
clesiastical	0.38
depart	0.37
POSITIVE LOGITS
栲	0.43
ਸਿੰ	0.43
보도록	0.42
睪	0.42
беремен	0.42
calific	0.40
постро	0.40
கொடுத்து	0.40
борь	0.39
TESTING	0.39
Activations Density: 0.001%