INDEX
Negative Logits
rohkem        0.45
enforcing     0.44
bling         0.40
murderous     0.39
educating     0.39
instructive   0.39
autres        0.39
destitute     0.39
subst         0.39
prots         0.39
POSITIVE LOGITS
alf           0.42
भएका          0.41
onat          0.39
טי            0.38
ona           0.38
bev           0.38
ion           0.37
entuan        0.37
anoj          0.36
one           0.36
Activations Density 0.001%