INDEX
Negative Logits
manners       0.69
negligently   0.68
miraculous    0.67
любые         0.64
negligent     0.63
any           0.63
negligence    0.63
falle         0.62
любы          0.62
も            0.62
Positive Logits
vdash         0.61
あなたは      0.58
libc          0.57
":"           0.56
perché        0.56
#:            0.56
Steep         0.55
あなた        0.55
资产          0.55
gerät         0.54
Activations Density 0.066%