Negative Logits
Token       Logit
TAS         0.39
yson        0.37
বেঙ্গল       0.37
ച്ഛ          0.37
discrimin   0.37
snout       0.37
blur        0.37
ëlle        0.37
verage      0.36
housing     0.36
Positive Logits
Token       Logit
留言         1.45
message     1.31
messages    1.31
メッセージ    1.25
leaving     1.23
leave       1.21
Leave       1.20
Message     1.16
Leaving     1.16
оставля     1.12
Activations Density: 0.009%