Negative Logits
آہ            0.70
चिंतन         0.66
समृ           0.66
érir          0.63
ighed         0.61
campionato    0.61
portfolio     0.60
सुम           0.59
amssymb       0.59
VIP           0.59
Positive Logits
fake          2.98
deception     2.97
false         2.91
falsehood     2.89
deceptive     2.82
deceit        2.79
misleading    2.73
deceiving     2.66
fals          2.61
deceive       2.58
Activations Density: 0.809%