INDEX
Explanations
phrases that indicate deception or manipulation in communication
Negative Logits
myſelf      -0.93
itſelf      -0.84
Efq         -0.84
Jefus       -0.84
houſe       -0.79
Monfieur    -0.71
Houſe       -0.71
preſent     -0.70
ſelf        -0.69
himſelf     -0.68
Positive Logits
croire           0.77
rằng             0.70
estimés          0.63
UrlResolution    0.59
AspNetCore       0.59
that             0.57
bahwa            0.56
qtype            0.53
claims           0.52
مب               0.52
Activations Density: 0.195%