INDEX
Explanations
phrases related to rejection or denial of claims or arguments
Negative Logits
AssemblyProduct  -0.61
up               -0.55
ens              -0.54
└                -0.53
Autowired        -0.53
ึ้น              -0.53
ढ                -0.53
artament         -0.52
образом          -0.52
du               -0.52
Positive Logits
rejection  1.47
Reject     1.47
reject     1.46
rejects    1.46
Rejection  1.45
rejecting  1.38
Refuse     1.31
denies     1.31
rejected   1.29
Rejected   1.28
Activations Density: 0.247%