Explanations
phrases related to rejection or refusal
phrases related to decisions or proposals
Negative Logits
Travels         -0.74
ukong           -0.74
Refuge          -0.66
Transparency    -0.61
Transgender     -0.60
Poverty         -0.60
ructose         -0.59
urches          -0.59
Highlights      -0.59
SHARES          -0.57
Positive Logits
nonetheless     1.20
nevertheless    0.98
etheless        0.78
thereto         0.73
retained        0.71
afterward       0.69
).[             0.67
afterwards      0.67
suffice         0.67
thereafter      0.65
Activations Density: 1.335%