INDEX
Explanations
expressions indicating a negative preference or avoidance
statements expressing a desire or reluctance regarding a specific action
Negative Logits
issance         -0.81
ework           -0.74
bard            -0.70
hiba            -0.70
soDeliveryDate  -0.69
Enhancement     -0.69
metadata        -0.68
VERTISEMENT     -0.67
icol            -0.67
strength        -0.67
Positive Logits
anymore         0.93
anybody         0.91
anyone          0.85
anything        0.85
reprene         0.83
nor             0.80
any             0.77
ANY             0.76
necessarily     0.75
surprises       0.73
Activations Density 0.041%