INDEX
Explanations
phrases indicating a contrary or negative statement
expressions of negation or uncertainty
Negative Logits
Token        Logit
Painting     -0.75
Fuji         -0.73
Strategy     -0.73
Rebellion    -0.71
Inventory    -0.66
è            -0.65
Novel        -0.64
Finder       -0.64
srfAttach    -0.61
Circuit      -0.61
Positive Logits
Token        Logit
hin          1.18
necessarily  1.07
bother       0.97
icably       0.92
necess       0.89
epad         0.86
qualify      0.86
swayed       0.85
icable       0.85
bothered     0.84
Activations Density: 0.053%