INDEX
Explanations
phrases related to clarification or direct statements
expressions related to accountability and transparency in communication
Negative Logits
Created           -0.73
Seen              -0.62
Purch             -0.61
Transactions      -0.61
conservancy       -0.61
soDeliveryDate    -0.59
Located           -0.59
Lumpur            -0.58
IDs               -0.57
effects           -0.56
Positive Logits
sarcast           1.07
stating           1.05
explaining        1.05
mentioning        1.00
rhet              1.00
praising          0.98
apologizing       0.98
criticizing       0.97
reiter            0.96
:"                0.96
Activations Density: 0.348%