INDEX
Explanations
instances of invitations or requests involving the word "to"
Negative Logits
rag           -0.81
borne         -0.70
worn          -0.70
enforcement   -0.67
Pg            -0.66
ray           -0.64
entimes       -0.64
gradient      -0.64
indicators    -0.63
pointer       -0.63
Positive Logits
participate   1.12
join          1.11
dinner        0.98
partake       0.94
meet          0.91
explore       0.90
celebrate     0.90
testify       0.86
congratulate  0.86
nominate      0.85
Activations Density: 0.049%