INDEX
Explanations
mentions of specific companies and individuals in various contexts
phrases associated with personal achievements and confidence in project success
Negative Logits
Token         Logit
virginity     -0.84
rape          -0.79
pretended     -0.72
punishments   -0.71
reply         -0.70
edit          -0.68
torture       -0.68
gery          -0.65
shit          -0.65
leftists      -0.64
Positive Logits
Token            Logit
leveraging       1.00
"}],"            0.97
partnership      0.96
partnering       0.96
innovative       0.90
synerg           0.90
atform           0.87
collaborations   0.87
collaboration    0.87
accelerator      0.87
Activations Density 0.838%
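
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a value could be computed, assuming it means the fraction of sampled token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the toy data are all assumptions for illustration; they are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumed definition (not confirmed by the source): density is the share
    of token positions on which the feature fires at all.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, the feature fires on 8 of them.
toy_activations = [0.0] * 992 + [1.5] * 8

density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints 0.800%
```

Under this reading, a density of 0.838% would correspond to the feature firing on roughly 8 of every 1000 sampled tokens.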