INDEX
Explanations
actions related to promotion and advocacy
Negative Logits
(missing)    -0.73
two          -0.69
most         -0.67
Dieter       -0.63
y            -0.62
mtable       -0.62
ara          -0.61
FE           -0.61
not          -0.60
晴           -0.60
POSITIVE LOGITS
promotion    1.59
Promoted     1.58
Promote      1.54
promoted     1.50
promotions   1.48
Promotion    1.47
promoting    1.45
Promote      1.45
promoted     1.41
promotes     1.40
Activations Density 0.093%