Explanations
phrases related to influence or power
mentions of influence across various contexts and entities
Negative Logits
Token     Logit
TAG       -0.84
ITIES     -0.76
ft        -0.75
leigh     -0.72
yll       -0.70
Quotes    -0.68
atri      -0.68
Simple    -0.68
TH        -0.67
Dill      -0.66
Positive Logits
Token        Logit
pedd         1.16
influence    0.99
cooker       0.98
influencing  0.96
sway         0.92
exerted      0.88
influences   0.82
shaping      0.82
influenced   0.78
multiplier   0.74
Activations Density: 0.034%
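
As a rough illustration of what a density figure like this could mean, the sketch below assumes that activation density is the percentage of token positions on which the feature's activation exceeds a threshold. That definition, the function name `activation_density`, and the toy data are assumptions made for illustration; they are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: density = (active positions / total positions) * 100.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy data: 10,000 token positions, 3 of which activate.
toy = [0.0] * 10_000
toy[5] = 1.2
toy[42] = 0.3
toy[9_000] = 2.7

print(f"{activation_density(toy):.3f}%")  # prints 0.030%
```

Under this reading, a density of 0.034% would correspond to the feature firing on roughly 34 out of every 100,000 token positions.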