Explanations
words related to control or influence
expressions related to control and influence over various entities or aspects
Negative Logits
ullah     -0.76
rehend    -0.67
ritz      -0.64
onna      -0.64
udes      -0.64
mpeg      -0.63
mart      -0.63
reet      -0.62
uncle     -0.61
offer     -0.60
Positive Logits
reins        0.90
levers       0.89
redist       0.78
eering       0.77
steering     0.76
destiny      0.75
finances     0.71
orship       0.70
succession   0.70
wheel        0.68
Activations Density: 0.150%