Explanations
verbs related to actions of force or compulsion
words related to driving or motivation
Negative Logits
Seym           -0.96
ereo           -0.74
çĦ             -0.74
roma           -0.70
umbn           -0.68
aido           -0.67
yip            -0.64
iao            -0.64
iannopoulos    -0.64
Lumpur         -0.63
Positive Logits
driving        0.86
away           0.84
wedge          0.78
bike           0.74
driving        0.74
wheel          0.74
train          0.72
away           0.71
ousel          0.69
driven         0.68
Activations Density 0.036%