INDEX
Explanations
verbs related to deliberate actions performed with a specific purpose
verbs associated with actions and operations affecting systems or situations
Negative Logits
tone        -0.67
conn        -0.64
zo          -0.58
gins        -0.58
eur         -0.56
SW          -0.53
ال          -0.51
ricks       -0.51
nex         -0.51
ele         -0.50
Positive Logits
ometimes    1.02
omething    0.92
hift        0.89
paces       0.78
ynthesis    0.76
ilver       0.75
hirt        0.73
heet        0.73
ettings     0.69
terness     0.66
Activations Density 0.390%