Explanations
instances of persuasion or convincing in various contexts
Negative Logits
gie      -0.17
olen     -0.15
igest    -0.14
fü       -0.14
ICI      -0.14
883      -0.14
iese     -0.14
DMI      -0.14
yms      -0.14
ifice    -0.13
Positive Logits
convin        0.24
convinced     0.22
convince      0.22
Conv          0.21
ively         0.20
conv          0.20
persu         0.19
convincing    0.19
Conv          0.18
persuade      0.18
Activations Density 0.021%
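
The page does not define the density figure. One common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is above zero. Under that reading, 0.021% means the feature fires on roughly one token in 4,800. The sketch below illustrates the calculation; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the tool that produced this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    is active. The source page does not state its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 21 active tokens out of 100,000, chosen only so the
# result matches the 0.021% figure reported above.
acts = [0.0] * 100_000
for i in range(21):
    acts[i * 4_000] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.021%
```

A feature this sparse is consistent with the narrow explanation given above (persuasion and convincing), though the density alone does not confirm that the explanation is accurate.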