Explanations
Instances of passive voice and references to individuals in various contexts.
Negative Logits
.flink      -0.17
Filled      -0.17
argon       -0.15
iedo        -0.15
licer       -0.15
ÙħÙĦÙĬ      -0.14
andest      -0.14
onders      -0.14
lassian     -0.14
ãĥ¼ãĥĨ      -0.14
Positive Logits
selected    0.29
allowed     0.28
admitted    0.27
chosen      0.27
given       0.26
brought     0.25
sent        0.25
selected    0.24
allowed     0.23
appointed   0.22
Activations Density: 0.444%