INDEX
Explanations
instances of simultaneous actions or states involving multiple participants
Negative Logits
951        -0.17
378        -0.16
207        -0.14
ensis      -0.14
Marble     -0.14
709        -0.14
642        -0.14
portion    -0.14
ached      -0.14
ober       -0.14
Positive Logits
dere       0.19
omon       0.15
vez        0.15
ľ´         0.15
ey         0.14
DownLatch  0.14
ledon      0.14
fart       0.14
ongo       0.13
reeze      0.13
Activations Density: 0.520%