INDEX
Explanations
instances of the word "speak" or its variations in various contexts
Negative Logits
wide    -0.15
ra      -0.15
fact    -0.15
ecycle  -0.15
Berger  -0.14
gart    -0.14
ft      -0.14
Gould   -0.14
wide    -0.14
igan    -0.13
Positive Logits
mixins            0.15
gın               0.15
=-=-=-=-=-=-=-=-  0.14
θι                0.14
umas              0.14
atol              0.14
mixin             0.14
YPRE              0.14
_mB               0.13
ync               0.13
Activations Density 0.021%