INDEX
Explanations
references to quantities or measurements involving the number two
Negative Logits
all         -0.21
242         -0.16
onda        -0.16
various     -0.16
aura        -0.16
overlaps    -0.15
DIS         -0.15
rees        -0.14
ior         -0.14
ALL         -0.14
Positive Logits
-two            0.28
two             0.26
两个            0.26
respectively    0.25
分别            0.25
beide           0.24
two             0.24
两人            0.23
Both            0.23
beiden          0.22
Activations Density 0.825%
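
The density figure above can be read as the share of sampled token positions on which this feature fires. The sketch below assumes that reading (activation strictly above a threshold of zero). The function name, the threshold parameter, and the sample values are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active; the dashboard does not state its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1,000 sampled token positions, 8 of which fire.
sample = [0.0] * 992 + [1.3, 0.4, 2.1, 0.9, 0.2, 3.0, 0.7, 1.1]
density = activation_density(sample)
print(f"{density:.3%}")  # 0.800%
```

Under this reading, a density of 0.825% would mean the feature is active on roughly 8 of every 1,000 sampled tokens.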