Negative Logits
.We          -0.08
Olson        -0.07
expenses     -0.07
expense      -0.07
�            -0.07
Coal         -0.07
WK           -0.06
Standards    -0.06
编           -0.06
الص          -0.06
Positive Logits
mirror       0.16
mirrors      0.13
Mir          0.13
Mirror       0.13
Mir          0.11
Miranda      0.11
mir          0.10
mirror       0.10
Mirror       0.09
Ariel        0.08
Activations Density 0.006%