Negative Logits
Token          Logit
stakes         -0.07
heap           -0.07
displacement   -0.07
selection      -0.07
oxidative      -0.06
collect        -0.06
wię            -0.06
Income         -0.06
783            -0.06
punk           -0.06
Positive Logits
Token          Logit
mirrors        0.11
Mirror         0.11
mirror         0.10
Mirror         0.08
mirror         0.08
러             0.08
mirrored       0.07
र              0.07
镜             0.07
鏡             0.07
Activations Density: 0.004%