INDEX
NEGATIVE LOGITS
Plate        -0.81
comet        -0.79
хьтан        -0.79
propOrder    -0.78
plate        -0.75
Comet        -0.74
spotify      -0.72
saraba       -0.71
Plate        -0.69
FRAME        -0.68
POSITIVE LOGITS
word         0.50
post         0.47
war          0.46
iness        0.43
setcounter   0.43
proceeds     0.42
let          0.42
load         0.42
/            0.41
times        0.41
Activations Density: 0.251%