INDEX
NEGATIVE LOGITS
telev      -0.07
fruitful   -0.07
Stop       -0.07
gonna      -0.07
kuvvet     -0.07
技         -0.07
Spot       -0.07
-song      -0.06
万         -0.06
;set       -0.06
POSITIVE LOGITS
adher      0.10
adhere     0.09
adherence  0.08
devices    0.07
aders      0.07
Refer      0.07
herent     0.06
Neither    0.06
Responses  0.06
addresses  0.06
Activations Density 0.004%