INDEX
Negative Logits
staining      -0.08
ийн           -0.08
ેપ            -0.08
greatness     -0.08
aromatic      -0.07
trekt         -0.07
,response     -0.07
więc          -0.07
kæ            -0.07
vraagt        -0.07
Positive Logits
disguised     0.15
guise         0.15
disguis       0.13
innoc         0.13
disguise      0.11
unsus         0.11
deceptive     0.10
decept        0.10
camouflage    0.10
embed         0.10
Activations Density: 0.085%