Negative Logits
或          -0.07
rush        -0.07
arser       -0.07
reforms     -0.07
strat       -0.06
ard         -0.06
dete        -0.06
Por         -0.06
発売        -0.06
را          -0.06
Positive Logits
Obviously   0.12
Obviously   0.11
obviously   0.09
FOX         0.07
smarty      0.07
viously     0.07
Něm         0.07
Clearly     0.06
Eb          0.06
ochrome     0.06
Activations Density 0.006%