INDEX
Negative Logits
Token        Logit
location     -0.07
-list        -0.07
locations    -0.06
苗           -0.06
unary        -0.06
sunset       -0.06
Mann         -0.06
nim          -0.06
worm         -0.06
بول          -0.06
Positive Logits
Token        Logit
ethical      0.11
ethics       0.11
Eth          0.09
ethical      0.09
Ethics       0.09
esthetic     0.08
unethical    0.08
Eth          0.08
фил          0.08
ethic        0.08
Activations Density: 0.016%