INDEX
NEGATIVE LOGITS
neurons      -0.07
Erdoğan      -0.07
�            -0.06
spend        -0.06
calming      -0.06
کمتر         -0.06
偷           -0.06
contagious   -0.06
extr         -0.06
araoh        -0.06
POSITIVE LOGITS
Glob         0.07
tg           0.07
�璃          0.07
schizophren  0.07
bab          0.06
Tem          0.06
(';          0.06
gentle       0.06
งค           0.06
(tk          0.06
Activations Density 0.001%