Explanations
names of people and their affiliations
Negative Logits
roit        -0.16
arget       -0.15
onium       -0.15
unta        -0.14
_drop       -0.14
irate       -0.14
pio         -0.14
الرمزية     -0.14
émon        -0.14
éĽ          -0.14
Positive Logits
Morav       0.22
Pour        0.19
pour        0.18
Rae         0.17
aliz        0.17
sole        0.16
Pour        0.16
結          0.16
ollah       0.16
iasi        0.16
Activations Density 0.029%