INDEX
Explanations
prominent figures in the entertainment industry
Negative Logits
sole          -0.15
venir         -0.14
аніт          -0.13
anship        -0.13
esign         -0.13
arus          -0.13
inheritance   -0.13
awi           -0.13
raç           -0.13
malink        -0.13
Positive Logits
himself       0.16
fans          0.15
ův            0.15
fty           0.15
fan           0.14
Zusammen      0.14
udad          0.14
famously      0.13
_VARS         0.13
тин           0.13
Activations Density 0.179%