INDEX
Explanations
names of people, potentially related to politics or sports
Negative Logits
ainment     -0.87
udeau       -0.82
rera        -0.79
itu         -0.74
Soccer      -0.72
ostics      -0.71
anooga      -0.69
izoph       -0.68
otten       -0.68
uba         -0.67
Positive Logits
etheless     0.83
erate        0.74
wyn          0.72
swer         0.72
accompan     0.71
carbohyd     0.71
bye          0.71
quartered    0.71
ed           0.69
escription   0.68
Activations Density 0.043%