INDEX
Explanations
names of political leaders
Negative Logits
Token           Logit
DragonMagazine  -0.90
pole            -0.76
agents          -0.70
pmwiki          -0.70
Offline         -0.66
Redditor        -0.65
00200000        -0.64
00000000        -0.63
ryu             -0.62
00007           -0.62
Positive Logits
Token     Logit
Jinping   0.94
Mans      0.74
imar      0.73
Aden      0.72
Sar       0.71
llor      0.71
Salman    0.67
Rut       0.66
Tr        0.66
Marshall  0.66
Activations Density: 0.155%