INDEX
Explanations
names of famous individuals and entities
Negative Logits
USSR         -0.70
Lanka        -0.68
FACE         -0.66
communism    -0.65
Communism    -0.65
Reviewer     -0.65
Argentina    -0.65
»Ĵ           -0.63
perty        -0.62
Egyptians    -0.62
Positive Logits
andowski     0.85
iggs         0.82
oyer         0.77
ache         0.73
aghan        0.72
govtrack     0.72
ensing       0.71
ulo          0.71
ough         0.70
auld         0.68
Activations Density: 0.099%