INDEX
Explanations
names of individuals or companies
names of individuals and organizations associated with media or political contexts
Negative Logits
perature    -0.74
eworld      -0.73
worldly     -0.71
76561       -0.71
aries       -0.69
ethy        -0.67
orical      -0.65
ories       -0.65
unal        -0.64
quot        -0.63
Positive Logits
Perkins     1.08
burgh       0.86
ledged      0.77
nown        0.76
bury        0.76
burg        0.75
linger      0.75
Finch       0.73
heim        0.73
kins        0.73
Activations Density 0.010%