INDEX
Explanations
words related to politics and government
references to names, organizations, and items related to sports and research
Negative Logits
ournals    -0.66
camel      -0.52
Aram       -0.49
slump      -0.48
ilyn       -0.47
cour       -0.46
Pixar      -0.46
amar       -0.46
MAC        -0.46
tails      -0.46
Positive Logits
ibrary     0.70
abulary    0.60
overe      0.60
achev      0.60
henko      0.57
ombat      0.57
lycer      0.57
ヘラ       0.55
QL         0.55
gage       0.53
Activations Density 1.043%