INDEX
Explanations
references to specific individuals and their influences in discussions
Negative Logits
timeline        -0.19
impactful       -0.18
iconic          -0.17
backstory       -0.17
onboard         -0.16
headquartered   -0.16
pairwise        -0.16
WWII            -0.16
artwork         -0.16
standalone      -0.15
Positive Logits
foregoing       0.22
boasted         0.21
present         0.21
meshes          0.20
pretended       0.20
especial        0.20
whole           0.19
utmost          0.19
chief           0.18
connexion       0.18
Activations Density: 0.260%