    Explanations

    blog post openings

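    Negative Logits
    Token                 Logit
    (token not shown)     -0.08
    " шы"                 -0.08
    "Bog"                 -0.08
    " прост"              -0.08
    " beware"             -0.07
    "мен"                 -0.07
    " tart"               -0.07
    (token not shown)     -0.07
    " gotta"              -0.07
    " Harrison"           -0.07

    Positive Logits
    Token                 Logit
    " ofrec"              0.08
    " MMS"                0.08
    ".Then"               0.08
    (token not shown)     0.07
    " â"                  0.07
    "riers"               0.07
    "ζί"                  0.07
    " outright"           0.07
    " Crist"              0.07
    " ope"                0.07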
    Act Density 0.251%

    No Known Activations