    Explanations

    arts, entertainment & tech

    Negative Logits
    Token               Logit
    " pleaſure"         -1.07
    " avoient"          -1.02
    " ſche"             -1.00
    " étoient"          -0.98
    " itſelf"           -0.96
    " étoit"            -0.95
    " houſe"            -0.94
    " purpoſe"          -0.92
    " pouvoit"          -0.91
    " moschino"         -0.91
    Positive Logits
    Token               Logit
    " of"               0.63
    ","                 0.57
    " be"               0.57
    "ed"                0.54
    " qu"               0.54
    "ftagPool"          0.54
    " in"               0.53
    " more"             0.53
    "WriteTagHelper"    0.52
    "')]"               0.50
    Act Density 0.073%

    No Known Activations