    Explanations

    studies/demographics

    Negative Logits
     girls          -1.18
     itſelf         -1.16
     boys           -1.13
     Shakspeare     -1.12
     myſelf         -1.10
     дописавши      -1.08
    aarrggbb        -1.05
     Monfieur       -1.05
     ladies         -1.02
     ſche           -1.01
    Positive Logits
     in             0.75
     to             0.58
     a              0.54
    ,               0.54
     the            0.54
     ex             0.52
     l              0.50
     la             0.49
     and            0.48
     an             0.48
    Act Density 0.127%

    No Known Activations