INDEX
    Explanations
    NEGATIVE LOGITS
    doin                -1.39
     houſe              -0.97
     itſelf             -0.96
     Shakspeare         -0.96
     Efq                -0.89
     contextLoads       -0.89
     Jefus              -0.89
    ſelf                -0.88
     Houſe              -0.88
     AssemblyCulture    -0.86
    POSITIVE LOGITS
    ap                  0.59
    ley                 0.58
    um                  0.58
    av                  0.58
    am                  0.57
     to                 0.56
    age                 0.56
    ak                  0.54
    ach                 0.54
    ly                  0.53
    Activation Density: 0.083%

    No Known Activations