INDEX
    Explanations

    numbers preceded by apostrophes

    negations or words indicating what is not true or should not happen

    Negative Logits
     laun          -0.75
     mortg         -0.70
     Kirin         -0.68
     Cinderella    -0.67
     helicop       -0.67
     princ         -0.67
     nomine        -0.66
     Duo           -0.58
     Powered       -0.58
     convol        -0.56
    Positive Logits
    't             1.58
    aturally       1.17
    itely          1.16
    aught          1.09
    ately          1.07
    ought          0.92
    fortunately    0.91
    omore          0.90
    ighter         0.87
    í              0.86
    Activation Density: 0.016%

    No Known Activations