INDEX
    Explanations

    questions or statements containing contractions

    Negative Logits
    ortunately   -0.91
     withd       -0.86
     exha        -0.83
     rall        -0.82
     newcom      -0.78
    anwhile      -0.78
     eleph       -0.74
    Þ            -0.71
     exting      -0.71
     confir      -0.70
    Positive Logits
    't      1.67
    ny      0.92
    ovan    0.90
    ada     0.87
    ÃŃ      0.86
    ned     0.78
    athan   0.78
    ALD     0.77
    thia    0.77
    na      0.77
    Activation Density: 0.055%

    No Known Activations