INDEX
    Explanations

    phrases starting with "According" followed by information or attribution

    statements that attribute information or claims to sources

    Negative Logits
    Token          Logit
    "DOWN"         -0.62
    " godd"        -0.58
    " heights"     -0.53
    "âϦ"           -0.53
    " eleph"       -0.53
    " mathemat"    -0.53
    " helicop"     -0.52
    " swe"         -0.51
    " gobl"        -0.50
    " submar"      -0.50
    Positive Logits
    Token          Logit
    "ly"           1.14
    " to"          0.95
    "edly"         0.86
    "liest"        0.82
    "gest"         0.75
    "lly"          0.74
    "itionally"    0.68
    "translation"  0.68
    "LY"           0.67
    "ities"        0.66
    Act Density 0.035%
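
As a rough illustration of what an activation density figure of this kind typically means, the sketch below computes the fraction of token positions on which a feature is active. It assumes density is defined as "activation above a threshold, divided by total tokens"; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a position counts as "active" when its activation is
    strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 7 active positions out of 20,000 tokens.
acts = [0.0] * 20000
for i in range(7):
    acts[i * 1000] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")
```

With these made-up numbers, 7 active positions in 20,000 tokens gives a density of 0.00035, which is the same order as the 0.035% shown here.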

    No Known Activations