INDEX
    Explanations

    phrases emphasizing the presence or absence of factors

    Negative Logits
     been
    -0.56
     simple
    -0.53
    ว่า
    -0.53
    mat
    -0.52
    not
    -0.52
    awtextra
    -0.52
     thought
    -0.51
    JUnit
    -0.51
     Kitch
    -0.51
     this
    -0.50
    Positive Logits
    LookAnd
    0.80
     beginnetje
    0.72
     חיצוניים
    0.71
     سكانية
    0.68
     Heere
    0.68
     Wass
    0.66
     impunity
    0.66
    AndEndTag
    0.66
     combineReducers
    0.65
    Statistiche
    0.65
    Act Density 0.008%

    No Known Activations