INDEX
    Explanations

    phrases indicating opinion or intent

    Negative Logits
    Token              Logit
    "PerformLayout"    -0.71
    "ylvan"            -0.71
    "NonQuery"         -0.65
    "LookAnd"          -0.63
    "sprozess"         -0.62
    "AFFIRMED"         -0.61
    " Tasche"          -0.60
    " Races"           -0.60
    " Delayed"         -0.60
    "っぱり"           -0.59
    Positive Logits
    Token       Logit
    " means"    1.42
    " mean"     1.36
    "means"     1.28
    " Means"    1.20
    "Means"     1.17
    " MEANS"    1.04
    " Mean"     0.98
    "mean"      0.98
    "Mean"      0.94
    " meant"    0.94
    Activation Density: 0.109%

    No Known Activations