INDEX
    Explanations

    phrases indicating certainty or confidence

    phrases indicating certainty or inevitability

    Negative Logits
    ufact          -0.77
    vernment       -0.75
    edia           -0.66
    intosh         -0.64
    gdala          -0.63
    annis          -0.61
    INT            -0.59
    olid           -0.58
    asus           -0.58
    akespeare      -0.57
    Positive Logits
    ties            1.13
    fire            0.98
    footed          0.95
    ty              0.94
    sk              0.89
    faced           0.86
    stre            0.73
    stall           0.72
    blade           0.71
    ples            0.71
    Activation Density: 0.032%

    No Known Activations