INDEX
    Explanations

    adjectives describing qualities or characteristics

    NEGATIVE LOGITS
    HAEL         -0.74
    ULTS         -0.71
    anwhile      -0.67
    interrupted  -0.65
     destro      -0.63
    lished       -0.63
    kefeller     -0.61
    ELL          -0.61
    BAT          -0.60
    arks         -0.60
    POSITIVE LOGITS
    entially  0.84
    able      0.79
    ative     0.78
    ウス      0.77
    ically    0.74
    liness    0.72
    oscope    0.70
    phabet    0.70
    abouts    0.69
    hing      0.68
    Act Density 0.015%

    No Known Activations