INDEX
    Explanations

    informative statements or explanations

    verbs and phrases indicating the provision of information or summaries

    Negative Logits
    "Initialized"    -0.77
    "azo"            -0.76
    "doms"           -0.72
    "mare"           -0.68
    "assault"        -0.68
    "psc"            -0.66
    "LINE"           -0.64
    "amphetamine"    -0.64
    "AME"            -0.64
    "ahu"            -0.63
    Positive Logits
    " examples"       1.24
    " insights"       1.16
    " insight"        1.12
    " explanations"   1.11
    " detailed"       1.09
    " links"          1.07
    " insightful"     1.03
    " concise"        1.03
    " pointers"       1.03
    " descriptions"   1.02
    Activation Density: 0.246%

    No Known Activations