INDEX
    Explanations

    statements of affirmation or claims of dominance

    Negative Logits
    esser             -0.19
    .Restr            -0.15
    ics               -0.15
    .travel           -0.14
    StateException    -0.14
    INTR              -0.14
    erman             -0.13
    ειο               -0.13
    stick             -0.13
    TERM              -0.13
    Positive Logits
    245                0.17
    alat               0.17
    urd                0.15
    baugh              0.15
    ance               0.15
    andalone           0.15
    648                0.14
    267                0.14
    /question          0.14
    inker              0.14
    Activation Density: 0.012%

    No Known Activations