    Explanations

    adjectives or verbs indicating a change in ease or difficulty of a certain action

    phrases indicating the relative difficulty or ease of various actions or regulations

    Negative Logits
    Token            Logit
    "notations"      -0.65
    "Originally"     -0.63
    "ilo"            -0.63
    "leigh"          -0.60
    "agraph"         -0.60
    "milo"           -0.60
    "Blu"            -0.58
    " Difference"    -0.57
    "wings"          -0.56
    "kind"           -0.56
    Positive Logits
    Token            Logit
    " for"           0.79
    " to"            0.78
    " punishable"    0.71
    "IBLE"           0.67
    "ible"           0.65
    "enged"          0.64
    " prey"          0.63
    "for"            0.63
    " easier"        0.62
    "anced"          0.62
    Activation Density: 0.076%

    No Known Activations