INDEX
    Explanations

    phrases related to enabling actions or functionalities

    phrases indicating the functionality or capabilities of tools and applications

    Negative Logits
    "boy"        -0.79
    "ta"         -0.71
    "borough"    -0.70
    "bil"        -0.67
    "wa"         -0.67
    "xon"        -0.64
    "source"     -0.64
    "bons"       -0.64
    "tone"       -0.60
    "town"       -0.59
    Positive Logits
    "Reviewer"        0.98
    "geries"          0.90
    "Allows"          0.83
    "uces"            0.77
    " us"             0.77
    "hift"            0.73
    " withdrawals"    0.72
    "ibaba"           0.72
    "bidden"          0.71
    "ences"           0.71
    Activation Density 0.043%

    No Known Activations