    Explanations

    accounts or statements made by spokespersons

    the word "spokesman" and its variations in reporting contexts

    Negative Logits
    Token          Logit
    "edu"          -0.77
    "Reward"       -0.73
    " gorge"       -0.72
    "notations"    -0.71
    "llah"         -0.70
    " repaid"      -0.69
    "ptions"       -0.67
    "venants"      -0.65
    "ibe"          -0.64
    "hu"           -0.64
    Positive Logits
    Token          Logit
    "bidden"       0.96
    " instance"    0.82
    " example"     0.81
    "gotten"       0.80
    " STATS"       0.78
    "cing"         0.78
    " managing"    0.78
    "cers"         0.77
    " Sierra"      0.76
    " defending"   0.75
    Activation Density: 0.105%

    No Known Activations