INDEX
    Explanations

    references to a specific news agency

    references to a specific news outlet

    Negative Logits
    aire         -0.82
    ary          -0.80
    istics       -0.78
    istas        -0.75
    mans         -0.71
    selves       -0.70
    ende         -0.68
    acters       -0.68
     Thrones     -0.66
    ista         -0.66
    Positive Logits
    BILITY       1.30
    BLE          1.10
    BILITIES     1.03
    zza          0.99
    ULT          0.95
    EA           0.90
    ircraft      0.85
    xia          0.85
    HL           0.84
    ccess        0.83
    Act Density 0.020%

    No Known Activations