INDEX
    Explanations

    words related to espionage or spy activities

    Negative Logits
        urses     -0.79
        ickr      -0.77
        ktop      -0.76
        aii       -0.72
        keye      -0.71
        artney    -0.69
        TAIN      -0.69
        agne      -0.68
        ourse     -0.65
        tin       -0.65
    Positive Logits
        moon       1.03
        loo        0.95
        boxing     0.94
        runners    0.90
        fax        0.83
        stats      0.82
        flame      0.81
        hun        0.77
        shadow     0.75
        runner     0.74
    Activation Density: 0.016%

    No Known Activations