INDEX
    Explanations
    No Explanations Found
    Negative Logits
    " hostage"    -0.72
    "pend"        -0.69
    "jad"         -0.64
    "prison"      -0.63
    "away"        -0.62
    "cock"        -0.61
    "commit"      -0.61
    "---------"   -0.61
    "times"       -0.60
    "Bang"        -0.60
    Positive Logits
    "chery"       0.78
    "eria"        0.76
    " Beir"       0.75
    "ajo"         0.75
    "iculture"    0.70
    "kefeller"    0.70
    " Shinra"     0.68
    "icter"       0.68
    "ierre"       0.67
    "allery"      0.66
    Activation Density: 0.000%

    No Known Activations

    This feature has no known activations.