    Explanations

    phrases involving the word "take" or variations of it

    Negative Logits
    "thern"     -0.17
    "uju"       -0.15
    "udas"      -0.15
    "jad"       -0.15
    "ilater"    -0.15
    "taj"       -0.15
    "PU"        -0.15
    "idth"      -0.15
    "rine"      -0.15
    "ime"       -0.14
    Positive Logits
    " advantage"    0.36
    "aways"         0.24
    " seriously"    0.22
    "uchi"          0.20
    " advant"       0.20
    " refuge"       0.20
    " charge"       0.20
    " Liberties"    0.19
    " Advantage"    0.19
    " liberties"    0.18
    Activation Density: 0.113%

    No Known Activations