INDEX
    Explanations

    actions related to buying, playing, or making decisions

    Negative Logits
    Token          Logit
    "y"            -0.51
    "ho"           -0.44
    "</strong>"    -0.40
    "ya"           -0.39
    "ia"           -0.39
    "ds"           -0.38
    "?"            -0.38
    "enson"        -0.37
    "<eos>"        -0.36
    " del"         -0.36

    Positive Logits
    Token          Logit
    " canst"       0.89
    "ロウィン"      0.68
    " queſta"      0.66
    " pouvoit"     0.66
    "AndEndTag"    0.65
    " potest"      0.65
    "ſelves"       0.64
    "outheast"     0.63
    "[@BOS@]"      0.62
    "<unused8>"    0.62
    Activation Density: 0.076%

    No Known Activations