INDEX
    Explanations

    conditional phrases and expressions of uncertainty or past choices

    Negative Logits
    ymes      -0.16
    éis       -0.15
    gili      -0.14
    íĥģ       -0.14
    engin     -0.14
    nox       -0.14
    dont      -0.14
    wil       -0.14
    ytt       -0.14
    jen       -0.14
    Positive Logits
    've       0.65
    ’ve       0.52
    a         0.44
    'a        0.44
    ve        0.41
    ’a        0.35
    'd        0.34
    а         0.27
    ta        0.27
    da        0.27
    Activation Density: 0.124%

    No Known Activations