INDEX
    Explanations

    affirmative responses or confirmations to questions

    NEGATIVE LOGITS
    "ittel"       -0.17
    "umba"        -0.15
    " Esp"        -0.14
    "ej"          -0.14
    "hyth"        -0.14
    "aly"         -0.14
    "ÙĨج"         -0.14
    "Attrib"      -0.14
    "arty"        -0.14
    "ez"          -0.14
    POSITIVE LOGITS
    "óst"         0.14
    " mo"         0.14
    " Lehr"       0.14
    "_reduction"  0.14
    "oslav"       0.13
    "éİ"          0.13
    "oproject"    0.13
    "ordes"       0.13
    "Trace"       0.13
    "reak"        0.13
    Act Density 0.101%

    No Known Activations