    Explanations

    auxiliary verbs

    Negative Logits
    "— "            -0.78
    "NUMX"          -0.75
    "ſelf"          -0.74
    " Efq"          -0.74
    (not shown)     -0.73
    "}{*}{}"        -0.72
    " سكانية"       -0.71
    "mặt"           -0.69
    " itſelf"       -0.69
    "± "            -0.67
    POSITIVE LOGITS
    "↵↵"            0.70
    "<eos>"         0.65
    " adipis"       0.58
    "1"             0.57
    "cena"          0.54
    " Erişim"       0.54
    " *"            0.54
    "2"             0.54
    (not shown)     0.52
    "8"             0.49
    Act Density 0.123%

    No Known Activations