INDEX
    Explanations

    references to frequency and quantity

    Negative Logits
    " resp"    -0.16
    "svc"      -0.15
    "dict"     -0.15
    "exion"    -0.14
    "ué"       -0.14
    "kir"      -0.14
    "proof"    -0.14
    "esh"      -0.14
    "rette"    -0.14
    "erval"    -0.13
    Positive Logits
    "rát"        0.17
    "gere"       0.16
    "fold"       0.16
    "ignKey"     0.15
    " Garrison"  0.15
    "/all"       0.15
    "igu"        0.14
    " Tento"     0.14
    "تز"         0.14
    "ня"         0.14
    Activation Density: 0.124%

    No Known Activations