    Explanations

    seriously hurt / borderline / like

    Negative Logits
    Token           Logit
    "↵↵"            0.91
    " "             0.78
    "."             0.75
    " ("            0.73
    " none"         0.67
    " ink"          0.67
    ","             0.65
    " s"            0.64
                    0.62
    " p"            0.61
    Positive Logits
    Token           Logit
    "Concini"       1.31
    "abhavena"      1.28
    "akkhanam"      1.27
    "ټبال"          1.27
    "<unused114>"   1.26
    "avasena"       1.23
    "abhuto"        1.23
    "<unused78>"    1.22
                    1.20
    "ifères"        1.20
    Activation Density: 0.001%

    No Known Activations