    Explanations
    NEGATIVE LOGITS
    Token            Logit
    " "              0.95
    " a"             0.91
    " actual"        0.91
    " interesting"   0.86
    " b"             0.85
    " pe"            0.82
    " specific"      0.82
    " food"          0.80
    " be"            0.78
    " counter"       0.77

    POSITIVE LOGITS
    Token            Logit
    "<unused463>"    1.30
    "ِمض"            1.29
    "<unused2057>"   1.23
    "<unused207>"    1.23
    "<unused422>"    1.23
    "<unused145>"    1.20
    "<unused524>"    1.20
    "್ಣ"             1.19
    "\|,"            1.18
    "<unused1809>"   1.18

    Activation Density: 0.000%

    No Known Activations