INDEX
    Explanations

    instances of specific types or classifications

    NEGATIVE LOGITS
    ATEST      -0.15
    omic       -0.15
    ship       -0.15
    ships      -0.14
    ű          -0.14
    iolet      -0.14
    ninger     -0.14
    olet       -0.14
    μμ         -0.13
    idth       -0.13
    POSITIVE LOGITS
    ppelin     0.17
    nga        0.15
     Maz       0.15
    yor        0.15
    ragon      0.14
     Daly      0.14
    .cum       0.14
     overs     0.13
    erner      0.13
     æ¼Ķ       0.13
    Act Density 0.003%

    No Known Activations