    Explanations

    terms related to equivalence and classification

    Negative Logits
    Token           Logit
    "antu"          -0.14
    "onyms"         -0.14
    "á»ĥ"           -0.14
    " orth"         -0.14
    "ìķĶ"           -0.13
    "imedia"        -0.13
    " Checklist"    -0.13
    " Orth"         -0.13
    "istrov"        -0.12
    "_CRITICAL"     -0.12

    Positive Logits
    Token           Logit
    "oc"            0.80
    " oc"           0.77
    "OC"            0.72
    " Oc"           0.69
    " OC"           0.68
    "ok"            0.60
    "_oc"           0.59
    ".oc"           0.59
    " occ"          0.57
    "occ"           0.57
    Activation Density: 0.188%

    No Known Activations