INDEX
    Explanations

    classes and related terminology in programming context

    Negative Logits
    _classifier        -0.16
    tember             -0.16
    chap               -0.16
    _classification    -0.15
    onta               -0.15
    elson              -0.15
    vt                 -0.15
    tele               -0.15
    affles             -0.15
    eous               -0.14

    Positive Logits
    ses                0.39
    mate               0.37
    (es                0.36
    room               0.35
    rooms              0.35
    ifications         0.35
    似                 0.34
    ifying             0.34
    ifies              0.32
    ically             0.29

    Act Density 0.076%

    No Known Activations