INDEX
    Explanations

    words and classes related to programming constructs and types in a large codebase

    Negative Logits
    ſelves  -0.95
     itſelf  -0.86
     myſelf  -0.85
    ſelf  -0.83
     ſtate  -0.83
     pleaſure  -0.82
     fevere  -0.80
     Majefty  -0.79
     ſever  -0.79
     houſe  -0.79
    Positive Logits
    []  0.67
     aux  0.56
     i  0.56
     itu  0.54
    inex  0.54
     p  0.52
     Nature  0.51
     nature  0.51
    ós  0.50
    Gön  0.49
    Act Density 0.308%

    No Known Activations