    Explanations

    instances where something is being done "properly."

    instances of the word "properly" used in various contexts

    Negative Logits
    Token                 Logit
    "=-=-=-=-=-=-=-=-"    -0.79
    "RY"                  -0.70
    "————————"            -0.69
    "_>"                  -0.67
    "oleon"               -0.67
    "yi"                  -0.67
    "————————————————"    -0.65
    " spont"              -0.65
    "tery"                -0.65
    "ites"                -0.64
    Positive Logits
    Token             Logit
    " behaved"        0.94
    " exting"         0.92
    " fitted"         0.89
    " formatted"      0.86
    " aligned"        0.86
    " equipped"       0.86
    " initialized"    0.85
    " segregated"     0.84
    " positioned"     0.84
    " suited"         0.83
    Activation Density: 0.012%

    No Known Activations