NEGATIVE LOGITS
    Token         Value
    "."           3.22
                  2.14
    ":"           2.14
    ","           2.02
                  1.99
    ";"           1.96
    "("           1.90
    ".<"          1.87
    ".`"          1.86
    ".—"          1.85
POSITIVE LOGITS
    Token         Value
    " housed"     2.27
    "Labeled"     2.10
    " comprised"  2.08
    " devoid"     2.01
    " idling"     1.99
    " nameless"   1.98
    "composed"    1.97
    " muffled"    1.95
    " lumped"     1.92
    " lacking"    1.91
    Activation Density: 0.347%

    No Known Activations