INDEX
    Explanations
    NEGATIVE LOGITS
    Token     Logit
    "real"    -0.08
    "it"      -0.08
    "i"       -0.08
    "or"      -0.07
    "'It"     -0.07
    "Io"      -0.07
    "Real"    -0.07
    "áf"      -0.07
    "r"       -0.07
    "bi"      -0.07
    POSITIVE LOGITS
    Token               Logit
    "ness"              0.12
    "iveness"           0.10
    "eness"             0.10
    "Mass"              0.10
    " Madness"          0.09
    " Mass"             0.09
    "NESS"              0.08
    " effectiveness"    0.08
    " Thickness"        0.08
    "ess"               0.08
    Activation Density: 0.048%

    No Known Activations