    Explanations
    No Explanations Found
    Negative Logits
    Token                 Logit
    " scient"             -0.70
    " technologically"    -0.69
    " awoken"             -0.66
    " penal"              -0.65
    " productive"         -0.62
    " copyrighted"        -0.62
    " psychologically"    -0.61
    "porter"              -0.61
    " reasonably"         -0.61
    " economically"       -0.60
    Positive Logits
    Token                 Logit
    " biscuits"           0.69
    "itone"               0.68
    "Shut"                0.64
    " Lion"               0.63
    " Doodle"             0.63
    "Stars"               0.63
    " Pavilion"           0.63
    "ertodd"              0.63
    "horn"                0.62
    "river"               0.62
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.