INDEX
    Explanations

    phrases that introduce or emphasize information

    Negative Logits
    Token                 Logit
    "SharedDtor"          -0.58
    " ogóle"              -0.55
    "MLLoader"            -0.53
    " verbatim"           -0.51
    " dlaczego"           -0.50
    " bienven"            -0.50
    " része"              -0.50
    " kenapa"             -0.49
    " Lindberg"           -0.49
    "LayoutConstraint"    -0.49
    Positive Logits
    Token                 Logit
    " With"               0.98
    "With"                0.94
    "有了"                0.79
    "ostante"             0.78
    "WITH"                0.69
    " ligiloj"            0.68
    " Avec"               0.67
    " knowing"            0.67
    "InputBorder"         0.67
    "enumeration"         0.65
    Activation Density: 0.091%

    No Known Activations