    Explanations

    that principle or core principle

    Negative Logits
        requires     1.28
        refers       1.28
        makes        1.26
        denotes      1.25
        indicates    1.24
        implies      1.22
        describes    1.22
        doesn        1.19
        suggests     1.18
        does         1.17
    Positive Logits
        с            0.73
        io           0.72
        ial          0.70
        у            0.64
        .\           0.64
        wonderful    0.61
        ։            0.60
        ut           0.60
        безпе        0.57
        .            0.57
    Activation Density: 0.302%

    No Known Activations