INDEX
    Explanations

    definite articles that signify emphasis or distinction

    Negative Logits
    Token          Logit
    " Very"        -0.17
    " VERY"        -0.16
    " very"        -0.16
    " muy"         -0.16
    " Basically"   -0.15
    "hazi"         -0.15
    "Very"         -0.15
    " lẽ"          -0.15
    "ania"         -0.15
    "енка"         -0.14
    Positive Logits
    Token            Logit
    " fault"         0.25
    " anymore"       0.25
    " necessarily"   0.25
    " usual"         0.21
    " sort"          0.21
    " slightest"     0.20
    " kind"          0.20
    " nor"           0.20
    " same"          0.19
    " norm"          0.19
    Activation Density: 0.055%

    No Known Activations