    Explanations

    numerical values related to significant measurements or quantities

    Negative Logits
    Token       Logit
    able        -0.17
    ands        -0.15
    ctrine      -0.15
    ering       -0.14
    edImage     -0.14
    iene        -0.14
    ucch        -0.14
    تÙħر        -0.14
    bere        -0.13
    ront        -0.13
    Positive Logits
    Token       Logit
    0           0.39
    00          0.29
    8           0.29
    9           0.29
    7           0.28
    5           0.27
    6           0.27
    4           0.26
    3           0.26
    2           0.24
    Activation Density: 0.138%
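
The density figure above is commonly read as the fraction of sampled tokens on which the feature fires. The following is a minimal sketch under that assumption; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 3 active tokens out of 1000.
sample = [0.0] * 997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.138% would mean the feature is active on roughly 1 or 2 tokens per thousand.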

    No Known Activations