INDEX
    Explanations

    references to total amounts or quantities

    Negative Logits
    Token         Logit
    "elu"         -0.18
    "es"          -0.17
    "link"        -0.17
    "etail"       -0.15
    "lie"         -0.15
    "995"         -0.14
    "ams"         -0.14
    "eyse"        -0.14
    "ile"         -0.14
    "kh"          -0.14
    Positive Logits
    Token         Logit
    "itarian"     0.27
    "led"         0.26
    "izers"       0.19
    "izador"      0.19
    " strangers"  0.19
    "LED"         0.19
    "oref"        0.19
    "isateur"     0.18
    "izing"       0.18
    "isers"       0.18
    Activation Density: 0.026%

    No Known Activations