INDEX
    Explanations

    expressions of personal sentiment or opinions

    Negative Logits
    Token         Logit
    "ocrates"     -0.16
    "cept"        -0.15
    "aby"         -0.14
    "ments"       -0.14
    "igin"        -0.14
    "urs"         -0.14
    "lse"         -0.14
    "obi"         -0.13
    "へと"        -0.13
    "ogens"       -0.13
    Positive Logits
    Token         Logit
    "reff"        0.16
    "ャ"          0.15
    "utomation"   0.15
    "rane"        0.15
    "EIF"         0.14
    " treff"      0.14
    "erli"        0.14
    "TRGL"        0.14
    "EAR"         0.14
    " ведь"       0.13
    Act Density 0.182%

    No Known Activations