INDEX
    Explanations

    text related to accents or special characters

    Negative Logits
    " Gemini"      -0.60
    " goodwill"    -0.60
    " hypers"      -0.59
    "龍契士"       -0.59
    " welcome"     -0.59
    " sensit"      -0.59
    "Topic"        -0.58
    " entitled"    -0.57
    " owl"         -0.56
    " noses"       -0.56

    Positive Logits
    "rm"           0.96
    "verend"       0.87
    "ggles"        0.84
    "ivil"         0.83
    "misc"         0.82
    "ternity"      0.81
    "bably"        0.78
    "odo"          0.77
    "minist"       0.77
    "vez"          0.77
    Activation Density: 0.050%

    No Known Activations