    Explanations

    references to specific categories or types of items

    references to examples or types of things

    Negative Logits
    Token           Logit
    "olute"         -0.68
    "YR"            -0.68
    "inka"          -0.62
    " somew"        -0.61
    "onel"          -0.61
    "eland"         -0.58
    "olina"         -0.58
    "enger"         -0.57
    "orem"          -0.57
    "ysical"        -0.57
    Positive Logits
    Token           Logit
    "ties"          0.85
    "ities"         0.77
    "deals"         0.67
    "Flag"          0.63
    "sword"         0.62
    "inyl"          0.61
    "complex"       0.61
    "Arg"           0.60
    " gems"         0.58
    " outsourcing"  0.57
    Activation Density: 0.043%

    No Known Activations