    Negative Logits
    Token            Logit
    " welcome"       -0.73
    "are"            -0.71
    "ARE"            -0.65
    "erv"            -0.53
    " January"       -0.53
    "tare"           -0.52
    "aren"           -0.51
    " December"      -0.50
    " perception"    -0.50
    "REB"            -0.48
    Positive Logits
    Token                 Logit
    " للمعارف"            0.76
    "ftagPool"            0.73
    "adpleegd"            0.69
    "béco"                0.65
    " Himo"               0.63
    "expandindo"          0.63
    "OGND"                0.62
    "GEBURTSDATUM"        0.61
    " AssemblyProduct"    0.58
    " Мексичка"           0.57
    Activation Density: 1.602%

    No Known Activations