INDEX
    Explanations

    references to individuals and their actions or experiences

    Negative Logits
    anco            -0.15
    ëŀį             -0.15
    .archive        -0.15
    intelligence    -0.15
    leur            -0.14
    _IOC            -0.14
    ios             -0.14
    infeld          -0.14
    ullan           -0.14
    IO              -0.14
    Positive Logits
    636             0.17
    awai            0.17
     Herbert        0.15
    conto           0.15
    icÃŃ            0.14
    Ctl             0.14
    ux              0.14
     Newman         0.14
    Ticker          0.14
     effort         0.14
    Activation Density 0.004%

    No Known Activations