INDEX
    Explanations

    references to the founding and establishment of organizations or institutions

    Negative Logits
    " plus"         -0.14
    "plementation"  -0.14
    " åĥ"           -0.14
    "tas"           -0.14
    "iaux"          -0.13
    "itler"         -0.13
    " ones"         -0.13
    "illas"         -0.13
    " overload"     -0.13
    "ça"            -0.12
    Positive Logits
    " initially"    0.23
    "initial"       0.22
    "-initial"      0.21
    " inicial"      0.20
    " originally"   0.19
    " out"          0.19
    ".Initial"      0.18
    " Initially"    0.18
    "WithName"      0.17
    "aim"           0.17
    Activation Density: 0.121%

    No Known Activations