INDEX
    Explanations

    references to popularity or prevalence in a social context

    numbers, quantities, prices

    Negative Logits
     Tiefen        -0.38
     Schicks       -0.38
     Helden        -0.37
    ٔ              -0.35
     ahor          -0.33
     aldea         -0.33
     historically  -0.32
     Auss          -0.32
                   -0.31
     Ruhe          -0.31
    Positive Logits
     שוליים        0.81
     queſta        0.79
    rungsseite     0.73
    <unused41>     0.69
    <unused14>     0.68
    <unused8>      0.68
    <unused43>     0.68
    <unused47>     0.68
    [@BOS@]        0.68
    <unused3>      0.68
    Act Density 0.001%

    No Known Activations