    Explanations

    phrases introducing examples or additional points

    Negative Logits
    Token            Logit
    "258"            -0.13
    " OTHERWISE"     -0.13
    "jam"            -0.13
    ".Areas"         -0.13
    " eigentlich"    -0.13
    "omo"            -0.12
    "ishi"           -0.12
    " már"           -0.12
    "z"              -0.12
    "rending"        -0.12
    Positive Logits
    Token            Logit
    " equally"       0.18
    " important"     0.18
    "もう"           0.17
    " similarly"     0.17
    " yine"          0.17
    "crollView"      0.16
    " Important"     0.16
    " ebenfalls"     0.15
    "apons"          0.15
    "ihu"            0.15
    Activation Density: 0.071%

    No Known Activations