INDEX
    Explanations

    party, advice, maxi, current, help

    Negative Logits
     “[      1.06
     (“      1.03
     (...)   1.03
     [...]   1.02
    (...)    0.96
     ["      0.94
     [       0.93
     ("      0.92
             0.91
     "[      0.91
    Positive Logits
     allah   0.88
     flew    0.84
     jesus   0.84
     july    0.83
     knew    0.82
     music   0.80
     bike    0.80
     june    0.80
     vrouw   0.80
     woman   0.79
    Act Density 0.187%

    No Known Activations