INDEX
    Explanations

    questions and references to individuals involved in decision-making or accountability

    NEGATIVE LOGITS
    only
    -0.19
     only
    -0.18
     лишь
    -0.17
     ONLY
    -0.16
     Only
    -0.15
    fst
    -0.15
    ेवल
    -0.15
     hanya
    -0.15
    Only
    -0.14
     nowhere
    -0.14
    POSITIVE LOGITS
     ultimately
    0.26
     owns
    0.22
     controls
    0.21
     these
    0.20
     Ultimately
    0.20
     they
    0.20
     those
    0.19
     actually
    0.19
     vlastně
    0.19
     ultimate
    0.19
    Act Density 0.154%

    No Known Activations