    Explanations

    instances of argumentation and reasoning

    Negative Logits
    Token       Logit
    "anke"      -0.15
    "icap"      -0.15
    "lus"       -0.15
    "жд"        -0.15
    "ellan"     -0.14
    "mute"      -0.14
    "éo"        -0.14
    "ouv"       -0.14
    "Bulk"      -0.14
    "axter"     -0.13
    Positive Logits
    Token       Logit
    " briefly"  0.17
    " lets"     0.16
    "oment"     0.16
    "吧"        0.16
    "shall"     0.15
    " Scre"     0.15
    " hypoth"   0.15
    "again"     0.15
    "ramer"     0.15
    " Lets"     0.15
    Activation Density: 0.174%

    No Known Activations