INDEX
    Explanations

    phrases and structures related to logical reasoning and argumentation

    Negative Logits
    bjerg     -0.15
    isors     -0.15
    isor      -0.14
    ple       -0.14
    ention    -0.14
    uplic     -0.13
    akis      -0.13
    Vs        -0.13
    URT       -0.13
    uku       -0.13
    Positive Logits
    olan           0.17
    ży             0.14
    795            0.14
    deÅŁ           0.14
    .removeFrom    0.13
    ami            0.13
     Wesley        0.13
    Ù              0.13
    oundary        0.13
     Gy            0.13
    Activation Density: 0.037%

    No Known Activations