    Explanations

    conjunctions and transitional phrases that signal reasoning or a conclusion in a text

    Negative Logits
    Token                  Logit
    "ly"                   -0.87
    "NameInMap"            -0.76
    " Infórmanos"          -0.75
    "🔥🔥"                 -0.75
    (no visible token)     -0.72
    " —"                   -0.70
    "ので"                 -0.69
    "erweise"              -0.69
    "halb"                 -0.68
    "———"                  -0.65
    Positive Logits
    Token                  Logit
    "––––"                 1.32
    (no visible token)     1.04
    " ​​"                   1.02
    "ര്‍"                   0.99
    " –"                   0.97
    "്‍"                    0.95
    "ţi"                   0.95
    "ায়"                   0.94
    "––"                   0.93
    "ţilor"                0.93
    Activation Density: 0.499%

    No Known Activations