    Explanations

    logical arguments or reasoning in discussions

    Negative Logits
    Spoljašnje     -1.05
     Paglinawan    -1.04
     Roskov        -1.02
    Datuak         -1.00
    Portail        -0.98
    зулта          -0.95
     Италијани     -0.94
    tanleria       -0.94
     gainera       -0.91
    ^(@)           -0.89
    Positive Logits
    0.52
    P
    0.50
      
    0.49
    T
    0.48
    I
    0.48
        
    0.47
    x
    0.47
    ?
    0.47
    !
    0.46
    d
    0.45
    Act Density 0.863%

    No Known Activations