INDEX
    Explanations

    arguing or presenting arguments

    Negative Logits
    " least"            0.42
    "least"             0.39
    " oon"              0.38
    " difíciles"        0.34
    " liber"            0.34
    " tendencies"       0.34
    " much"             0.34
    " definitely"       0.34
    "ゲット"            0.33
    " mucho"            0.33

    Positive Logits
    " argument"         0.64
    " аргу"             0.61
    "argument"          0.58
    " Argument"         0.52
    " arguments"        0.52
    "Argument"          0.52
    " argumentation"    0.52
    "arguments"         0.46
    " якобы"            0.46
    " arguing"          0.44
    Act Density 0.105%

    No Known Activations