INDEX
    Explanations
    Negative Logits
    " below"          0.76
    " Below"          0.76
    "below"           0.69
    "Below"           0.68
    " ниже"           0.63
    " above"          0.54
    "以下"            0.53
    "above"           0.53
    " abaixo"         0.53
    " BELOW"          0.53
    Positive Logits
    " Unterstützung"  0.49
    "Support"         0.48
    " support"        0.48
    " ser"            0.47
    " supporto"       0.45
    " apoyo"          0.44
    " підтрим"        0.43
    " Support"        0.43
    " soporte"        0.43
    " action"         0.40
    Act Density 0.004%

    No Known Activations