INDEX
    Explanations

    "can" followed by actions

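    Negative Logits
    (token not shown)   0.51
    " когда"            0.45
    (token not shown)   0.44
    (token not shown)   0.44
    (token not shown)   0.44
    "Easy"              0.43
    (token not shown)   0.43
    " отлич"            0.42
    " mecánico"         0.42
    " लहान"             0.42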
    Positive Logits
    " pessimism"        0.59
    "political"         0.57
    " políticos"        0.54
    " politik"          0.52
    "politik"           0.49
    " legitimacy"       0.49
    " banal"            0.49
    " subjug"           0.48
    " politicians"      0.47
    "議論"              0.47
    Activation Density: 0.003%

    No Known Activations