INDEX
    Explanations

    research findings being presented

    Negative Logits
     according      0.51
     According      0.43
     حسب            0.43
     Según          0.41
    вле             0.41
     Evaluation     0.40
    Menurut         0.37
     By             0.37
     Jew            0.37
                    0.37

    Positive Logits
     conclusively   0.52
     rằng           0.49
     convincingly   0.43
     bahwa          0.43
     considerable   0.42
     ότι            0.42
     oldukça        0.41
     상당           0.41
    incinnati       0.40
     giảm           0.40

    Act Density 0.048%

    No Known Activations