INDEX
    Explanations

    phrases indicating uncertainty or complexity in discussions, particularly around societal issues and diverse perspectives

    Negative Logits
    -0.54
     مرئيه
    -0.53
    ikistan
    -0.52
    Hochspringen
    -0.52
     Савезне
    -0.51
     szóci
    -0.51
     BoxFit
    -0.50
    ruptedException
    -0.50
    -0.49
    RegressionTest
    -0.49
    Positive Logits
     -
    0.44
    exitRule
    0.43
     iprot
    0.40
     Karlsson
    0.40
     [
    0.39
    :
    0.39
     surla
    0.38
     correct
    0.37
    országban
    0.36
     VERY
    0.35
    Act Density 1.343%

    No Known Activations