INDEX
    Explanations

    statements that question or critique methodological choices or interpretations in scientific studies

    Negative Logits
    Token           Logit
     myſelf         -0.80
     itſelf         -0.74
    SPATH           -0.72
    ELTS            -0.71
    ſelf            -0.70
    WebServlet      -0.70
     internetowa    -0.70
     ligiloj        -0.70
     pleaſure       -0.70
     Jefus          -0.68
    Positive Logits
    Token           Logit
    ...             0.54
     ...            0.52
    (blank)         0.51
     (              0.50
     "              0.46
     '              0.45
    ?               0.44
     saying         0.44
     [              0.43
    5               0.43
    Activation Density: 0.066%

    No Known Activations