INDEX
    Explanations

    verbs that indicate statements or claims made by individuals

    Negative Logits
    Token            Logit
    "ubb"            -0.15
    "infinity"       -0.14
    "eg"             -0.14
    " Hag"           -0.14
    "ighth"          -0.14
    " Vel"           -0.14
    "fw"             -0.14
    "avier"          -0.14
    " infinity"      -0.14
    "ød"             -0.14
    Positive Logits
    Token            Logit
    "ampo"           0.17
    "ContentLoaded"  0.15
    "lest"           0.15
    "leck"           0.15
    "frey"           0.15
    "-NLS"           0.15
    " mastur"        0.14
    "hazi"           0.14
    "uyla"           0.14
    "ŃĶ"             0.14
    Activation Density: 0.064%

    No Known Activations