INDEX
    Explanations

    verbs suggesting action or engagement

    Negative Logits
     Wikimedijinoj    -0.77
    ^(@)              -0.68
     "..\..\..\       -0.67
    ^(@               -0.66
    MATIC             -0.63
     CURIAM           -0.63
     $_"              -0.62
    felves            -0.62
     @}               -0.62
    ORIES             -0.61

    Positive Logits
     the              1.57
    the               0.61
     barnen           0.61
    BufferException   0.59
     själva           0.58
     igjen            0.58
    <bos>             0.54
     presentazione    0.53
     kvinna           0.53
     sitten           0.53

    Act Density 1.307%

    No Known Activations