INDEX
    Explanations

    verbs related to physical actions involving throwing or moving oneself forcefully

    phrases involving self-immersion or self-sacrifice

    Negative Logits
    "ÃŁ"          -0.78
    "Marginal"    -0.72
    "SET"         -0.69
    "Developer"   -0.67
    "Administ"    -0.65
    "APH"         -0.64
    "ND"          -0.64
    "QUI"         -0.63
    "QU"          -0.63
    "achine"      -0.63
    Positive Logits
    " overboard"  1.12
    " tant"       0.85
    " grenades"   0.84
    " towel"      0.76
    " grenade"    0.74
    " torch"      0.66
    " punches"    0.66
    " insults"    0.64
    " insult"     0.64
    "lor"         0.64
    Activation Density: 0.104%

    No Known Activations