INDEX
    Explanations

    phrases related to effort or dedication

    phrases related to effort and time investment

    Negative Logits
     helicop      -0.78
    ittle         -0.74
     agre         -0.73
    \\\\\\\\      -0.72
    idden         -0.72
    vertisement   -0.70
     livest       -0.69
    etheless      -0.69
     convol       -0.68
    reditary      -0.67
    Positive Logits
     oneself      0.64
     to           0.64
    ANA           0.64
    ï¸ı           0.61
     him          0.61
     entails      0.60
     for          0.59
    .''.          0.59
    GW            0.58
     advantage    0.58
    Act Density 0.042%

    No Known Activations