INDEX
    Explanations

    verbs related to urging or requesting action

    Negative Logits
     istg       -0.73
    bilt        -0.69
    oday        -0.62
    ————        -0.62
    olitics     -0.61
    Laughs      -0.59
    ••••        -0.58
    ynski       -0.57
     monop      -0.57
     bunny      -0.57
    Positive Logits
    backs       1.04
     attention  0.96
     forth      0.94
    igraph      0.92
    oused       0.92
    ouses       0.80
    phas        0.80
     upon       0.79
    ously       0.78
    plates      0.77
    Activation Density: 0.041%

    No Known Activations