INDEX
    Explanations

    inspired/prompted

    Negative Logits
    " inspired"     -1.58
    " Inspired"     -1.31
    "inspired"      -1.27
    " prompted"     -1.24
    "Inspired"      -1.23
    " motivated"    -1.17
    " attracted"    -1.10
    " spurred"      -1.09
    " متعلقه"       -1.05
    " inspiré"      -1.01
    Positive Logits
    " by"       0.79
    " to"       0.71
    " out"      0.61
    " off"      0.61
    "ly"        0.60
    " down"     0.60
    " past"     0.54
    " them"     0.53
    " forth"    0.53
    " on"       0.51
    Act Density 0.038%

    No Known Activations