    Explanations

    phrases that indicate purpose, goals, or intentions regarding actions

    Negative Logits
    Token                 Logit
    "gbaar"               -0.54
    "ftagPool"            -0.54
    "stücks"              -0.53
    "生平"                -0.53
    " disambiguazione"    -0.52
    "rekte"               -0.52
    " mirth"              -0.52
    "さまで"              -0.52
    "Contrast"            -0.51
    "TypeOf"              -0.51
    Positive Logits
    Token                 Logit
    " aim"                0.86
    "เพื่อ"                 0.82
    " aimed"              0.82
    " nhằm"               0.82
    " بهد"                0.81
    "為了"                0.78
    "为了"                0.74
    " aims"               0.74
    " bedoeld"            0.73
    " bertujuan"          0.73
    Activation Density: 0.425%

    No Known Activations