INDEX
    Explanations

    the word "kind" with a high level of activation

    phrases referring to different categories or types of things

    Negative Logits
    " VIDEOS"           -0.79
    "UNCH"              -0.74
    "mercial"           -0.70
    "å§«" (姫)          -0.70
    "oulos"             -0.68
    "è¦ļéĨĴ" (覚醒)    -0.68
    "eor"               -0.67
    "edia"              -0.66
    " Minutes"          -0.66
    "borough"           -0.65
    Positive Logits
    "lier"              0.89
    "hearted"           0.84
    "liest"             0.82
    "liness"            0.78
    "ifier"             0.75
    "ling"              0.74
    " gesture"          0.72
    "ilege"             0.72
    " prevail"          0.68
    "nered"             0.66
    Act Density 0.031%

    No Known Activations
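
    Two of the negative-logit tokens (å§« and è¦ļéĨĴ) are shown in a byte-level display form, not as readable text. The sketch below is one way to decode such strings, assuming the model uses a GPT-2-style byte-level BPE vocabulary; the page itself does not name the tokenizer, so that is an assumption. The helper names `bytes_to_unicode` and `decode_token` are illustrative and are not taken from the dashboard.

```python
def bytes_to_unicode():
    """Build the byte -> display-character table used by GPT-2-style byte-level BPE.

    Printable bytes map to themselves; the remaining bytes are assigned
    code points starting at 256 so that every byte has a visible character.
    """
    printable = (
        list(range(0x21, 0x7F))     # '!' .. '~'
        + list(range(0xA1, 0xAD))   # U+00A1 .. U+00AC
        + list(range(0xAE, 0x100))  # U+00AE .. U+00FF
    )
    table = {b: chr(b) for b in printable}
    offset = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + offset)
            offset += 1
    return table


# Inverse table: display character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display):
    """Convert a token's display form back to readable text.

    Characters outside the table (for example a literal leading space,
    as the dashboard prints it) are passed through unchanged.
    """
    raw = bytearray()
    for ch in display:
        if ch in UNICODE_TO_BYTE:
            raw.append(UNICODE_TO_BYTE[ch])
        else:
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for token in ["\u00e5\u00a7\u00ab", "\u00e8\u00a6\u013c\u00e9\u0128\u0134", " VIDEOS"]:
        print(repr(token), "->", repr(decode_token(token)))
```

    Under that assumption, the two byte-level tokens decode to Japanese text, and ASCII tokens such as " VIDEOS" come back unchanged.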