INDEX
    Explanations

    questions expressing curiosity or seeking information about the nature or type of something

    phrases relating to types or categories

    Negative Logits
    eni       -0.77
    orest     -0.76
    ĸļ        -0.73
    enes      -0.72
    esty      -0.71
    pak       -0.70
    cius      -0.70
    arest     -0.69
     Pigs     -0.69
    ences     -0.67
    Positive Logits
     thing          0.78
     relationship   0.72
     luck           0.72
     manners        0.68
     surprises      0.68
     millenn        0.68
     monster        0.67
     deal           0.66
     goodies        0.65
     sleeper        0.64
    Activation Density: 0.043%

    No Known Activations