    Explanations

    adjectives related to being twisted or distorted

    instances of the words "twisted," "warped," and related terms indicating distortion

    Negative Logits
    Token            Logit
    "alty"           -0.86
    "abet"           -0.85
    "ciation"        -0.84
    "¯¯¯¯"           -0.76
    "ILA"            -0.74
    "worthiness"     -0.73
    "cial"           -0.72
    "igraph"         -0.71
    "cially"         -0.71
    "gat"            -0.70

    Positive Logits
    Token            Logit
    " twisted"       1.05
    " twist"         0.94
    " twisting"      0.93
    " twists"        0.81
    " Twisted"       0.77
    " adolesc"       0.76
    " Hollow"        0.74
    " lengths"       0.73
    " imagin"        0.70
    " intertw"       0.69

    Activation Density: 0.020%

    No Known Activations