INDEX
    Explanations

    phrases related to qualities or features of things

    phrases emphasizing the existence or presence of certain problems or characteristics

    NEGATIVE LOGITS
    ²¾       -0.90
    opian    -0.78
    arest    -0.76
    anches   -0.74
    iddles   -0.72
    eli      -0.72
    å§«      -0.72
    isms     -0.71
     Roads   -0.71
    enders   -0.71
    POSITIVE LOGITS
     kind         1.47
     type         1.40
     sort         1.34
     kinds        1.02
     capability   1.02
     exact        0.99
     happen       0.98
     trope        0.97
     tactic       0.97
     happening    0.96
    Activation Density 0.188%

    No Known Activations