INDEX
    Explanations

    instances of comparisons or qualifiers regarding expectations and relationships

    Negative Logits
    Token           Logit
    "áo"            -0.15
    "arium"         -0.15
    " latina"       -0.15
    "ï¼Ĩ"           -0.13
    ".wp"           -0.13
    "variants"      -0.13
    " McCarthy"     -0.12
    " à¤ķà¤Ī"       -0.12
    " Latina"       -0.12
    "ominator"      -0.12
    Positive Logits
    Token           Logit
    " how"          0.21
    " finances"     0.21
    "timing"        0.17
    " timing"       0.17
    " politics"     0.17
    " whether"      0.17
    "287"           0.16
    "how"           0.16
    " myself"       0.16
    " matters"      0.15
    Act Density 0.299%

    No Known Activations