INDEX
    Explanations

    negations or limitations in the text

    Negative Logits
     sharp
    -0.19
    leigh
    -0.17
     youngest
    -0.16
    sharp
    -0.16
     weakest
    -0.15
     Sharp
    -0.14
     smallest
    -0.14
    Sharp
    -0.14
     fairly
    -0.14
    845
    -0.14
    Positive Logits
     more
    0.60
    更多
    0.52
     MORE
    0.49
    more
    0.47
     greater
    0.47
     lebih
    0.46
     больше
    0.44
    -more
    0.42
     daha
    0.42
     więcej
    0.42
    Act Density 0.003%

    No Known Activations