INDEX
    Explanations

    punctuation and formatting markers in the text

    Negative Logits
    oger          -0.15
    kat           -0.15
     curves       -0.14
    auss          -0.14
    онÑĮ          -0.14
     communities  -0.14
     curve        -0.14
     bottle       -0.14
    ography       -0.14
    ο             -0.13
    Positive Logits
    æĪ            0.15
    ARIO          0.15
    JNI           0.15
    charm         0.15
    éϵ            0.14
    heimer        0.14
    otoxic        0.14
    .tie          0.14
    درÛĮ          0.14
    ãģ¨ãģĨ         0.14
    Act Density 0.023%

    No Known Activations