    Explanations

    mentions of reading or books

    Negative Logits
    Token         Logit
    "xon"         -0.67
    "ño"          -0.64
    "Enlarge"     -0.62
    "lla"         -0.61
    "ortality"    -0.60
    "ascal"       -0.59
    "ctions"      -0.59
    "cker"        -0.59
    "trl"         -0.58
    "ality"       -0.58
    Positive Logits
    Token               Logit
    " aloud"            1.49
    "just"              1.15
    " comprehension"    1.09
    "dress"             0.98
    " excerpts"         0.96
    " books"            0.94
    "write"             0.89
    " texts"            0.88
    "mitt"              0.88
    "mill"              0.86
    Activation Density: 0.040%

    No Known Activations