INDEX
    Explanations

    references to figures or specific sections within a document

    references to figures or diagrams

    Negative Logits
    Token         Logit
    "pron"        -0.78
    "izabeth"     -0.75
    "acia"        -0.72
    "umenthal"    -0.70
    "eworld"      -0.69
    "ances"       -0.68
    "resist"      -0.68
    "nee"         -0.68
    " homeland"   -0.67
    "itch"        -0.65

    Positive Logits
    Token         Logit
    "Figure"      1.05
    " Figure"     0.86
    "Ī"           0.80
    "book"        0.75
    " Explorer"   0.75
    "Fig"         0.74
    "figure"      0.74
    " Drawing"    0.72
    "Table"       0.71
    " Sheet"      0.71
    Activation Density: 0.019%

    No Known Activations