INDEX
    Explanations

    references to written texts, documents, or agreements

    references to specific texts and their descriptions

    Negative Logits
    pload               -0.80
     Sniper             -0.78
    ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯    -0.69
    alty                -0.69
     Ern                -0.69
    ño                  -0.67
    Accessory           -0.66
    rolet               -0.64
    Parents             -0.62
    Äĩ                  -0.62
    Positive Logits
     texts              1.02
    ured                1.01
    uality              0.97
     text               0.93
    urally              0.92
    book                0.91
    books               0.90
    ural                0.89
     messaging          0.87
     messages           0.85
    Activation Density: 0.013%

    No Known Activations