INDEX
    Explanations

    references to specific historical or cultural contexts and their implications

    Negative Logits
    !".         -0.66
     torpedo    -0.64
     Rhodes     -0.64
     Newark     -0.62
    $.          -0.62
     yacht      -0.62
    .")         -0.60
     ."         -0.59
     Jonah      -0.58
    '.          -0.56
    Positive Logits
    âĢ          1.72
     âĢ         1.40
    âĢł         1.21
     âĶ         1.11
    âī          0.96
    â           0.95
    ãĢ          0.93
     �          0.90
    âģ          0.89
     Â          0.88
    Act Density 0.766%

    No Known Activations