INDEX
    Explanations

    locations mentioned in text

    Negative Logits
    "ãĤ¡"           -0.74
    " Bulg"         -0.73
    " Isles"        -0.70
    " friction"     -0.69
    " Hera"         -0.66
    "terday"        -0.64
    " fortune"      -0.62
    " Lerner"       -0.62
    " Pose"         -0.61
    " slurs"        -0.61
    Positive Logits
    "etermin"       1.29
    "etermination"  1.28
    "ownt"          1.28
    "aughters"      1.19
    "etermined"     1.17
    "wayne"         1.16
    "imensional"    1.15
    "REAM"          1.14
    "izzy"          1.14
    "azz"           1.13
    Activation Density: 0.037%

    No Known Activations