INDEX
    Explanations

    references to "Star Wars" in various contexts

    Negative Logits
    Token       Logit
    "elho"      -0.18
    "enne"      -0.17
    "erson"     -0.17
    "kaz"       -0.17
    "ymous"     -0.17
    "yses"      -0.17
    "go"        -0.16
    "estro"     -0.15
    "abler"     -0.15
    "گاÙĩ"      -0.15

    Positive Logits
    Token       Logit
    " Wars"     0.35
    " Trek"     0.29
    " wars"     0.28
    "Wars"      0.28
    "ship"      0.25
    "ry"        0.22
    "vation"    0.21
    " trek"     0.21
    "ships"     0.20
    "ategy"     0.20

    Act Density 0.010%

    No Known Activations