    Explanations

    references to the concept of "world."

    Negative Logits
    Token                  Logit
    " Rinse"               -0.49
    " Hackney"             -0.48
    " Lazy"                -0.47
    "IntoConstraints"      -0.45
    " preventive"          -0.45
    "taminophen"           -0.45
    " Preventive"          -0.44
    "Insee"                -0.44
    " meagre"              -0.44
    "atalytic"             -0.44

    Positive Logits
    Token                  Logit
    " world"               1.88
    "world"                1.70
    "World"                1.56
    " WORLD"               1.52
    " World"               1.50
    "WORLD"                1.47
    " wereld"              1.34
    "世界"                 1.31
    " worlds"              1.30
    " mundo"               1.28
    Activation Density: 0.023%

    No Known Activations