INDEX
    Explanations

    references to the concept of science across various contexts

    Negative Logits
    "tra"      -0.19
    "neo"      -0.17
    "INGS"     -0.17
    "neath"    -0.17
    "ted"      -0.16
    "lands"    -0.16
    "rias"     -0.16
    "ter"      -0.15
    "isma"     -0.15
    "acher"    -0.15
    Positive Logits
    "-fiction"      0.36
    " fiction"      0.34
    " Fiction"      0.30
    "/math"         0.27
    "fiction"       0.25
    " fictional"    0.24
    "/engine"       0.24
    "/art"          0.21
    " fair"         0.19
    "/Math"         0.19
    Act Density 0.036%

    No Known Activations