    Explanations

    introduces an explanation

    Negative Logits
     Geometry          0.26
     BasicContainer    0.25
     becomes           0.24
     becoming          0.24
     occupying         0.24
     fulfilling        0.24
     выполнять         0.23
     становится        0.23
     ક્ર                0.23
     interactivity     0.23
    Positive Logits
     mengatakan        0.49
     says              0.42
     recommends        0.40
     believes          0.39
    认为               0.38
    認為               0.37
     считают           0.36
     advises           0.36
     suggests          0.36
     explains          0.36
    Activation Density: 5.236%

    No Known Activations