INDEX
    Explanations
    No Explanations Found
    Negative Logits
    Token           Logit
    " digestion"    -0.74
    "peat"          -0.70
    " bandwagon"    -0.68
    "meat"          -0.64
    " machines"     -0.62
    " licenses"     -0.62
    " expansions"   -0.61
    " hiber"        -0.60
    "ktop"          -0.60
    "hare"          -0.59

    Positive Logits
    Token           Logit
    "oshop"         0.84
    "uran"          0.74
    "ndra"          0.74
    "izophren"      0.69
    "̶"             0.69
    "erman"         0.68
    " :="           0.68
    "oman"          0.67
    "auder"         0.67
    "isconsin"      0.66

    Act Density 0.000%

    No Known Activations

    This feature has no known activations.