INDEX
    Explanations
    No Explanations Found
    Negative Logits
    zanne        -0.82
    bern         -0.79
     thriving    -0.76
     Foss        -0.74
     Tanz        -0.68
    steen        -0.67
    aepernick    -0.65
    wikipedia    -0.65
    radical      -0.65
    ternity      -0.65
    Positive Logits
    reads        0.80
    ips          0.73
     follows     0.70
    oon          0.70
    ously        0.70
    icles        0.68
    ence         0.68
    ous          0.67
    uin          0.65
    ''.          0.65
    Activation Density: 0.000%

    No Known Activations

    This feature has no known activations.