INDEX
    Explanations
    No Explanations Found
    Negative Logits
    Token        Logit
    "ertia"      -0.18
    "keit"       -0.17
    "oui"        -0.16
    "ayne"       -0.14
    "urname"     -0.14
    "soever"     -0.14
    "ayed"       -0.14
    "-Isl"       -0.14
    "-Muslim"    -0.14
    "beit"       -0.14
    Positive Logits
    Token        Logit
    " Rim"       0.15
    "exter"      0.14
    " whenever"  0.14
    "angu"       0.13
    "ëĦ·"        0.13
    "pest"       0.13
    "rozen"      0.13
    " Nicol"     0.13
    "LEM"        0.13
    "ume"        0.13
    Activation Density: 0.000%

    No Known Activations

    This feature has no known activations.