INDEX
    Explanations
    No Explanations Found
    Negative Logits
    Token           Logit
    " Tanks"        -0.69
    " Tank"         -0.68
    " Disp"         -0.68
    " Paradise"     -0.65
    " pav"          -0.65
    " Cathedral"    -0.64
    "iche"          -0.64
    " Battalion"    -0.63
    " Hast"         -0.62
    "idia"          -0.62
    Positive Logits
    Token           Logit
    "regular"       0.82
    "pees"          0.76
    "LU"            0.75
    "vin"           0.72
    "virt"          0.72
    "alias"         0.70
    "rots"          0.70
    "keyes"         0.70
    "-------"       0.69
    "lu"            0.67
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.