INDEX
    Explanations
    No Explanations Found
    Negative Logits
    Token         Logit
    " armor"      -0.17
    " probe"      -0.15
    "ige"         -0.15
    " Keto"       -0.15
    "plate"       -0.15
    "armor"       -0.15
    " probes"     -0.14
    "³"           -0.14
    " armored"    -0.14
    " harbor"     -0.14
    Positive Logits
    Token         Logit
    " Iran"       0.29
    " Iranian"    0.28
    " Tehran"     0.26
    " Iranians"   0.25
    "iran"        0.24
    "Iran"        0.24
    " gays"       0.22
    " gay"        0.22
    " western"    0.20
    " İran"       0.20
    Activation Density: 0.000%

    No Known Activations

    This feature has no known activations.