INDEX
    NEGATIVE LOGITS
    できる  0.64
    SpecID  0.64
     solltest  0.62
     لیں  0.59
     timelapse  0.58
    FUNC  0.57
    ूज  0.56
     specif  0.55
     italics  0.55
     Bhav  0.54
    POSITIVE LOGITS
    icano  0.64
    ísimo  0.56
     чувство  0.55
    čení  0.54
    gesellschaft  0.52
    做事  0.51
    ina  0.51
     người  0.50
    比如說  0.50
    gebäude  0.50
    Act Density 0.023%

    No Known Activations