INDEX
    Explanations

    specific actions or sequences

    Negative Logits
    " потому"          0.49
    "believe"          0.49
    " believe"         0.46
    " BECAUSE"         0.46
    " because"         0.44
    "because"          0.44
    "かもしれません"     0.44
    "porque"           0.43
    "বিশ্বাস"            0.42
    "LAMP"             0.42
    Positive Logits
    " bitOp"           0.53
    " czynności"       0.53
    " brochure"        0.48
    " locomotion"      0.47
    " grupy"           0.47
    " stationery"      0.47
    " regler"          0.46
    " bookcase"        0.46
    " 규칙"            0.46
    " rhythm"          0.45
    Act Density 0.002%

    No Known Activations