    Negative Logits
    Token               Logit
    " гор"              -0.07
    " thrill"           -0.07
    "เทคโนโลย"          -0.06
    " پیام"             -0.06
    " dispersion"       -0.06
    " ethic"            -0.06
    "<strong"           -0.06
    "\tjava"            -0.06
    "่ม"                 -0.06
    "ErrorHandler"      -0.06
    Positive Logits
    Token               Logit
    " Arrange"          0.07
    " Hind"             0.06
    " infiltr"          0.06
    " tape"             0.06
    " Barber"           0.06
    " Handles"          0.06
    ">>;↵"              0.06
    " predic"           0.06
                        0.06
    " Hok"              0.06
    Activation Density: 0.010%

    No Known Activations