    Negative Logits
    hasil
    -0.07
     mix
    -0.07
    imagin
    -0.06
     Nothing
    -0.06
      	
    -0.06
    However
    -0.06
    Maybe
    -0.06
    nothing
    -0.06
     nestled
    -0.06
    ывал
    -0.06
    Positive Logits
     front
    0.12
     Front
    0.11
    Front
    0.10
     FRONT
    0.09
    онт
    0.08
    _front
    0.08
    /head
    0.08
     مقدم
    0.08
     North
    0.08
    front
    0.07
    Activation Density: 0.023%

    No Known Activations