    Negative Logits
    Token                 Logit
    " inconsistency"      -0.13
    " sad"                -0.10
    " pur"                -0.10
    " inconsist"          -0.10
    "免"                  -0.10
    " inconsistencies"    -0.09
    " incompetence"       -0.09
    "arov"                -0.09
    " situation"          -0.09
    " prompt"             -0.09
    Positive Logits
    Token             Logit
    " lack"           0.28
    " limited"        0.24
    " difficulty"     0.21
    "limited"         0.20
    " poor"           0.20
    " absence"        0.19
    " inability"      0.19
    " falta"          0.19
    " Lack"           0.18
    "有限"            0.18
    Activation Density: 0.110%

    No Known Activations