INDEX
    Explanations

    visibility and seeing things

    Negative Logits
    Token        Logit
    "वणी"        0.39
                 0.39
    "Allister"   0.38
                 0.37
    "askell"     0.37
                 0.37
    "נס"         0.37
    "werten"     0.36
    "жкой"       0.36
    " လုပ်"       0.36
    Positive Logits
    Token            Logit
    " visibility"    1.98
    " Visibility"    1.71
    " visible"       1.68
    "Visibility"     1.63
    " zicht"         1.51
    "visibility"     1.50
    "visible"        1.49
    " visibles"      1.46
    " Visible"       1.42
    "Visible"        1.39
    Act Density 0.065%

    No Known Activations