INDEX
    Explanations

    categories and labels relating to various subjects

    Negative Logits
     فريبيس           -0.68
    Diweddarwch       -0.67
    ropoda            -0.67
    Erreferentziak    -0.67
    Glej              -0.65
     referenties      -0.62
    horabuena         -0.62
     ProtoMessage     -0.61
    InitVars          -0.61
    Rüyada            -0.61
    Positive Logits
    cape               0.53
     dry               0.52
     Vikipedi          0.50
                       0.49
     Dry               0.48
                       0.47
     mad               0.47
     very              0.46
    sort               0.46
                       0.44
    Act Density 1.680%

    No Known Activations