    Explanations

    created by the Gemma team

    Negative Logits
    Token       Value
    "Myers"     0.48
    " Ap"       0.47
    " Myers"    0.46
    " ap"       0.46
    "Ap"        0.45
    " AP"       0.44
    " Raj"      0.43
    "रेज"       0.42
    "Raj"       0.42
    "AP"        0.41
    Positive Logits
    Token        Value
    "ம்ம"        0.39
    (blank)      0.38
    "lop"        0.35
    " Nervous"   0.35
    "hu"         0.35
    "gF"         0.35
    (blank)      0.34
    " juniper"   0.34
    " Loyalty"   0.33
    "ionage"     0.32
    Activation Density: 0.016%

    No Known Activations