    Explanations

    "Dear" followed by a name or title

    Negative Logits
     aulas          0.81
     هیڅ            0.80
    rils            0.80
    undos           0.79
     ores           0.78
     alemão         0.76
     ګرځ            0.76
     comprimento    0.75
     sauerkraut     0.75
     estação        0.74

    Positive Logits
    𝑖               0.74
    Battery         0.73
    сні             0.72
    ્ટ              0.71
                    0.71
    "],             0.70
    Colors          0.69
    Trajectory      0.69
    િં              0.69
     Attribute      0.68

    Activation Density: 0.003%

    No Known Activations