INDEX
    Explanations

    references to mirrors and mirror-like properties or characteristics

    Negative Logits
    Token           Logit
    " himo"         -0.81
    " препратки"    -0.81
    "Ged"           -0.69
    " flav"         -0.68
    "lahan"         -0.68
    " Manfred"      -0.68
    "Manfred"       -0.66
    "Sqft"          -0.65
    " Fot"          -0.65
    " Wils"         -0.65

    Positive Logits
    Token           Logit
    " mirror"        1.42
    " Mirrors"       1.38
    " Mirror"        1.36
    " mirrors"       1.23
    "Mirrors"        1.23
    " MIRROR"        1.20
    "Mirror"         1.16
    "mirror"         1.09
    " Spiegel"       1.02
    " miroir"        0.98

    Activation Density: 0.005%

    No Known Activations