INDEX
    Explanations

    animals, family, connection

    Negative Logits
    " "       1.00
    "est"     0.91
    "em"      0.87
    "iri"     0.86
    " sehr"   0.85
    "ute"     0.85
    "ungs"    0.82
    "\t"      0.80
    "irth"    0.80
    " '"      0.80
    Positive Logits
    "<unused547>"    1.50
    "т"              1.46
    "<unused1056>"   1.46
    "<unused1994>"   1.44
    "<unused960>"    1.43
    "<unused99>"     1.42
    "<unused1218>"   1.41
    "<unused2081>"   1.39
    "<unused302>"    1.39
    "<unused297>"    1.39
    Act Density 0.001%

    No Known Activations