Explanations
concepts related to self-identity and awareness
Negative Logits
Token         Logit
themselves    -0.83
ourselves     -0.72
himself       -0.66
yourself      -0.66
itself        -0.65
myself        -0.59
herself       -0.58
PMID          -0.58
OGND          -0.58
RSSSF         -0.58
Positive Logits
Token         Logit
standig       0.67
ändig         0.65
SELF          0.63
s             0.62
hood          0.59
ly            0.58
ständig       0.56
帖最后由      0.55
ishly         0.54
IInterface    0.54
Activations Density: 0.143%