INDEX
Explanations
instances of factual statements and descriptions of specific details
Negative Logits
/GPL       -0.17
enk        -0.16
sip        -0.15
_ASSUME    -0.15
rowable    -0.15
lsru       -0.14
мага       -0.14
coe        -0.14
reten      -0.14
objc       -0.14
Positive Logits
brid       0.15
def        0.15
Alt        0.15
arpa       0.14
bet        0.14
uka        0.14
fore       0.14
deg        0.13
enta       0.13
uly        0.13
Activations Density: 0.947%