INDEX
Explanations
references to keys used in a programming context
Negative Logits
al            -0.17
ationToken    -0.16
asure         -0.16
iate          -0.15
stal          -0.14
355           -0.14
mts           -0.14
alles         -0.14
heck          -0.14
apia          -0.14
Positive Logits
hole          0.29
chain         0.26
chains        0.23
ring          0.23
phrase        0.22
logger        0.22
cloak         0.21
notes         0.21
note          0.21
frames        0.20
Activations Density: 0.058%