INDEX
Explanations
acronyms and numerical values that are repeated in a sequence
Negative Logits
blur       -0.85
ric        -0.83
gent       -0.79
plent      -0.79
sid        -0.77
ancest     -0.74
adam       -0.73
sher       -0.71
sausage    -0.71
princ      -0.71
Positive Logits
KA         1.21
PT         1.17
ING        1.17
AX         1.17
RD         1.15
TED        1.14
ROR        1.13
DERR       1.13
PAC        1.11
OUT        1.11
Activations Density: 0.065%