INDEX
Explanations
discussions surrounding logic and arguments
Negative Logits
Token     Logit
ubb       -0.18
.tf       -0.15
elper     -0.14
tent      -0.14
hq        -0.14
Bound     -0.14
jenter    -0.14
Jenner    -0.14
£½        -0.14
emento    -0.13
Positive Logits
Token     Logit
appeal    0.35
Appeal    0.34
appeals   0.32
fall      0.31
Appeals   0.29
appealed  0.26
Fall      0.25
Logical   0.24
logical   0.24
Fall      0.24
Activations Density: 0.036%