INDEX
Explanations
patterns of repetition and connections between ideas in a narrative
NEGATIVE LOGITS
Token     Logit
urf       -0.17
esk       -0.16
ored      -0.16
ÏĦία      -0.15
ë§İìĿ´    -0.14
reon      -0.14
isty      -0.13
usk       -0.13
oder      -0.13
ached     -0.13
POSITIVE LOGITS
Token     Logit
vo        0.46
guess     0.41
sure      0.38
Vo        0.37
guess     0.34
vo        0.34
Sure      0.33
lo        0.33
prest     0.33
Guess     0.33
Activations Density: 0.275%