Explanations
references to personal growth and transformative experiences
Negative Logits
atcher      -0.15
596         -0.15
.generated  -0.14
cestor      -0.13
706         -0.13
atu         -0.13
ify         -0.13
709         -0.12
uggage      -0.12
íĥ          -0.12
Positive Logits
takes       0.45
take        0.45
taking      0.42
bring       0.42
take        0.41
brings      0.40
lead        0.40
took        0.39
takes       0.39
Take        0.39
Activations Density 0.206%