Explanations
specific attributes and behaviors associated with characters or actors in various contexts
Negative Logits
con              -0.47
                 -0.43
/                -0.42
or               -0.41
<eos>            -0.41
M                -0.40
,                -0.40
</               -0.40
Parameterized    -0.40
be               -0.40
Positive Logits
myſelf           1.47
Monfieur         1.42
themſelves       1.35
itſelf           1.32
himſelf          1.29
ſtate            1.29
purpoſe          1.25
Jefus            1.22
Reſ              1.20
'\\;'            1.19
Activations Density 0.340%
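
As a rough illustration of the density figure above, the sketch below assumes that activation density means the fraction of token positions on which the feature's activation exceeds a threshold (zero here). That definition, the function name `activation_density`, and the toy data are assumptions made for illustration; none of them are taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density is defined as (# active positions) / (# positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 3 of which are active.
toy_activations = [0.0] * 997 + [0.8, 1.2, 0.05]

density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density of 0.340% would mean the feature is active on roughly 34 of every 10,000 token positions sampled.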