Explanations
terms related to warranties and fitness for a particular purpose
Negative Logits
Token         Logit
–             -0.54
flip          -0.51
</strong>     -0.48
hoga          -0.47
hup           -0.47
fo            -0.46
io            -0.46
(             -0.46
Change        -0.46
Story         -0.45
Positive Logits
Token              Logit
purpoſe            0.83
itſelf             0.82
uſe                0.79
perſon             0.79
iſt                0.77
pleaſure           0.76
ſelf               0.75
Majefty            0.75
DebuggerNonUser    0.74
himſelf            0.73
Activations Density 0.042%