Explanations
technical specifications and concepts
Negative Logits
Token       Value
Arab        0.39
engk        0.38
libr        0.38
ミン        0.37
ámica       0.37
Conv        0.36
timevals    0.36
†           0.36
min         0.36
Scal        0.36
Positive Logits
Token       Value
rade        0.43
DF          0.42
fetish      0.41
ドン        0.41
piers       0.41
miracle     0.39
Miracle     0.39
доо         0.38
hero        0.38
kuj         0.38
Activations Density: 0.001%