INDEX
Explanations
numerical values that appear to be quantities or measurements
Negative Logits
swick    -0.79
lycer    -0.71
xual     -0.70
ophon    -0.68
worldly  -0.68
cles     -0.65
sed      -0.64
urger    -0.64
quit     -0.62
ppo      -0.62
Positive Logits
Thirty   1.21
ILCS     0.98
678      0.76
gallon   0.74
th       0.71
iam      0.71
anging   0.71
ushima   0.70
âĺħ      0.69
%-       0.69
Activations Density 0.808%