INDEX
Explanations
references to quantifiable metrics and data in various contexts
Negative Logits
ainfi          -0.87
PreferredItem  -0.80
مشين           -0.80
myſelf         -0.78
TintMode       -0.77
Dili           -0.76
Diretto        -0.75
Puis           -0.74
therefrom      -0.74
(\<            -0.74
Positive Logits
(blank)        0.76
has            0.61
or             0.59
had            0.55
A              0.53
have           0.52
Has            0.52
and            0.50
punya          0.49
very           0.48
Activations Density 0.707%