Explanations
references to the number three
Negative Logits
للاسماء     -0.93
المعيارى    -0.93
queſta      -0.87
Walkover    -0.80
beſch       -0.77
<unused43>  -0.76
<pad>       -0.75
<unused41>  -0.75
batore      -0.75
<unused17>  -0.74
Positive Logits
two      0.73
three    0.71
Three    0.69
(blank)  0.68
one      0.67
a        0.61
the      0.60
The      0.57
U        0.55
five     0.55
Activations Density: 0.364%
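
The page does not define the density figure. A minimal sketch, assuming it is the fraction of sampled tokens on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of sampled tokens on which
    the feature fires. The source page does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 3 active tokens out of 1000.
sample = [0.0] * 997 + [1.2, 0.4, 3.1]
print(f"{activation_density(sample):.3%}")  # prints 0.300%
```

Under this reading, a density of 0.364% would mean the feature fires on roughly 36 of every 10,000 sampled tokens.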