Explanations
references to measurements and numerical data
Negative Logits
Token            Logit
TagMode          -0.62
riwal            -0.60
hundreds         -0.58
two              -0.57
thousands        -0.55
архивлан         -0.55
SequentialGroup  -0.54
Manbalar         -0.54
two              -0.54
hundred          -0.53
Positive Logits
Token            Logit
nine             1.01
eight            0.98
ten              0.89
Eight            0.89
Nine             0.88
twelve           0.86
seven            0.84
Ten              0.84
eleven           0.84
NINE             0.81
Activations Density: 0.919%