Explanations
references to specific metrics or comparisons in technical contexts
Negative Logits
findpost        -1.01
Hochspringen    -0.87
UserScript      -0.86
MLLoader        -0.83
gameserver      -0.81
клопе           -0.80
contentLoaded   -0.79
ainfi           -0.77
Spoljašnje      -0.77
RIPRODUZIONE    -0.74
Positive Logits
I       0.55
…       0.52
,       0.50
.       0.49
K       0.47
his     0.47
O       0.47
F       0.46
–       0.46
Don     0.45
Activations Density: 0.545%