INDEX
Explanations
references to specific historical events or figures
Negative Logits
Astor            -0.84
WithTag          -0.83
Hait             -0.82
labelControl     -0.82
Cama             -0.79
Keras            -0.79
Muro             -0.78
Shakspeare       -0.76
openConnection   -0.75
Shuk             -0.75
Positive Logits
"./              0.78
⎯                0.77
лоди             0.77
Magnus           0.77
Logger           0.75
Logger           0.72
})]              0.72
Tobias           0.72
Andersson        0.72
Swal             0.71
Activations Density 3.296%