INDEX
Explanations
names of people and places, as well as specific roles or titles related to those names
Negative Logits
Token          Logit
blanches       -0.59
AxisAlignment  -0.53
anteriore      -0.53
rek            -0.52
fisch          -0.49
бенок          -0.49
piac           -0.49
+:+            -0.49
mangle         -0.48
ál             -0.48
Positive Logits
Token  Logit
)";    0.81
"):    0.77
}")]   0.71
.";    0.70
>';    0.69
"]     0.67
"],    0.67
)}</   0.67
noDo   0.66
loài   0.65
Activations Density 0.738%