INDEX
Explanations
phrases or questions that inquire about types or categories of things
Negative Logits
illustrazione   -0.63
aérea           -0.63
évêque          -0.63
démission       -0.62
frequenza       -0.58
preghiera       -0.57
nervioso        -0.57
biologie        -0.57
lettura         -0.57
menikah         -0.57
Positive Logits
"):             0.95
".              0.92
mergeFrom       0.83
...',           0.83
)";             0.82
...",           0.81
❞               0.81
":              0.80
;'>             0.79
\"",            0.78
Activations Density 0.103%