INDEX
Explanations
statements that challenge prevailing truths or assumptions
Negative Logits
propOrder        -0.65
FORUM            -0.46
Astrophysical    -0.44
untos            -0.44
forder           -0.43
graph            -0.41
ColumnHeaders    -0.40
Graph            -0.40
ress             -0.40
mains            -0.40
Positive Logits
ⓧ                0.72
httphttps        0.69
ыгана            0.69
makeText         0.66
sauvages         0.65
שוליים           0.64
AxisAlignment    0.63
voyez            0.63
assoluto         0.63
nakalista        0.61
Activations Density: 0.277%