Explanations
numbers that appear to have significance or importance
Negative Logits
awaru       -0.73
arial       -0.64
iguous      -0.63
ilic        -0.62
icularly    -0.62
romeda      -0.62
riel        -0.61
ificial     -0.60
rolet       -0.60
opping      -0.60
Positive Logits
Ħ¢              0.68
tm              0.68
%,              0.67
XL              0.66
+,              0.65
ÃĹ              0.65
nd              0.65
½               0.64
rd              0.63
Intermediate    0.60
Activations Density: 12.699%