Negative Logits
Token         Logit
'             -1.17
2             -0.98
3             -0.97
6             -0.96
7             -0.94
’             -0.90
5             -0.90
4             -0.90
concor        -0.89
0             -0.89
Positive Logits
Token         Logit
ⓧ             1.83
propOrder     1.76
betweenstory  1.73
)");          1.69
")));         1.67
дописавши     1.66
]")]          1.66
"]);          1.63
'\\;'         1.63
})*/          1.59
Activations Density: 21.996%