INDEX
Explanations
negative impacts or consequences associated with various actions or conditions
Negative Logits
Well        -0.42
task        -0.41
Well        -0.40
↵↵          -0.39
եկ          -0.38
iel         -0.38
<eos>       -0.38
doubling    -0.37
prepared    -0.36
ുകൾ         -0.36
Positive Logits
CloseOperation     0.99
enumi              0.95
مشين               0.94
kaarangay          0.93
wireType           0.92
AddTagHelper       0.91
незавершена        0.90
disambiguazione    0.90
TagMode            0.90
tagHelperRunner    0.88
Activations Density: 0.750%