INDEX
Explanations
phrases relating to evaluations or opinions about individuals or entities
references to rankings and evaluations of individuals or entities
Head Attr Weights
0:0.06
1:0.02
2:0.05
3:0.36
4:0.04
5:0.09
6:0.04
7:0.06
8:0.07
9:0.03
10:0.08
11:0.05
Negative Logits
"            -2.01
"[           -1.99
"-           -1.72
".           -1.62
weekday      -1.60
specialty    -1.60
/"           -1.59
"/           -1.56
typically    -1.55
"+           -1.54
Positive Logits
':           4.11
',"          3.22
,'           3.13
?'           2.98
,'"          2.97
','          2.96
!'           2.90
.'           2.69
['           2.68
:'           2.63
Activations Density 0.004%