INDEX
Explanations
proper names of individuals or entities
proper nouns, particularly names of people and places
Negative Logits
Token     Logit
..."      -0.77
Americ    -0.72
1500      -0.67
Wes       -0.67
Scar      -0.66
ãĤ¡       -0.62
Wyr       -0.61
0200      -0.60
Chero     -0.59
åĤ        -0.58
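Two of the tokens above (ãĤ¡ and åĤ) look like mojibake but are more likely printable stand-ins for raw bytes. The sketch below assumes, without confirmation from this page, that the underlying model uses a GPT-2 style byte-level BPE vocabulary, and rebuilds that byte-to-character table to map a displayed token back to its bytes. The helper names `token_to_bytes` and `decode_token` are hypothetical and exist only for illustration.

```python
def bytes_to_unicode():
    """Byte -> printable character table in the style of GPT-2's byte-level BPE.

    ASSUMPTION: the dashboard's model uses this mapping; the page does not say so.
    """
    # Bytes that are already printable map to themselves.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    # Every remaining byte is assigned a code point starting at U+0100.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_to_bytes(token):
    """Hypothetical helper: recover the raw bytes behind a displayed token."""
    return bytes(UNICODE_TO_BYTE[ch] for ch in token)


def decode_token(token):
    """Hypothetical helper: decode as UTF-8, marking incomplete sequences."""
    return token_to_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĤ¡", "åĤ", "Americ"]:
        print(repr(tok), token_to_bytes(tok).hex(" "), repr(decode_token(tok)))
```

Under that assumption, ãĤ¡ corresponds to the bytes E3 82 A1, which is the UTF-8 encoding of the katakana character ァ (U+30A1), while åĤ corresponds to E5 82, an incomplete UTF-8 sequence that only becomes a full character when followed by another byte token.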
Positive Logits
Token        Logit
meanwhile    1.04
responded    1.03
denies       0.98
countered    0.93
apologized   0.93
reportedly   0.93
vetoed       0.90
reacted      0.88
thanked      0.87
replied      0.86
Activations Density: 0.392%