INDEX
Explanations
descriptive actions and interactions among different parties within a story or news event
Negative Logits
Mages          -0.59
unemploy       -0.57
Sorceress      -0.57
Oracle         -0.54
.",            -0.52
contributor    -0.52
retirees       -0.51
attackers      -0.51
Rolls          -0.51
nailed         -0.50
Positive Logits
appropriately  0.95
earnest        0.87
vain           0.80
clus           0.80
humane         0.78
versely        0.76
clusively      0.74
odes           0.72
aus            0.71
orously        0.69
Activations Density: 19.581%