Explanations
words related to causation and attribution
phrases indicating causation or attribution
Negative Logits
rooms       -0.68
Mush        -0.64
benches     -0.63
ahs         -0.61
erb         -0.61
iverpool    -0.61
headers     -0.59
needle      -0.58
cgi         -0.58
ahi         -0.58
Positive Logits
partly                0.85
solely                0.82
chiefly               0.81
actionDate            0.77
principally           0.76
disproportionately    0.72
attributable          0.72
SourceFile            0.70
partially             0.69
galitarian            0.69
Activations Density 0.154%