INDEX
Explanations
occurrences of the word "found" in relation to mortality
Negative Logits
laun      -0.81
obser     -0.67
favor     -0.66
anguage   -0.64
adan      -0.62
edit      -0.62
uster     -0.60
widest    -0.60
aan       -0.60
Idea      -0.59
Positive Logits
heading                           0.84
keley                             0.73
zon                               0.70
rawdownloadcloneembedreportprint  0.69
spir                              0.69
actionGroup                       0.68
luaj                              0.68
footed                            0.67
dead                              0.66
hanged                            0.65
Activations Density: 0.018%