INDEX
Explanations
words related to causality or conclusions
references to the subject 'it' and its consequences or characteristics
Negative Logits
Token       Logit
jri         -0.77
į           -0.66
amaz        -0.66
itive       -0.65
stellar     -0.65
eworthy     -0.64
Atkinson    -0.63
west        -0.63
piring      -0.63
Stern       -0.61
Positive Logits
Token       Logit
'll         0.99
ought       0.93
're         0.91
've         0.90
'd          0.89
cannot      0.88
could       0.87
shouldn     0.86
should      0.85
might       0.84
Activations Density: 0.301%
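
A density figure like the one above is commonly read as the share of sampled token positions on which the feature activates at all. The following is a minimal sketch under that assumption, not the method used to produce this page: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens with a
    non-zero (above-threshold) activation. The source does not define it.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1,000 token positions, 3 of which activate the feature.
sample = [0.0] * 997 + [1.2, 0.4, 2.7]
print(f"{activation_density(sample):.3%}")
```

A low density of this kind would suggest a sparse feature that fires on a small, specific set of contexts, which fits a feature whose top positive logits are modal verbs and contractions.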