INDEX
Explanations
informational content
phrases indicating personal experiences and time spent
Negative Logits
ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³    -0.59
Mankind               -0.53
enery                 -0.51
iage                  -0.51
thood                 -0.51
Built                 -0.50
é£                    -0.49
Canaveral             -0.49
phys                  -0.48
ãĥ¥                   -0.47
Positive Logits
endix                 0.62
ascript               0.62
reader                0.62
summar                0.62
summarize             0.61
partisan              0.59
rhetorical            0.57
Editorial             0.57
analytic              0.57
summarizes            0.56
Activations Density 2.830%
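
Some entries in the Negative Logits list (for example ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³ and ãĥ¥) look like raw byte-level BPE vocabulary strings rather than readable text. The source does not name the tokenizer, so the sketch below assumes a GPT-2 style byte-to-character table; the function names `bytes_to_unicode` and `decode_token` are chosen here for illustration. Under that assumption, each displayed character maps back to one byte, and the byte sequence can then be decoded as UTF-8.

```python
def bytes_to_unicode():
    """Byte -> printable character table (assumed: GPT-2 style byte-level BPE)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Remaining bytes (control characters, space, etc.) are shifted to U+0100 and up.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a displayed vocabulary string back into text.

    Tokens that end in the middle of a multi-byte UTF-8 character cannot be
    fully decoded; the incomplete part becomes U+FFFD.
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³", "ãĥ¥", "é£", "Canaveral"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, the first token decodes to the katakana string サーティワン and ãĥ¥ to ュ, while é£ is only a two-byte prefix of a three-byte character and so has no standalone decoding. Plain ASCII tokens such as Canaveral pass through unchanged.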