INDEX
Explanations
assertions or statements deemed to be explicit or unquestionable
instances of the word "clearly" indicating definitive statements or evidence
Negative Logits
uese        -0.77
aily        -0.71
umption     -0.71
ucky        -0.69
lé          -0.69
oleon       -0.69
urch        -0.68
anish       -0.68
rost        -0.67
alez        -0.66
Positive Logits
deline          0.88
identifiable    0.82
differentiated  0.82
spelled         0.78
distinguish     0.77
Effective       0.75
marked          0.75
outweigh        0.75
ACTED           0.74
readable        0.73
Activations Density: 0.023%