INDEX
Explanations
instances where the text expresses strong opinions or evaluations about something
conditional statements or phrases that suggest scenario-based reasoning
Negative Logits
Token      Logit
ãĤª        -0.75
roid       -0.64
rosse      -0.62
%);        -0.62
FTWARE     -0.61
akia       -0.60
GOODMAN    -0.59
heimer     -0.59
holm       -0.58
yss        -0.58
Positive Logits
Token      Logit
fy         0.87
tar        0.77
edi        0.73
unchecked  0.69
you        0.67
rame       0.67
anything   0.66
acebook    0.65
iably      0.64
FIN        0.62
Activations Density: 0.075%