INDEX
Explanations
statements that emphasize and point out specific information or situations
the word "that" in various contexts
Negative Logits
emis      -0.80
YS        -0.70
oby       -0.69
LIN       -0.66
IVERS     -0.66
Tax       -0.66
tones     -0.64
thro      -0.64
hips      -0.64
apolis    -0.64
Positive Logits
includes       0.89
culminated     0.86
translates     0.85
pesky          0.82
contradicts    0.81
resulted       0.80
mattered       0.79
cher           0.79
leads          0.76
amounted       0.75
Activations Density 0.112%