INDEX
Explanations
phrases related to emphasizing specific information or points
the word "that" in various contexts
Negative Logits
Token       Logit
apolis      -0.73
hips        -0.71
YS          -0.71
emis        -0.70
istics      -0.66
asures      -0.66
tones       -0.63
olitan      -0.62
yles        -0.62
IAS         -0.60
Positive Logits
Token        Logit
translates   0.93
includes     0.91
culminated   0.91
pesky        0.86
entails      0.82
resulted     0.82
extends      0.77
contradicts  0.77
culmin       0.77
mattered     0.76
Activations Density: 0.106%