Explanations
words related to critical or important aspects of a situation
terms associated with prediction, assessment, and evaluation processes
Negative Logits
loves      -0.69
believes   -0.67
hopes      -0.64
regrets    -0.64
laughs     -0.64
argues     -0.60
ãĥ³ãĤ¸     -0.60
Adds       -0.59
likes      -0.59
OV         -0.59
Positive Logits
are        1.36
were       1.35
expire     1.34
arise      1.28
occur      1.26
exist      1.24
collide    1.20
aren       1.19
were       1.17
weren      1.17
Activations Density 0.308%