INDEX
Explanations
instances of the word "was" and variations indicating past events or states
Negative Logits
folks     -0.15
XR        -0.14
fram      -0.14
Idx       -0.13
rang      -0.13
Bernie    -0.13
/@        -0.13
ças       -0.13
cad       -0.13
REAT      -0.13
Positive Logits
like      0.26
Like      0.20
.like     0.19
LIKE      0.19
Like      0.19
LIKE      0.17
seperti   0.17
concept   0.16
_like     0.16
kind      0.15
Activations Density: 0.002%