Explanations
references to self-reflection and self-awareness
Negative Logits
()]);        -0.78
)))));       -0.78
AndEndTag    -0.76
]");         -0.74
"))          -0.73
]$}          -0.72
.'</         -0.71
()].         -0.71
)”.          -0.70
()))         -0.70
Positive Logits
UserScript   0.67
mentales     0.47
cervello     0.47
себе         0.47
jegy         0.45
menahan      0.45
strokeStyle  0.42
saira        0.41
Yourself     0.41
vorgenommen  0.39
Activations Density 0.148%