Explanations
expressions of opinion or personal belief
Negative Logits
Token          Logit
               -0.52
               -0.49
-------        -0.47
snippetHide    -0.47
клопе          -0.47
WebVitals      -0.46
béco           -0.46
ValueStyle     -0.45
Condol         -0.45
Diweddarwch    -0.45
Positive Logits
Token          Logit
had            0.60
gave           0.53
have           0.52
took           0.50
wrote          0.49
grew           0.48
Italij         0.47
ambién         0.46
did            0.45
primarily      0.43
Activations Density 0.422%