Explanations
expressions of personal emotions and introspection
Negative Logits
Token     Logit
yg        -0.79
ppl       -0.69
sth       -0.69
perciò    -0.66
Pls       -0.63
např      -0.60
Pls       -0.60
govt      -0.56
abt       -0.56
(=        -0.55
Positive Logits
Token        Logit
Tikang       0.73
TagMode      0.66
pylint       0.59
ロウィン     0.59
しっかりと   0.59
нгред        0.58
impacting    0.57
>",          0.56
+:+          0.55
imagui       0.55
Activations Density: 0.361%