INDEX
Explanations
instances of strong opinions or compelling arguments in texts
Negative Logits
Uncategorized  -0.17
bject  -0.15
데이트  -0.14
네이트  -0.14
deaux  -0.14
ekim  -0.13
buat  -0.13
・・・↵↵  -0.13
engin  -0.13
erif  -0.13
Positive Logits
[…]  0.30
[…  0.28
[,]  0.26
[=  0.24
[s  0.24
[_  0.21
[  0.21
[…]↵  0.20
[...]  0.20
[`  0.20
Activations Density 6.970%