INDEX
Explanations
evaluations of value and quality in various contexts
Negative Logits
chaft      -0.15
aname      -0.15
verbs      -0.14
lings      -0.14
asan       -0.14
ãĥ¼ãĥ      -0.14
Ort        -0.14
Friedman   -0.13
scoped     -0.13
bluff      -0.13
Positive Logits
mi         0.17
-middle    0.14
yal        0.14
ogra       0.14
Ïįν        0.14
noch       0.14
ewhat      0.14
ä½Ĩæĺ¯     0.13
ild        0.13
edla       0.13
Activations Density 0.203%