INDEX
Explanations
instances of brackets and punctuation
Negative Logits
/        -0.54
:        -0.52
・       -0.52
/        -0.45
D        -0.43
様な     -0.43
;        -0.42
,        -0.42
стра     -0.42
argent   -0.41
Positive Logits
and      1.64
etc      1.20
그리고   1.12
그리고   1.09
ועוד     1.04
そして   0.98
etc      0.94
reszcie  0.92
usw      0.91
Etc      0.91
Activations Density: 0.568%