INDEX
Explanations
punctuation marks in the form of parentheses with a number inside
list items or instructions
Negative Logits
advertis    -0.68
grocer      -0.65
agall       -0.65
rigging     -0.62
nam         -0.62
rentices    -0.61
buck        -0.60
grooming    -0.59
bos         -0.59
jug         -0.59
Positive Logits
[+          0.99
NK          0.68
eous        0.67
Pg          0.64
º           0.64
ãĤ´ãĥ³      0.63
????        0.63
Whilst      0.63
Prot        0.62
%:          0.62
Activations Density 0.023%