INDEX
Explanations
prompts asking the reader to verify that they are not a robot
Negative Logits
76561       -0.71
itar        -0.67
Poc         -0.63
Sapp        -0.60
Brun        -0.59
Bethesda    -0.59
Brist       -0.59
talk        -0.58
ARC         -0.57
Carib       -0.57
Positive Logits
verify      0.90
asse        0.78
%]          0.76
enable      0.73
Subscribe   0.72
arming      0.69
ete         0.67
Sign        0.66
SIGN        0.64
subscribe   0.63
Activations Density 0.011%