INDEX
Explanations
phrases or words related to proof or evidence
references to proof or validation of claims or assertions
Negative Logits
Token          Logit
livest         -0.84
ecycle         -0.71
contrace       -0.67
Peninsula      -0.65
cffffcc        -0.64
kson           -0.64
ideshow        -0.64
pta            -0.63
tyr            -0.63
newsletters    -0.63
Positive Logits
Token          Logit
reading        1.35
reader         1.22
read           1.14
proofs         0.85
proof          0.82
proof          0.78
positive       0.77
ificate        0.77
uers           0.77
pudding        0.77
Activations Density 0.025%
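
As a rough illustration of the density figure above, the sketch below assumes "activations density" means the fraction of sampled token positions on which the feature fires (activation above zero). That reading is an assumption, not something the page states, and the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density is defined as (active positions) / (all positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 1 of 4000 sampled token positions.
sample = [0.0] * 3999 + [2.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.025%
```

Under this reading, a density of 0.025% corresponds to the feature being active on about 1 in every 4000 tokens, which fits a sparse feature tied to a narrow theme such as proof or evidence.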