INDEX
Explanations
words related to confidence and assurance
phrases related to confidence and trust levels
Negative Logits
pmwiki       -0.91
sites        -0.86
HEAD         -0.77
Vert         -0.74
shows        -0.73
xon          -0.71
artifacts    -0.67
mentioned    -0.67
Kin          -0.66
apple        -0.66
Positive Logits
worthiness   1.06
confidence   0.92
assurance    0.90
intervals    0.87
iliate       0.80
dividend     0.73
confident    0.73
interval     0.72
worthy       0.72
irming       0.71
Activations Density 0.019%