Explanations
phrases related to belief or disbelief
expressions of belief or conviction
Negative Logits
conservancy    -0.77
ague           -0.69
agher          -0.67
cloth          -0.66
mentioned      -0.66
apeake         -0.65
nec            -0.65
practice       -0.64
aucas          -0.64
aste           -0.63
Positive Logits
ieve           0.80
fulness        0.78
believe        0.74
rill           0.74
believes       0.72
phas           0.72
orean          0.71
ership         0.70
believing      0.70
BEL            0.70
Activations Density: 0.040%
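
For readers unfamiliar with the density figure, here is a minimal sketch of one way such a number could be computed. It assumes that density means the fraction of token positions on which the feature's activation is above zero; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 4 of 10,000 token positions.
sample = [0.0] * 9996 + [1.2, 0.3, 2.5, 0.8]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.040%"
```

Under that assumption, a density of 0.040% corresponds to the feature firing on roughly 4 of every 10,000 tokens.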