Explanations
phrases related to declaring beliefs or stances
terms related to declarations or statements of belief and profession
Negative Logits
displayText  -0.70
inch         -0.64
captcha      -0.63
cooker       -0.63
punch        -0.63
Sahara       -0.62
sled         -0.61
gripping     -0.61
razil        -0.61
crest        -0.59
Positive Logits
orial        1.24
orship       1.17
edly         1.07
edIn         1.06
es           0.93
profess      0.92
ational      0.90
onte         0.87
ial          0.85
ed           0.85
Activations Density 0.025%