INDEX
Explanations
phrases indicating the need for confirmation or validation
possessive pronouns and references to personal identity or membership
Negative Logits
Shutterstock  -0.80
backs         -0.74
soever        -0.69
Springer      -0.69
hoops         -0.68
back          -0.67
paced         -0.67
bourg         -0.66
erton         -0.66
Canaver       -0.65
Positive Logits
existence     1.44
authenticity  1.37
validity      1.33
legitimacy    1.22
innocence     1.17
worthiness    1.04
presence      1.01
iability      0.96
importance    0.96
accuracy      0.96
Activations Density 0.247%