INDEX
Explanations
pronouns or nouns referring to people or entities
pronouns indicating agency and their associations
Negative Logits
erno         -0.61
uesday       -0.58
requires     -0.54
ilings       -0.54
complaining  -0.54
Sax          -0.53
raid         -0.53
announcing   -0.53
LLOW         -0.50
arling       -0.50
Positive Logits
to            1.04
access        1.00
unrestricted  0.98
unlimited     0.97
flexibility   0.95
limitless     0.80
freely        0.79
freedom       0.78
to            0.77
ample         0.73
Activations Density 0.160%