INDEX
Explanations
phrases indicating a need for specific actions or items
statements indicating requirements or necessities
Negative Logits
ð            -0.63
bridge       -0.58
Ń·           -0.56
Initialized  -0.56
resp         -0.55
Seym         -0.54
riage        -0.53
atform       -0.53
threat       -0.51
berra        -0.51
Positive Logits
to             1.06
lessly         1.02
n              0.86
to             0.85
permission     0.77
patience       0.71
TO             0.67
assurances     0.66
access         0.65
authorization  0.64
Activations Density 0.065%