Explanations
words related to opinions or viewpoints
phrases relating to perceptions or judgments about entities or concepts
Negative Logits
Token     Logit
plan      -0.68
ften      -0.66
vous      -0.64
backer    -0.64
mouth     -0.61
afore     -0.61
fts       -0.60
load      -0.59
nown      -0.57
Frie      -0.57
Positive Logits
Token       Logit
favorably   0.93
phas        0.82
wcsstore    0.77
ById        0.75
ĸ           0.73
opian       0.72
æĦ          0.72
ibly        0.68
awed        0.68
negatively  0.68
Activations Density: 0.055%