INDEX
Explanations
words related to perception or viewpoint
Negative Logits
abouts              -0.70
bda                 -0.69
externalActionCode  -0.68
confir              -0.68
isphere             -0.62
sonian              -0.61
intage              -0.59
osure               -0.59
scl                 -0.59
oiler               -0.58
Positive Logits
phas           1.03
enance         0.91
themselves     0.91
himself        0.82
them           0.82
ourselves      0.81
him            0.80
myself         0.77
it             0.76
homosexuality  0.72
Activations Density: 0.245%