Explanations
names and details of specific individuals
proper nouns and names associated with individuals or entities
Negative Logits
disposable    -0.56
behav         -0.56
conformity    -0.53
favourable    -0.53
conduc        -0.48
vap           -0.48
Tokens        -0.47
triv          -0.47
craving       -0.46
'."           -0.46
Positive Logits
itars         0.63
anky          0.59
itone         0.58
acus          0.53
cussion       0.53
igl           0.52
endon         0.51
Managing      0.51
jon           0.50
Photographer  0.49
Activations Density: 1.295%