INDEX
Explanations
personal details or characteristics of individuals
pronouns and indicative verbs suggesting individuals and their actions
Negative Logits
Token      Logit
Adjust     -0.69
mast       -0.69
cible      -0.63
grid       -0.63
verse      -0.63
sweet      -0.61
pract      -0.60
Repeat     -0.59
ventory    -0.59
Herm       -0.59
Positive Logits
Token          Logit
reditary       1.00
pherd          0.84
enegger        0.79
zek            0.77
'll            0.75
specialized    0.74
wrote          0.74
lain           0.74
specializes    0.72
chairs         0.72
Activations Density: 0.331%