Explanations
Repeated mentions of specific names, likely referring to a central character or figure in the context.
Negative Logits
arist      -0.83
addons     -0.78
resses     -0.71
adr        -0.69
arians     -0.69
ournals    -0.67
illance    -0.67
ributes    -0.67
romy       -0.66
saf        -0.66
Positive Logits
Holt       1.05
Kinn       0.81
Lester     0.80
Browne     0.78
Rouge      0.76
Pod        0.73
Madd       0.72
Hodg       0.72
Kers       0.70
Cain       0.70
Activations Density 0.004%