INDEX
Explanations
repeated mentions of specific individuals, particularly those named Nadine and Nate
Negative Logits
Burnett   -0.85
Burns     -0.75
Giles     -0.72
agger     -0.71
Rapids    -0.71
ston      -0.70
Flower    -0.70
Champ     -0.69
Cast      -0.68
Stall     -0.68
Positive Logits
Nad       1.92
Nu        1.69
Ni        1.68
N         1.54
NM        1.53
NH        1.49
Nun       1.47
NM        1.47
ND        1.47
N         1.46
Activations Density 0.066%