INDEX
Explanations
phrases related to statements made by specific individuals
colons followed by statements or claims
Negative Logits
tremend    -0.90
inement    -0.73
behavi     -0.73
ilater     -0.72
undai      -0.72
pursu      -0.72
behav      -0.70
peac       -0.68
inarily    -0.67
recip      -0.67
Positive Logits
Logged        0.93
TBD           0.87
Provided      0.84
Cosponsors    0.80
Yeah          0.77
Who           0.75
ËĪ            0.75
Rise          0.74
Prelude       0.73
Played        0.71
Activations Density 0.122%
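
The density figure above is presumably the share of sampled token positions on which this feature fires. The source does not state the exact definition, so the following is only a minimal sketch under that assumption. The function name `activation_density`, its `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds threshold.

    Assumption: "density" means the fraction of token positions where the
    feature is active; the dashboard may define it differently.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 1,000 token positions, exactly one of which fires.
sample = [0.0] * 999 + [2.3]
print(f"{activation_density(sample):.3f}%")  # prints "0.100%"
```

Under this reading, a density of 0.122% would correspond to the feature firing on roughly 1.2 out of every 1,000 token positions.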