INDEX
Explanations
exclamatory phrases containing names
character names and exclamations within conversational dialogue
Negative Logits
anecd        -0.71
clus         -0.70
exerc        -0.65
NCT          -0.64
coasts       -0.64
breweries    -0.64
statically   -0.64
subsid       -0.63
resorts      -0.63
NYT          -0.62
Positive Logits
-"           1.02
!?"          1.01
?!"          0.91
âĶĢâĶĢ       0.89
â̦"          0.89
stop         0.89
!?           0.89
kun          0.88
ãĢį          0.87
Stop         0.86
Activations Density 0.284%
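
Several entries in the positive logits list (for example `âĶĢâĶĢ` and `ãĢį`) look like raw byte-level BPE token strings rather than readable text. The sketch below assumes, without confirmation from this page, that the underlying tokenizer uses the GPT-2-style byte-to-unicode mapping; under that assumption it recovers the characters such tokens stand for. The helper names `bytes_to_unicode` and `decode_token` are illustrative and not part of any dashboard API.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style table mapping each byte (0-255) to a printable character."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Remaining bytes (controls, space, 0x7F-0xA0, 0xAD) are shifted to code points 256 and up.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Decode a byte-level token string; return it unchanged if it is not in that alphabet."""
    try:
        raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    except KeyError:
        # Assumption: a character outside the table means the token was not byte-level encoded.
        return token
    return raw.decode("utf-8", errors="replace")


print(decode_token("âĶĢâĶĢ"))  # ── (two box-drawing horizontal lines)
print(decode_token("ãĢį"))      # 」 (closing corner bracket used for Japanese dialogue)
print(decode_token("Ġstop"))    # " stop" with a leading space
```

If the assumption holds, these tokens decode to a horizontal-rule dash pair and a Japanese closing quotation bracket, which fits the dialogue-punctuation pattern seen in the rest of the list. Plain ASCII tokens such as `kun` pass through unchanged.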