INDEX
Explanations
phrases related to uncertainty or questions
textual elements related to legal proceedings or courtroom dialogue
Negative Logits
centrally    -0.81
etheless     -0.76
estranged    -0.72
vit          -0.67
altru        -0.65
este         -0.65
stad         -0.64
colleg       -0.64
camoufl      -0.64
territ       -0.63
Positive Logits
Pause        1.00
Shock        0.92
Reply        0.88
Narr         0.87
DAQ          0.86
Said         0.83
Movie        0.83
Episode      0.82
IRC          0.80
Response     0.80
Activations Density 0.158%