Explanations
references to the audience and their reactions
Negative Logits
Token        Logit
fl           -0.66
Bly          -0.64
socialista   -0.63
ら           -0.62
er           -0.62
ad           -0.61
Fl           -0.60
ล            -0.60
l            -0.60
1            -0.60
Positive Logits
Token        Logit
audience     1.47
Audience     1.44
audience     1.44
Audience     1.35
audiences    1.35
Audiences    1.05
―――――        1.02
AUD          1.02
publiek      0.99
Audi         0.99
Activations Density 0.052%