INDEX
Explanations
proper names or names of people
specific pronouns and references to individuals in the text
Negative Logits
Seah             -0.71
Seat             -0.70
hap              -0.69
advertisement    -0.64
ARA              -0.61
Ballard          -0.61
halftime         -0.59
Palest           -0.59
detail           -0.57
icative          -0.57
Positive Logits
wont             0.99
'll              0.91
cant             0.86
should           0.86
didnt            0.85
would            0.82
izons            0.81
could            0.81
doesnt           0.81
will             0.81
Activations Density: 0.264%