INDEX
Explanations
mentions of attention being brought to specific subjects or topics
topics related to legal articles and discussions
Negative Logits
separation     -0.74
timetable      -0.69
ual            -0.68
process        -0.63
termination    -0.61
iew            -0.60
validity       -0.60
separ          -0.59
workload       -0.59
accur          -0.58
Positive Logits
another        0.81
interstitial   0.80
features       0.75
saf            0.72
vertisement    0.72
another        0.71
senal          0.66
Spotlight      0.65
includes       0.65
Exhibit        0.64
Activations Density 1.420%