Explanations
concerns or expressions of worry within texts
references to worries or issues
Negative Logits
nice         -0.71
ctors        -0.71
gall         -0.69
buff         -0.67
graph        -0.64
SW           -0.62
INAL         -0.62
cle          -0.61
Interview    -0.60
tiny         -0.60
Positive Logits
afety        1.06
concerns     0.91
regarding    0.85
Concern      0.81
raised       0.80
pertaining   0.79
relating     0.78
hooting      0.78
concern      0.78
arising      0.76
Activations Density: 0.030%