INDEX
Explanations
references to negative impacts or consequences
references to the concept of "toll" and its associated impacts
Negative Logits
itals       -0.81
Correspond  -0.75
//[         -0.69
zsche       -0.66
Sketch      -0.66
Libre       -0.64
aple        -0.64
RGB         -0.62
ITY         -0.62
xual        -0.62
Positive Logits
toll        1.06
booths      0.88
Dickinson   0.82
booth       0.79
Toll        0.79
gate        0.78
sylvania    0.77
plaza       0.74
burden      0.72
s           0.72
Activations Density 0.024%