Explanations
mentions of physical injuries or dislocations
instances of the word "dislocation" and its variations
Negative Logits
til          -0.84
tilt         -0.80
forward      -0.70
][           -0.65
Hof          -0.63
Feinstein    -0.62
scan         -0.62
Apex         -0.61
iator        -0.61
focus        -0.61
Positive Logits
disl         3.33
dislike      1.02
article      0.93
tre          0.92
disliked     0.91
tale         0.91
oppos        0.89
Yellow       0.89
pir          0.83
misses       0.82
Activations Density 0.021%
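
An activation density figure like the one above is usually read as the share of token positions on which the feature fires at all. The page does not say how its value was computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the zero threshold, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of positions with a nonzero
    (above-threshold) activation. The source page does not define it.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 10,000 token positions, of which 2 are active.
sample = [0.0] * 9998 + [1.7, 0.4]
print(f"{activation_density(sample):.3f}%")  # prints 0.020%
```

Under this reading, a density of 0.021% would correspond to roughly 2 active positions per 10,000 tokens, which fits a sparse feature tied to a narrow token family such as "disl".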