INDEX
Explanations
references to different types of vehicles, specifically SUVs
references to vehicles, particularly SUVs and similar models
Negative Logits
Napoleon     -0.66
dem          -0.64
criminal     -0.63
Trace        -0.62
wrench       -0.62
artist       -0.62
bond         -0.62
prejudice    -0.61
memory       -0.61
Finger       -0.58
Positive Logits
Vs     4.44
VS     1.73
vs     1.57
Cs     1.41
Bs     1.33
V      1.31
Vs     1.31
Ps     1.31
Fs     1.28
Rs     1.28
Activations Density: 0.009%