INDEX
Explanations
names of political figures and locations
references to individuals with the name "Muammar" or similar-sounding names
Negative Logits
stakes         -0.68
rake           -0.67
FSA            -0.65
Perspective    -0.60
Brawl          -0.59
stru           -0.58
RELEASE        -0.58
revolutions    -0.57
ILCS           -0.57
fuse           -0.56
Positive Logits
vu             0.87
gettable       0.79
friend         0.75
edu            0.75
enum           0.74
atar           0.74
export         0.73
riend          0.73
anne           0.71
jen            0.71
Activations Density 0.164%