INDEX
Explanations
terms related to political and social issues
terms related to various organizations, environmental issues, and societal themes
Negative Logits
akable      -0.63
bidden      -0.58
ighty       -0.56
nesty       -0.56
frail       -0.54
tolerated   -0.53
juven       -0.53
colo        -0.53
Salvador    -0.53
ainers      -0.52
Positive Logits
themed      0.85
pics        0.72
satire      0.69
themed      0.69
subreddits  0.68
scenes      0.66
stories     0.66
vs          0.65
tale        0.62
piss        0.61
Activations Density: 0.853%