INDEX
Explanations
words naming specific locations or entities, with "Atelier" and "Hamas" among the top activations
words related to artistic and creative activities
Negative Logits
Weasley         -0.60
Cinderella      -0.60
winters         -0.59
Cosmos          -0.59
Ghostbusters    -0.58
verb            -0.57
Chao            -0.56
ruary           -0.56
glomer          -0.56
towels          -0.56
Positive Logits
mast            1.04
antage          0.78
mast            0.76
elia            0.74
llah            0.73
obl             0.72
rophe           0.70
rius            0.69
oma             0.69
igible          0.69
Activations Density 0.055%