Explanations
names of individuals with varying levels of activation
names of key individuals mentioned in the document
Negative Logits
Token        Logit
ModLoader    -0.72
âĶĢâĶĢ       -0.61
theless      -0.57
Fancy        -0.54
acebook      -0.53
ãĥŁ          -0.53
Madame       -0.52
âĸ¬          -0.51
sburgh       -0.50
Oprah        -0.50
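Some of the tokens above (âĶĢâĶĢ, ãĥŁ, âĸ¬) look like encoding debris but are more likely raw vocabulary strings from a byte-level BPE tokenizer. The sketch below assumes the model uses the GPT-2 style byte-to-unicode mapping, which this page does not confirm; the helper names gpt2_byte_decoder and decode_token are hypothetical and only for illustration.

```python
def gpt2_byte_decoder():
    """Build the inverse of the GPT-2 style byte-to-unicode table.

    Assumption: the tokenizer follows the GPT-2 convention, where printable
    bytes map to themselves and every other byte maps to chr(256 + n),
    with n counting the non-printable bytes in increasing order.
    """
    keep = set(range(ord("!"), ord("~") + 1))
    keep |= set(range(0xA1, 0xAC + 1))
    keep |= set(range(0xAE, 0xFF + 1))
    mapping = {}
    shift = 0
    for b in range(256):
        if b in keep:
            mapping[chr(b)] = b
        else:
            mapping[chr(256 + shift)] = b
            shift += 1
    return mapping


def decode_token(token):
    """Turn a raw vocabulary string back into readable text."""
    table = gpt2_byte_decoder()
    raw = bytes(table[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["âĶĢâĶĢ", "ãĥŁ", "âĸ¬", "theless"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption the three odd-looking tokens decode to box-drawing, katakana, and block characters, while plain ASCII tokens such as theless pass through unchanged.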
Positive Logits
Token        Logit
erman        0.70
gaard        0.70
zinski       0.70
beck         0.67
linger       0.67
zen          0.66
zynski       0.66
lett         0.65
stad         0.65
burn         0.64
Activations Density 0.247%