INDEX
Explanations
references to historical figures or famous names
references to historical figures and themes related to technology and counterculture
Negative Logits
Token     Logit
ravel     -0.84
rane      -0.73
teen      -0.73
aneers    -0.71
fare      -0.70
abil      -0.69
stood     -0.67
crew      -0.65
orical    -0.64
meal      -0.64
Positive Logits
Token       Logit
éĹĺ         0.78
hiba        0.76
itsch       0.76
Mandela     0.74
ulla        0.70
atts        0.70
frey        0.69
ouses       0.68
vernment    0.67
ources      0.67
Activations Density: 0.085%