Explanations
mentions of people's names
references to specific individuals and locations
Negative Logits
oho         -0.92
holes       -0.89
hole        -0.88
icals       -0.85
alez        -0.85
MpServer    -0.83
ache        -0.78
neys        -0.77
lon         -0.75
sea         -0.73
Positive Logits
Michele     0.98
afia        0.87
Giul        0.74
Vie         0.73
Sabb        0.71
Seym        0.69
aeda        0.69
Schmidt     0.67
Mü          0.66
Modest      0.66
Activations Density 0.019%