Explanations
specific names, particularly the name "Amanda"
references to specific individuals, particularly those named Amanda
Negative Logits
^^^^      -0.85
folk      -0.80
ÃŁ        -0.78
glers     -0.77
lda       -0.75
book      -0.72
^^        -0.68
ģĸ        -0.68
veland    -0.67
spr       -0.67
Positive Logits
anth      0.83
ortion    0.75
atech     0.71
catentry  0.69
adic      0.69
Bio       0.69
atos      0.68
rophe     0.66
cia       0.66
algia     0.65
Activations Density 0.052%
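
The density figure is not defined on this page. A common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and do not come from the dashboard's data.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all. An empty input yields 0.0 rather than raising.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 25,000 tokens, 13 of which activate the feature.
sample = [0.0] * 24987 + [0.4, 1.2, 0.9, 3.1, 2.2, 0.7, 1.5,
                          0.3, 2.8, 1.1, 0.6, 1.9, 0.8]
density = activation_density(sample)
print(f"Activation density: {density:.3%}")
```

Under this reading, a density of 0.052% would mean the feature fires on roughly one token in every two thousand, which fits a feature tied to a narrow set of names.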