Explanations
mentions of records or albums
references to music albums, specifically vinyl records
Negative Logits
Logit  Token
-0.86  flies
-0.84  abad
-0.73  âĶģ
-0.73  ãĤ¨ãĥ«
-0.73  bang
-0.72  Occupations
-0.70  д
-0.69  à¨
-0.68  ================================================================
-0.67  ãĥ¼ãĥĨ
Positive Logits
Logit  Token
1.04   rint
0.99   olicy
0.97   ropri
0.94   arser
0.93   oyd
0.89   JV
0.89   FP
0.85   LP
0.84   VO
0.82   VT
Activations Density: 0.010%