INDEX
Explanations
Unrelated and random text snippets
Occurrences of the term 'Un' followed by a number indicating censorship or uncredited sources
Negative Logits
OPLE                -0.96
briefs              -0.74
cone                -0.71
tomat               -0.69
anwhile             -0.69
代                  -0.65
azine               -0.65
Peb                 -0.65
=-=-=-=-=-=-=-=-    -0.64
slopes              -0.64
Positive Logits
idad          0.99
cles          0.97
ortunately    0.94
affiliated    0.90
classified    0.87
ities         0.86
stable        0.86
usual         0.85
Un            0.85
ruly          0.84
Activations Density 0.006%
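
To put the density figure in perspective, here is a minimal sketch that assumes "activations density" means the share of sampled tokens on which the feature has a non-zero activation. That reading, the function name `activation_density`, the `threshold` parameter, and the sample data are all illustrative assumptions, not something the page states. Under that assumption, 0.006% corresponds to roughly 6 active tokens per 100,000.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: density is counted per token over a sample of activations.
    """
    if not activations:
        raise ValueError("need at least one token activation")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 6 active tokens out of 100,000.
sample = [1.0] * 6 + [0.0] * 99_994
print(f"{activation_density(sample):.3f}%")  # prints "0.006%"
```

A feature this sparse fires on very few tokens, which is consistent with the narrow pattern suggested by the positive logits (completions of "Un" such as "usual", "stable", "classified").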