INDEX
Explanations
mentions of intellectual property
references to intellectual property
Negative Logits
aver        -0.94
avers       -0.86
decorated   -0.68
Patch       -0.67
OTA         -0.66
annis       -0.66
GV          -0.64
ICA         -0.63
drops       -0.63
abb         -0.63
Positive Logits
ellectual      1.30
intellectual   1.20
curiosity      0.86
dishon         0.85
minds          0.79
arte           0.77
itialized      0.76
philos         0.76
ademic         0.75
minded         0.74
Activations Density: 0.008%