INDEX
Explanations
web-related tags and codes
certain HTML tag structures or web-related syntax
Negative Logits
ICAN          -0.76
Turing        -0.74
Brotherhood   -0.68
Remastered    -0.67
WARE          -0.66
Standing      -0.66
conduc        -0.65
Reloaded      -0.65
abstinence    -0.65
Carnage       -0.65
Positive Logits
url           1.01
article       0.98
content       0.94
image         0.93
title         0.92
ads           0.92
description   0.91
inline        0.91
html          0.90
thumbnails    0.90
Activations Density 0.061%
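
For readers unfamiliar with the density figure, the sketch below shows one common way such a number could be computed: the fraction of tokens on which the feature's activation is above zero. This definition is an assumption, not something the page states, and the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: density = (active tokens) / (all tokens).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 tokens, 6 of which activate the feature.
sample = [0.0] * 9994 + [1.3, 0.2, 4.1, 0.7, 2.5, 0.9]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density well below 1% would indicate a sparse feature that fires on only a small share of tokens, which is consistent with a feature tied to a narrow pattern such as web markup.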