INDEX
Explanations
structures or patterns in the text, particularly repetitive elements
Negative Logits
']}
-0.74
***!
-0.74
})}
-0.73
))){
-0.71
]),
-0.70
']:
-0.70
NewUrlParser
-0.69
'],
-0.67
}}/>
-0.67
']>;
-0.66
Positive Logits
----------------
3.52
---------------
2.14
--------------
1.65
--------
1.64
------------
1.61
-----------
1.59
-------------
1.56
-------
1.55
----------
1.51
---------
1.49
Activations Density 0.242%