Explanations
punctuation marks and formatting within the text
Negative Logits
actionDate   -0.14
ovaly        -0.13
quartered    -0.13
orchestr     -0.12
ãĥ«ãĥķ       -0.12
altar        -0.12
ertil        -0.12
-Clause      -0.12
rupt         -0.12
ascade       -0.12
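Positive Logits
(token not shown)   0.37
Tumblr              0.37
(token not shown)   0.36
(token not shown)   0.36
(token not shown)   0.35
(token not shown)   0.34
YouTube             0.34
(token not shown)   0.32
Youtube             0.31
(token not shown)   0.31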
Activations Density 0.326%