INDEX
Explanations
comments and replies in a text
instances of comments and responses in text
Negative Logits
tto                 -0.74
kinson              -0.72
crow                -0.69
metry               -0.68
teenth              -0.67
isSpecialOrderable  -0.66
Takeru              -0.65
lap                 -0.64
teasing             -0.64
celebrations        -0.63
Positive Logits
ing                 1.10
ership              1.05
Reply               1.00
erate               0.89
Comment             0.87
estamp              0.86
edIn                0.86
Reader              0.84
edin                0.84
ists                0.83
Activations Density 0.028%