INDEX
Explanations
significant verbs and terms indicating quality or status in discussions
Negative Logits
kowski       -0.15
enders       -0.14
record       -0.14
_MODIFIED    -0.14
ieties       -0.14
227          -0.13
ufen         -0.13
xdb          -0.13
UNK          -0.13
Cal          -0.13
Positive Logits
'gc          0.17
Comparison   0.15
ouis         0.15
Compare      0.15
comparison   0.15
compare      0.15
.compare     0.14
comparison   0.14
arten        0.14
comparisons  0.14
Activations Density: 0.001%