INDEX
Explanations
opinions expressed in text
expressions of personal opinions or judgments
Negative Logits
perty       -0.73
Stuff       -0.69
Yourself    -0.61
ette        -0.60
ufact       -0.60
Creature    -0.60
Written     -0.59
vez         -0.57
arsity      -0.57
Secrets     -0.56
Positive Logits
phas          0.72
anyways       0.66
anyway        0.65
é¾įåĸļ士      0.63
terminating   0.61
unres         0.59
euth          0.59
boils         0.58
iewicz        0.58
âĶĢâĶĢâĶĢâĶĢ  0.57
Activations Density: 0.060%